 # Adding A New Aggregator Function To AFC

An aggregator is a (typically statistical) function that computes a single value from a range of values. `SUM()` is the prime example. Here, we shall look at how I implemented `VAR()`. So let’s assume `VAR()` has not been implemented yet.

Note: You must have read the topics on adding primitive functions and high-level functions to understand this section.

## Docs and Tests

As always, we start with the announcement in the release notes, and the tests:

 A B C D E F G H I J K L 1 Expected=IF(Q1,"Expected","FAILED!") Actual Inputs # of Inputs Name Highlight 149 0.5 0.5=VAR(C149,D149) 1 2 2 VAR (does not support blanks!) VAR 150 0.5 0.5=VAR(C150,D150) 4 5 2 151 152 5.8 5.8=VAR(C152:E152,G152,I152) 1 2 3 100 5 200 7 7 153 2.5 2.5=VAR(C153:E153,G153,I153) 6 7 8 80 9 10 7

## Parsing

Making `VAR()` known to the parser is much like what we did for `ABS()`. Aggregators, however, need to accept range arguments, not just simple values. For the standard aggregators which treat all their arguments as a single range, we can mimick the definition of `SUM()` as follows:

``|	"VAR" aggN( Function.VAR )``

## Mathematical Definition

The function `VAR()` is defined in the Excel help file as follows:

$\frac{\sum \left(x-\overline{x}{\right)}^{2}}{n-1}$

Clearly, what we have here is a sum over a function on all of the elements in the range. How can we rewrite this using Excel functions? We cannot. But AFC defines a couple of functions of its own that help.

## Folding Functions

For a quick example, let’s look at how the rewrite rule for `SUM()` is defined in AFC:

``````def sum = fold/reduce with s = 0 each xi as s = s + xi end
rewrite sum( xs* ) = apply sum to list xs``````

This means that the result, `s`, is initially 0. Then, for every `xi` in the range `xs`, we simply add `xi` to `s`. In the argument specification, the `*` indicates that `xs` represents all remaining arguments as a list (much like `int... a` in Java).

So the folding function iterates over a list of arguments. For every value in the list, it executes a given computation involving a so-called accumulator value (called `r` here) and the current value from the list (called `xi` here). The result becomes the new accumulator value. The final result is the accumulator value at the end of the list.

#### Special First Argument

The version of `fold` I used here, `fold/reduce`, is simply `fold` with a hint that it is OK to directly use the first argument as the starting value, instead of the specified value 0. This is a performance optimization.

## Rewriting It

Let’s recap the mathematical definition:

$\frac{\sum \left({x}_{i}-\overline{x}{\right)}^{2}}{n-1}$

where the average is, of course, defined as:

$\overline{x}=\frac{\sum {x}_{i}}{n}$

We could formulate this as a rewrite rule directly using a `fold` as in `SUM` above, with the help of `COUNT` and `AVERAGE`. However, passing over the list of values only once is generally more efficient (especially once AFC supports cursor-style iteration over large repeating sections). A bit of algebra shows that the above is the same as:

$\frac{\sum {x}^{2}-\left(\sum x{\right)}^{2}/n}{n-1}$

So we need both the sum and the sum of squares of the list of values. We do this using two parallel accumulators:

``````def var =
fold with
s = 0,
ss = 0
each xi as
s = s + xi,
ss = ss + xi * xi
with count n into
(ss - s*s/n) / (n - 1)
end
rewrite var( xs* ) = apply var to list xs``````