AFC - Abacus Formula Compiler for Java

Repeating Sections

AFC allows you to define repeating sections in a spreadsheet. Such sections are different from ordinary ranges in Excel in that the width or height of the section varies with each use of the computation.

Sample Spreadsheet

For example, consider a hypothetical customer rating computation. We want users to be able to define a spreadsheet that computes the rating that we run for each customer in turn. For each customer, we feed the totals of all his orders of the last three months to the computation. One particular customer may have placed 5 orders in the last three months, another one 200.

Let me show you a very simple spreadsheet that computes such a customer rating. Note that in the depicted spreadsheet

  • the colored area is a named range whose name is given just below the sheet, and
  • the blue names in parentheses are cell names.
A B C D E F G
1 Sections Inputs
2 OrdersForLastNDays(30) OrderTotal
3 $450.00
(OrderTotal)
4 $1,320.00
5 $1,540.00
6 $60.00
7 $5,870.00
8
9 Intermediates
10 total $9,240.00
=SUM(B3:B7)
11
12 Outputs
13 Rating Good
=HLOOKUP(B10,C13:G14,2.0)
(Rating)
$0.00 $2,000.00 $5,000.00 $7,500.00 $10,000.00
14 Abysmal Poor Average Good Excellent

B3:B7 (OrdersForLastThreeMonths)

The row range 3:7 will be our repeating section with variable height. Cell B10, the Total, sums the five rows 3..7. For every actual computation, AFC will shrink or extend the range within the sum in B10 to match the number of order totals we actually passed in.

Spreadsheet Rules

The fact that my original spreadsheet has exactly five order total rows is irrelevant to AFC. What matters is that

  • when we define the repeating section over the rows 3..7,
  • the sum in B10 covers the entire height of the variable section, and
  • the formula in B10 uses the range function SUM, rather than B3+...+B7.

Generally speaking, if you reference cells of a repeating section from outside the section, you’ll have to use a spreadsheet function that takes a range as an argument. This range must cover the entire variable extent of the section (height or width).

So we could just as well have used ten, or only two example rows. (In fact, we can get away with just one example row. I don’t recommend this, though, because you might then forget to use SUM over the rows.)

In any case, AFC uses only the first row as a template for making the n rows later on, at runtime. This will be important later when I show you a more complex spreadsheet.

Hooking It Up

Let’s hook up the above spreadsheet with the customer rating application. First of all, we need an interface to the customer data. This interface must let AFC query the order totals. A simple approach would be to just return an array of order totals. AFC needs a little more structure than this, though.

When you define a repeating section with AFC, the section is treated like an embedded sub-sheet within the main sheet. Like a master/detail form really. This sub-sheet works just like the main sheet, so it must have an input interface to get values from. (As you’ll see later, one advantage of having a proper interface for section elements is it becomes very straightforward to add new input values like the order date.)

Here’s the interface for the orders sub-sheet we are going to use:

public static interface OrderData
{
  public double total();
}

The main customer interface must therefore let AFC query the set of orders it should process for this customer. The simplest way is to use an array:

public static interface CustomerData
{
  public OrderData[] ordersForLastNDays( int _days );
}

You can also use an Iterable or an Iterator, or one of their descendants. Like so:

public Iterable<OrderData> ordersForLastNDaysIterable( int _days );
public Iterator<OrderData> ordersForLastNDaysIterator( int _days );

We now need to tell AFC that the order totals range of cells in the spreadsheet should be considered a vertically repeating section, which gets its data from the call shown above. In addition, we must tell AFC the precise input interface to use for the orders, namely OrderData. Here’s how to do this (given that the name OrdersForLastThreeMonths is defined in the spreadsheet as B3:B7):

Section binder = _builder.getRootBinder();
Range range = sheet.getRange( "OrdersForLastThreeMonths" );
Method mtd = CustomerData.class.getMethod( "ordersForLastNDays", Integer.TYPE );
CallFrame call = _builder.newCallFrame( mtd, 90 ); // last 3 months is 90 days back
Orientation orient = Orientation.VERTICAL;
Class input = OrderData.class;

Section orders = binder.defineRepeatingSection( range, orient, call, input, null, null );

As I said above, a repeating section is like a sub-sheet within the main-sheet. Thus, the return value of this method is again a Section, just like the original binder for the main sheet. We use it to bind the cells in the first template row of the section to input methods on the order interface. Like so (given that the name OrderTotal is defined in the spreadsheet as B3):

Cell totalCell = sheet.getCell( "OrderTotal" );
Method totalMethod = OrderData.class.getMethod( "total" );
orders.defineInputCell( totalCell, totalMethod );

Binding the outputs is straightforward, as we are only binding global values, not values within the repeating section:

Cell ratingCell = sheet.getCell( "Rating" );
Method ratingMethod = CustomerRating.class.getMethod( "rating" );
binder.defineOutputCell( ratingCell, ratingMethod );
Internal Caching

Using repeating sections introduces a certain amount of internal caching to the computation. Including Resettable in your output interface allows you to control this cache.

Note

Given the return type OrderData[], AFC could infer the input interface for the section by itself. Unfortunately, this is not possible when the return type is a Collection or an Iterator. Even Collection<OrderData> does not help because the generics annotations are erased at runtime. AFC therefore generally mandates that you pass it the interface type to use.

Generated Code

AFC generates a separate engine class for each repeating section:

package org.formulacompiler.gen;
import org.formulacompiler.runtime.Computation;
import org.formulacompiler.tutorials.CustomerRatingWithOrders;

final class $Sect0 implements Computation
{
    private final CustomerRatingWithOrders.OrderData $inputs;
    final $Root $parent;
    final $Root $root;
    
    $Sect0(CustomerRatingWithOrders.OrderData orderdata, $Root root) {
        $parent = root;
        $root = root;
        $inputs = orderdata;
    }
    
    final double get$0() {
        return $inputs.total();
    }
}

Its instances are built in the main engine when the repeating section is first accessed. You can see this in the method get$Sect0() below:

package org.formulacompiler.gen;
import org.formulacompiler.runtime.Computation;
import org.formulacompiler.runtime.FormulaException;
import org.formulacompiler.runtime.internal.Environment;
import org.formulacompiler.runtime.internal.RuntimeDouble_v2;
import org.formulacompiler.tutorials.CustomerRatingWithOrders;

final class $Root
    implements Computation, CustomerRatingWithOrders.CustomerRating
{
    private final CustomerRatingWithOrders.CustomerData $inputs;
    final Environment $environment;
    private double[] $constarr$0;
    private $Sect0[] get$Sect0;
    
    $Root(CustomerRatingWithOrders.CustomerData customerdata,
          Environment environment) {
        $environment = environment;
        $inputs = customerdata;
    }
    
    final String get$0() {
        return $idx$0(RuntimeDouble_v2.fun_MATCH_Ascending(get$1(),
                                                           $constarr$0()) - 1);
    }
    
    public final String rating() {
        return get$0();
    }
    
    final double[] $constarr$0() {
        if ($constarr$0 == null)
            $constarr$0
                = new double[] { 0.0, 2000.0, 5000.0, 7500.0, 10000.0 };
        return $constarr$0;
    }
    
    final double get$1() {
        double d = 0.0;
        $Sect0[] sect0s = get$Sect0();
        int i = sect0s.length;
        for (int i_0_ = 0; i_0_ < i; i_0_++) {
            $Sect0 sect0 = sect0s[i_0_];
            d += sect0.get$0();
        }
        return d;
    }
    
    final String $idx$0(int i) {
        switch (i) {
        case 0:
            return "Abysmal";
        case 1:
            return "Poor";
        case 2:
            return "Average";
        case 3:
            return "Good";
        case 4:
            return "Excellent";
        default:
            throw new FormulaException
                      ("#VALUE/REF! because index is out of range in INDEX");
        }
    }
    
    final $Sect0[] get$Sect0() {
        if (get$Sect0 == null) {
            CustomerRatingWithOrders.OrderData[] orderdatas
                = $inputs.ordersForLastNDays(90);
            if (orderdatas != null) {
                int i = orderdatas.length;
                $Sect0[] sect0s = new $Sect0[i];
                for (int i_1_ = 0; i_1_ < i; i_1_++)
                    sect0s[i_1_] = new $Sect0(orderdatas[i_1_], this);
                get$Sect0 = sect0s;
            } else
                get$Sect0 = new $Sect0[0];
        }
        return get$Sect0;
    }
}

Intermediate Values In A Repeating Section

Let’s extend the order totals example a bit. In the sum of the order totals, we want to give less weight to older values. To do this, we add the order date to the order interface:

public static interface OrderData
{
  public double total();
  public Date date();
}

We now want our spreadsheet to compute the age of an order in days, and from that to compute the weight to give to the order’s total. To do this, we compute the difference between the current date and the order date, and then linearly reduce the weight of olders orders towards 0 at the age of 90 days. The prototype row 3 shows this:

A B C D E F G
1 Sections Inputs Intermediates
2 OrdersForLastNDays(90) OrderTotal OrderDate days back weight value
3 $450.00
(OrderTotal)
4/13/06
=B$11-89.0
(OrderDate)
89
=B$11-C3
1%
=(90.0-D3)/90.0
$5.00
=B3*E3
4 $1,320.00 5/2/06
=B$11-70.0
70
=B$11-C4
22%
=(90.0-D4)/90.0
$293.33
=B4*E4
5 $1,540.00 5/27/06
=B$11-45.0
45
=B$11-C5
50%
=(90.0-D5)/90.0
$770.00
=B5*E5
6 $60.00 6/11/06
=B$11-30.0
30
=B$11-C6
67%
=(90.0-D6)/90.0
$40.00
=B6*E6
7 $5,870.00 7/8/06
=B$11-3.0
3
=B$11-C7
97%
=(90.0-D7)/90.0
$5,674.33
=B7*E7
8
9 Intermediates
10 total value $6,782.67
=SUM(F3:F7)
11 current date 7/11/06
=TODAY()
12
13 Outputs
14 Rating 3
=MATCH(B10,C14:G14)
(Rating)
$0.00 $2,000.00 $5,000.00 $7,500.00 $10,000.00

B3:F7 (OrdersForLastThreeMonths)

To consider:

  • The order date is an input value. But to provide meaningful sample values for it, I have used a formula that computes them relative to the current date.
  • We now have intermediate values within the repeating section. Columns B and C are inputs, but columns D, E, F are computed within the spreadsheet.
  • The final aggregation in B10 now sums the computed section values in F3:F7.
  • The computed row value in column D references a cell outside of the repeating section, namely the current date in B11.

AFC handles all of this correctly. You do, however, have to follow a few rules:

  • When referencing within the section, only reference within the same row. You cannot reference a value from a sibling row. In particular, you cannot do running totals.
  • As stated above, AFC only looks at the first row of the section (row 3 here), which it then uses as a template for all others. You must yourself take care that your other example rows (rows 4 through 7 here) have formulas of a similiar structure as the template row.

The only remaining thing is to bind the order date in a fashion similar to the order total (assuming the name OrderDate is defined in the spreadsheet as C3):

Cell totalCell = sheet.getCell( "OrderTotal" );
orders.defineInputCell( totalCell, OrderData.class.getMethod( "total" ) );

Cell dateCell = sheet.getCell( "OrderDate" );
orders.defineInputCell( dateCell, "date" ); // shorter form