AFC - Abacus Formula Compiler for Java

The Basics

AFC makes it possible to use computations defined in spreadsheet files within Java applications. AFC reads the spreadsheet files and compiles Java-based formula evaluation engines from them. Such an engine computes output values given a set of input values. For example, compute the value of the cell B4 given the values of the cells B1 and B2 as inputs. The original spreadsheet files can be created, for example, in Microsoft Excel or OpenOffice Calc.

Line Item Rebate Revisited

AFC lets us solve the problem of the customizable line item rebate very elegantly. We can enable the distributors – and also the more savvy customers – to customize the rebate computation in Microsoft Excel, or OpenOffice Calc.

Here’s the default rebate computation from the example, but now expressed as a spreadsheet:

A B
1 Customer Rebate 3%
2 Article Rebate 6%
3
4 Rebate 6%
=MAX(B1:B2)

And here’s one variant some specific customer might want to use (the best category gets combined rebates):

A B
1 Customer Rebate 3%
2 Article Rebate 6%
3 Customer Category 5
4
5 Rebate 6%
=IF(B3=1.0,SUM(B1:B2),MAX(B1:B2))

AFC’s job is to turn these sheets into valid rebate computation strategy classes for our application.

What AFC Needs To Know

AFC builds a compiled Java class for each computation by analyzing the corresponding spreadsheet, just like the Java compiler does for our hand-written Java source code. So, in a way, a spreadsheet is source code to AFC.

And, just like the Java compiler, AFC needs to know about

  • RebateInputs, the input interface, in order to resolve references to input values,
  • RebateComputation, the output interface (or, in this case, abstract class), in order to know what the API of the generated computation should be, and
  • RebateComputationFactory, the computation factory interface, so it can generate a compatible factory for us.

In addition, AFC must be able to determine

  • where the spreadsheet formulas reference input values, and
  • which formulas should serve as which output method implementations.

Convention Over Configuration

While all these elements can be specified to AFC in great detail, it is far simpler to follow a few natural conventions. If we do, AFC can figure out almost all of the details by itself.

Factory for input/output

If our strategy interface has only a single method with exactly one parameter, then AFC infers the input and output interfaces from the method signature automatically. In our line item rebate example, the factory interface satisfies these rules:

public static abstract class RebateComputationFactory
{
  public abstract RebateComputation newInstance( RebateInputs _inputs );
}
Cell names for references

If our spreadsheet contains user-defined cell names (the names you can enter in the box in the upper right of the formula toolbar in Excel), then AFC automatically uses them to resolve input and output references, as follows:

  • For every abstract method on the output interface, AFC looks for a cell named like the method, which is used to implement the method. That is, the cell’s value or formula is compiled as the body of the output method. If there is no such cell, an error is raised.
  • For every non-abstract output method, AFC does the same as above, only it is no error of there is no matching cell.
  • For all remaining cell names, AFC looks for a matching method on the input interface. If found, it binds the cell to it. So when the computation accesses the cell, it calls the input method instead of using the value or formula specified in the spreadsheet.
  • When matching a cell name XY to a method, AFC always considers both xy() and getXY() as method names. The match is not case-sensitive.
  • If you wish, you can tell AFC to raise an error if any unbound named cells remain after this. Alternatively, you can query the list of unbound cell names to do with as you please (perhaps just alerting users to the unbound cells).

There is also a variant where only cell names with a given prefix are considered, for example only cell names starting with “I_” for inputs and “O_” for outputs.

More precisely, given the methods

public abstract double getGetter();
public abstract double plain();
public abstract double both();
public abstract double getBoth();
public abstract double isAbstract();
public double isImplemented() { return 0.0; }
public final double isFinal() { return 0.0; }
public static double isStatic() { return 0.0; }
public abstract double hasParameters( int _p );

the following holds for binding input values:

bindsTo( "getter", Inputs.class.getMethod( "getGetter" ) );
binds( "getGetter" );
binds( "plain" );
doesNotBind( "getPlain" ); // there is no method "getPlain()" or "getGetPlain()"
binds( "both" ); // precise name is used if found
binds( "getBoth" ); // ditto
doesNotBind( "unbound" );
binds( "isAbstract" );
binds( "isImplemented" );
binds( "isFinal" );
binds( "isStatic" );
doesNotBind( "hasParameters" );

When using the prefix P_, we have:

doesNotBind( "getter" );
doesNotBind( "getGetter" );
doesNotBind( "plain" );
doesNotBind( "getPlain" );
bindsTo( "P_getter", Inputs.class.getMethod( "getGetter" ) );
bindsTo( "P_getGetter", Inputs.class.getMethod( "getGetter" ) );
bindsTo( "P_plain", Inputs.class.getMethod( "plain" ) );
doesNotBind( "P_getPlain" );
bindsTo( "P_both", Inputs.class.getMethod( "both" ) );
bindsTo( "P_getBoth", Inputs.class.getMethod( "getBoth" ) );

The same holds for binding output values, except that final and static methods do not bind:

bindsTo( "getter", Outputs.class.getMethod( "getGetter" ) );
binds( "getGetter" );
binds( "plain" );
doesNotBind( "getPlain" );
binds( "both" );
binds( "getBoth" );
doesNotBind( "unbound" );
binds( "isAbstract" );
binds( "isImplemented" );
doesNotBind( "isFinal" );
doesNotBind( "isStatic" );
doesNotBind( "hasParameters" );

When using the prefix P_, we again have:

doesNotBind( "getter" );
doesNotBind( "getGetter" );
doesNotBind( "plain" );
doesNotBind( "getPlain" );
bindsTo( "P_getter", Outputs.class.getMethod( "getGetter" ) );
bindsTo( "P_getGetter", Outputs.class.getMethod( "getGetter" ) );
bindsTo( "P_plain", Outputs.class.getMethod( "plain" ) );
doesNotBind( "P_getPlain" );
bindsTo( "P_both", Outputs.class.getMethod( "both" ) );
bindsTo( "P_getBoth", Outputs.class.getMethod( "getBoth" ) );
Row titles for cell names

For many users, entering an managing cell names is not straightforward, at least in Microsoft Excel. Neither are the names easily visible when looking at a spreadsheet.

AFC therefore can use row titles instead of cell names to resolve references. Row titles are simply string values in column A of a spreadsheet, with the cell they name right next to it in column B. Spaces and other characters not legal in cell names are filtered out. If a spreadsheet contains no named cells, AFC automatically assumes row titles.

Example

Our example from above satisfies all of these criteria:

A B
1 Customer Rebate 3%
2 Article Rebate 6%
3
4 Rebate 6%
=MAX(B1:B2)

AFC will automatically infer the following bindings:

  • CustomerRebate identifies cell B1, which is mapped to the input method double getCustomerRebate(),
  • ArticleRebate identifies cell B2, which is mapped to the input method double getArticleRebate(), and
  • Rebate identifies cell B3, which implements the output method double getRebate().

Compiling The Factory

Since we follow AFC’s conventions, the final setup of AFC’s spreadsheet compiler becomes very simple:

private RebateComputationFactory compileFactoryFromSpreadsheet( File _sheetFile ) throws Exception
{
  EngineBuilder builder = SpreadsheetCompiler.newEngineBuilder();
  builder.loadSpreadsheet( _sheetFile );
  builder.setFactoryClass( RebateComputationFactory.class );
  builder.bindAllByName();
  builder.failIfByNameBindingLeftNamedCellsUnbound();

  Engine engine = builder.compile();
  return (RebateComputationFactory) engine.getComputationFactory();
}

AFC compiles a complete factory and strategy implementation to our specs, so using it is child’s play now:

public void testAFC() throws Exception
{
  LineItem item = new StrategyLineItem();
  RebateComputation.factory = compileFactoryFromSpreadsheet( getCustomSheetFile() );
  double rebate = item.computeRebate();
  assertEquals( 0.15, rebate, 0.00001 );
}

Generated Engine

The compiled engine, when decompiled to Java source code, looks like this:

package org.formulacompiler.gen;
import org.formulacompiler.runtime.Computation;
import org.formulacompiler.runtime.internal.Environment;
import org.formulacompiler.runtime.internal.RuntimeDouble_v2;
import org.formulacompiler.tutorials.Basics;

final class $Root extends Basics.RebateComputation implements Computation
{
    private final Basics.RebateInputs $inputs;
    final Environment $environment;
    
    $Root(Basics.RebateInputs rebateinputs, Environment environment) {
        super(rebateinputs);
        $environment = environment;
        $inputs = rebateinputs;
    }
    
    final double get$0() {
        return (get$1() == 1.0 ? get$2() + get$3()
                : RuntimeDouble_v2.max(get$2(), get$3()));
    }
    
    public final double getRebate() {
        return get$0();
    }
    
    final double get$1() {
        return (double) $inputs.getCustomerCategory();
    }
    
    final double get$2() {
        return $inputs.getCustomerRebate();
    }
    
    final double get$3() {
        return $inputs.getArticleRebate();
    }
}

Outlook

This example showed a simple but complete case study of how AFC can be incorporated into a typical line-of-business application. AFC is quite a bit more powerful than just this, though. Read on.

Back: Tutorial Index