AFC - Abacus Formula Compiler for Java

Locale and Time Zone

There are a few spreadsheet functions that need to take a locale and time zone into account. When compiled engines are run, the default is to use the current settings of the JVM. Server-side applications, however, might want to override those on a per-session basis. And when compiling engines, the compiler has to know about the regional settings used by the creator of the sheet.

Background

Why do we need locale information? For example, the TEXT(num,fmt) function must know the names of the days of the week when it sees dddd in the format string, or the decimal separator. Conversely, the VALUE(text) function needs to know the number and date format settings to correctly parse text into values.

And why a time zone? Because Excel and Java have different ideas about date/time values.

There are three different AFC components that need locale-specific configuration:

Computation
The computation configuration controls how compiled engines compute, for example, the result of the TEXT() function.
Compiler
The compiler configuration controls how constant text values in the compiled spreadsheet are converted to numeric or date values during compilation, if this should prove necessary (the inverse, conversion to text, is never performed at compile-time). So it should reflect the locale used by the sheet author.
Spreadsheet Saver
The spreadsheet saver configuration controls how constant date values in the internal spreadsheet model are converted to local Excel dates.

An example is the compilation of

A B
1 =VALUE(B1)
(ValueResult)
1,234.56
(ValueInput)

If we don’t bind ValueInput to an input method, this should clearly be done using the regional settings used by the sheet author – the compiler configuration. If we do bind ValueInput to an input method, the conversion in ValueResult becomes a runtime call and so uses the computation configuration, as the value fed to ValueInput at runtime is expected to be in the runtime number format.

Warning!

What if we mix the two? Consider

A B C
2 =VALUE(B2&C2)
(MixedResult)
0. 6
(MixedInput)

where only MixedInput is bound to an input method. This will fail unless the number format of the computation also has . as decimal separator. The reason is AFC will compile MixedResult as a runtime invocation of VALUE, which always uses the computation’s number format.

Computation Configuration

How do you configure what locale and time zone a compiled computation is to use? The easiest way is to simply override the system defaults:

Locale.setDefault( Locale.GERMAN );
assertComputation( "37287,4211", (MyFactory) _engine.getComputationFactory() );
Locale.setDefault( Locale.ENGLISH );
assertComputation( "37287.4211", (MyFactory) _engine.getComputationFactory() );

But in a multi-threaded, internationally used server application, this will not do. To define a locale and time zone for only a specific set of computations, you request from AFC a computation factory configured accordingly. Here’s how:

Computation.Config config = new Computation.Config( _locale );
MyFactory factory = (MyFactory) _engine.getComputationFactory( config );

Actual use of the factory and the computations created by it remains unchanged:

MyComputation computation = _factory.newComputation( new MyInputs() );
String actual = computation.formatted();

but still reacts to the configuration:

assertComputation( "37287,4211", _engine, Locale.GERMAN );
assertComputation( "37287.4211", _engine, Locale.ENGLISH );

The class Computation.Config currently supports locale and time zone information, but could be extended by other environment information we might need.

Why this design?

Why do we not simply configure a computation, as in

outputs = factory.newComputation( inputs, new Computation.Config( locale ));

Well, because it would pollute the factory interface with an element specific to AFC: Computation.Config. This interface, however, is part of a strategy pattern instance which is not specific to AFC. In a similar vein, we believe that locale and time zone information ought to be handled by the underlying user and session management (technical domain), not in the application interfaces (problem domain).

Then why not

factory.setConfig( new Computation.Config( locale ));
outputs = factory.newComputation( inputs );

Because then the factory would be stateful, which could lead to all sorts of side effects and problems with thread-safety.

Compiler Configuration

Like with the runtime configuration, you could change the compile-time settings by changing the JRE’s default locale. Again, this is not recommended. Let’s look again at the example mentioned above to see how it’s done properly:

A B
1 =VALUE(B1)
(ValueResult)
1,234.56
(ValueInput)

In the following example, we deliberately switch to a German locale which has a comma as the decimal separator, and then test compiling a sheet authored in the US locale:

Locale oldLocale = Locale.getDefault();
Locale.setDefault( Locale.GERMANY );
try {
  DecimalFormat decimalFormat = ((DecimalFormat) NumberFormat.getInstance());
  assertEquals( ',', decimalFormat.getDecimalFormatSymbols().getDecimalSeparator() );

  EngineBuilder builder = SpreadsheetCompiler.newEngineBuilder();
  builder.loadSpreadsheet( file );
  builder.getCompileTimeConfig().locale = Locale.US;
  builder.setFactoryClass( ValueFactory.class );
  builder.getByNameBinder().outputs().bindAllMethodsToNamedCells();
  Engine engine = builder.compile();
  ValueFactory factory = (ValueFactory) engine.getComputationFactory();
  ValueComputation computation = factory.newComputation( null );
  assertEquals( 1234.56, computation.valueResult(), 0.00001 );
}
finally {
  Locale.setDefault( oldLocale );
}

Runtime Variant

If, however, we bind ValueInput, we need to pass a string in the runtime format (which we switched to German):

public String valueInput() { return "6.543,21"; }

and get it back properly:

Locale oldLocale = Locale.getDefault();
Locale.setDefault( Locale.GERMANY );
try {
  DecimalFormat decimalFormat = ((DecimalFormat) NumberFormat.getInstance());
  assertEquals( ',', decimalFormat.getDecimalFormatSymbols().getDecimalSeparator() );

  EngineBuilder builder = SpreadsheetCompiler.newEngineBuilder();
  builder.loadSpreadsheet( file );
  builder.setFactoryClass( ValueFactory.class );
  builder.getByNameBinder().outputs().bindAllMethodsToNamedCells();
  builder.getByNameBinder().inputs().bindAllMethodsToNamedCells();
  Engine engine = builder.compile();
  ValueFactory factory = (ValueFactory) engine.getComputationFactory();
  ValueComputation computation = factory.newComputation( new ValueInputs() );
  assertEquals( 6543.21, computation.valueResult(), 0.00001 );
}
finally {
  Locale.setDefault( oldLocale );
}

Why not use the saved info?

Excel saves the country code of the regional settings that were active when a sheet was saved. Why does AFC not use this?

  • The settings might have been overridden in the Windows regional settings dialog.
  • The decimal point and thousands separator might have been overridden in Excel’s configuration.

Spreadsheet Saver Configuration

When saving generated spreadsheets, AFC needs to know which time zone to use to convert java.util.Date constants to Excel’s internal local time format. Let’s consider an example:

TimeZone gmt2 = TimeZone.getTimeZone( "GMT+2:00" );
Calendar cal = Calendar.getInstance( gmt2 );
cal.clear();
cal.set( 1970, 6, 13, 12, 13 );
Date date = cal.getTime();

SpreadsheetBuilder b = SpreadsheetCompiler.newSpreadsheetBuilder();
b.newCell( b.cst( date ) );
b.nameCell( "result" );

We need to tell the saver that the embedded java.util.Date constant is to be taken relative to GMT+2. Here’s how:

SpreadsheetSaver.Config cfg = new SpreadsheetSaver.Config();
cfg.spreadsheet = b.getSpreadsheet();
cfg.typeExtension = getSpreadsheetExtension(); // .xls or .ods
cfg.outputStream = outputStream;
cfg.timeZone = gmt2;
SpreadsheetCompiler.newSpreadsheetSaver( cfg ).save();

When we load, compile and use this spreadsheet again, but using a different time zone in the computation configuration, the distance in hours between the two time zones can be noticed:

Spreadsheet loaded = SpreadsheetCompiler.loadSpreadsheet( getSpreadsheetExtension(), inputStream );
EngineBuilder eb = SpreadsheetCompiler.newEngineBuilder();
eb.setSpreadsheet( loaded );
eb.setInputClass( Object.class );
eb.setOutputClass( DateComputation.class );
eb.bindAllByName();
SaveableEngine e = eb.compile();

TimeZone gmt6 = TimeZone.getTimeZone( "GMT+6:00" );
ComputationFactory f = e.getComputationFactory( new Computation.Config( gmt6 ) );
DateComputation c = (DateComputation) f.newComputation( null );
Date result = c.result();
assertEquals( fourHoursInMS, date.getTime() - result.getTime() );

This demonstrates the effect of local Excel dates. The constant was created as 1970-06-13 12:13 (GMT+2), saved as just 1970-06-13 12:13 irrespective of time zone, then reinterpreted by the computation as 1970-06-13 12:13 (GMT+6). Hence the difference of 4 hours between the original and the final date.