Technical notes

  • Internationalization and Unicode
  • Accounting Date Formats
  • Memory Allocation and Sheet Caching
  • Java Dates and Fractions of a Second
  • Writing Dates and Java Timezone Adjustment
  • Alternative Methods of Logging
  • Hyperlinks
  • Disabling Features
  • Internationalization and Unicode

    ExcelRead supports the full unicode character set for internationalization. However there can be issues when displaying these - the default behaviour when writing out a unicode String to a stream is to encode the string using UTF-8 encoding. For genuine unicode characters this results in a loss of information.

    The easiest way around this when writing out unicode strings is to wrap an OutputStreamWriter (with the appropriate encoding) around the normal output stream.

    The demonstration program, CSV.java, illustrates this, wrapping an encoded OutputStreamWriter around System.out. When invoked with default behaviour it encodes using UTF-8; when invoked with the -unicode option it encodes the stream as BigEndian Unicode.

    eg. to write strings out using standard UTF-8 encoding, invoke the application as follows:

    java -jar xlrd.jar uc.xls

    To produce fully unicode output, invoke ExcelRead in the following manner:

    java -jar xlrd.jar -unicode uc.xls

    Accounting Date Formats

    As of version 1.4 the excel cell formatting is preserved. However if the numerical cell format is an accounting format which incorporates a currency sign (such as sterling or yen) then this is not represented within the standard ASCII character set. The currency symbol will be included as part of the Unicode character set, so the necessary steps should be taken to output the result of getContents() in a unicode aware environment. For the CSV demo program, this simply involves invoking it with the -unicode switch on the command line.

    Memory Allocation and Sheet Caching

    For very large files, it was found that reading in the entire workbook led to horrendous performance. As of version 1.6, it was therefore decided to read in a sheet at a time. Every call to getSheet() frees up all the objects currently allocated in the current sheet and then reads in all the data from the next sheet. This alleviated the problems which previously occurred, but it does however assume that the giant size workbooks are spread over several sheets - if not, performance problems and OutOfMemoryExceptions may still occur.

    A consequence of this is that if you are using the API to constantly flick between sheets in a large workbook, then this will result in adverse performance because every call to getSheet is causing ExcelRead to re-read and re-deduce all the data. It is recommended that ,if possible, the client application should do all their processing necessary on one sheet before progressing to the next.

    The method getSheets(), which returns an array of all sheets, will still attempt to hold all sheets in memory at the same time. Use of this method for very large spreadsheets is not recommended.

    Date Format and Fractions of a Second

    If a date format is specified in Excel eg. as mm:ss.00 ie. in order to display hundredths of a second, then the corresponding java data format is mm:ss.SS. On Unix/Linux systems this renders the hundredths of a second as expected. Unfortunately on Windows/NT systems the format is unpredictable. In order to get around this, any fractional representation in Excel defaults to mm:ss.SSS in order for the correct display on NT systems. Thus a time of 01:10.05 will be displayed as 01:10.050.

    This is merely the default behaviour - the API is of course designed to permit reformatting of date objects using DateFormat objects of the developer's choice.

    Writing Dates to Excel

    When using dates (from the util package), Java will automatically perform any timezone adjustments for you. Thus if you create a date using the Calendar object in the EST, Java will store this in the GMT equivalent (ie. EST + 5hrs) and also store the offset ie. -5hrs. When java.util.Dates are passed to the JExcelApi, these have to be converted into an Excel date value before it can be stored. The way this is handled is that it obtains the GMT value from the date, and then obtains the timezone offset FOR THE DEFAULT TIMEZONE; thus in the earlier example it will obtain GMT, take the EST offset, add it on, and convert this to an Excel date value. Normally this behaviour is sufficient, and the client application is unaware that timezone information has been incorporated.

    However, in certain circumstances, this timezone feature can cause unpredictable behaviour. For example, if you are entering the time it takes Donovan Bailey to run 100m, you might store this in a Date object 9830ms. If you happen to be in the EST timezone, you will not want timezone information to be taken into consideration (this would give a time of minus five hours). To get around this, use one of the GMT overloaded methods on jxl.write.DateTime - this tells JExcelApi that the time passed in is in GMT, so any local timezone offsets are ignored. If you are creating a Date object from a java.util.Calendar and wish to use this overload, make sure you have set the timezone in the Calendar object to "GMT".

    When reading dates this timezone adjustment can also prove problematic, as the timezone adjustment can make the dates look as if they are out by one day.

    If you are located in EST zone (which is GMT - 5 hours) java will subtract 5 hours from the date - so the Date object instead of being eg. 00:00 20/08/2003 becomes 19:00 19/08/2003. Because java recognizes you only want a date and not a date time, it truncates the hours/minutes/seconds and presents 19/08/2003 - so it appears that the day is one day less than was stored in Excel, whereas it is really only a few hours (the timezone offset) less.

    The easiest way to work around when reading and display dates (and the method used internally by the getContents() method of a jxl.DateCell) is to force the timezone of the date format as follows:
    TimeZone gmtZone = TimeZone.getTimeZone("GMT");
    SimpleDateFormat format = new SimpleDateFormat("dd MMM yyyy");
    format.setTimeZone(gmtZone);

    DateCell dateCell = ....
    String dateString = format.format(dateCell.getDate());

    Writing Dates to Excel

    As of JExcelApi v2.3 writing of formulas is supported. However, please note that when passing in the formula to be written, JExcelApi only attempts to generate the Excel tokens - it only does the most rudimentary syntactic checking of the formula. This means that if an application passes an erroneous formula string, the results when Excel attempts to open the generated file are undefined.

    Alternative methods of logging

    The philosophy of JExcelApi has always been "download and go", hence JExcelApi has no dependencies on any third party libraries. This means that when it came to logging, JExcelApi has simply written messages to stdout and stderr. However, for more complex applications this simple behaviour may prove a little too rudimentary, so from version 2.4 it is possible (with a little technical know-how) to plug in more sophisticated logging mechanism. In order to preserve the principle of no dependencies, the JExcelApi distribution is built in stand-alone mode by default (with the simple logging mechanism) - to use an alternative means of logging requires changing a config file and rebuilding

    From version 2.4 JExcelApi has an internally defined abstract Logger class (common.Logger). This class contains a static method to access the singleton Logger instance together with a set of abstract methods (debug(), info(), warn() etc). At run time this class instantiates the appropriate subclass which implements these methods. Any class, whether in the guts of the JExcelApi code or within a JExcelApi client application, can access the logger by making a call to Logger.getLogger() and then call the appropriate logging method. To change the logger, the name of the instantiated logger (a constant string compiled in to the app) must be altered, a class must be developed which implements the Logger abstract methods and then JExcelApi must be recompiled.

    By way of illustration, the JExcelApi distribution includes the source code files to enable logging with Log4J. To switch to using this alternative method of logging follow these steps:

  • In the build directory open the file build.properties
  • Change the logger entry to read Log4JLogger
  • Modify the loggerClasspath entry to point to the log4j jar files
  • Rebuild the application by typing "ant"


  • (Alternatively, copy the file build_log4j.properties to build.properties and change the property values as appropriate).

    At compile time ant will perform the following steps:
  • The file ${logger}Name.java in the directory src/common/log is copied so that it overwrites the file LoggerName.java in the same directory. (This class simply contains the hard coded string constant containing the name of the qualified Logger implementation class)
  • the files src/common/log/LoggerName.java and src/common/log/${logger}.java are added to the ant compile list. Any other files in this directory are ignored


  • When the application is initialized within the JVM, the common.Logger class will read the name in common.log.LoggerName and attempt to instantiate this class. The implementation of this class can be as simple or as complex as requirements demand. It also means that the user with simple requirements can compile and run JExcelApi without the need to retrieve and configure third party libraries; whilst the more sophisticated user can retrieve and configure other packages as necessary.

    Hyperlinks

    Very long URLs do not appear to work when added as a HYPERLINK formula - this can be verified by typing something like the following into Excel itself
    =HYPERLINK("http://www.verylongurl.com/.....", "label")
    When the label is clicked, for long URLs Excel reports that it is unable to contact the server. The reason for this is that the URL value exceeds Excel's internal limit for parsing formulas

    To get around this the
    WritableSheet.addHyperlink(...)
    can be used instead

    Specifying and Disabling features

    JExcelApi supports a number of features by default (such as images, macros, formulas etc). If, for any reason you wish to turn these off, for example, in order to improve performance or to avoid a bug, the JExcelApi has two ways of doing this.

    The first way is by setting the System property via the command - simply specify the property name using the -D option on the command line eg. to turn of warnings using the jxl.nowarnings property do the following:
    java -Djxl.nowarnings=true -jar jxl.jar -rw mysheet.xls copy.xls
    The other way is to set the appropriate property on the WorkbookSettings bean that you use to create the workbook, as follows:
    File myfile = new File("...");
    WorkbookSettings settings = new WorkbookSettings();
    settings.setSuppressWarnings(true);
    Workbook workbook = Workbook.getWorkbook(myfile, settings);
    This is the list of features which may be disabled, the system property name and the corresponding method on the WorkbookSettings bean

    Feature System Property Workbook Settings Bean Method
    Don't display warnings during execution jxl.nowarnings setSuppressWarnings(boolean)
    Ignore all chart and image information jxl.nodrawings setDrawingsDisabled(boolean)
    Ignore named cells and ranges jxl.nonames setNamesDisabled(boolean)
    Ignore blank cells (this stops the instantiation of Blank cell objects, which can cause quite a lot of memory usage in very large spreadsheets) jxl.noblanks setIgnoreBlankCells(boolean)
    Ignore any cell validations jxl.nocellvalidation setCellValidationDisabled(boolean)
    Disable automatic garbage collection hints jxl.nogc setGCDisabled(boolean)
    Disable cell format rationalization (ie. potentially allow duplicate cell formats within a workbook) jxl.norat setRationalization(boolean)
    Disable adjusting of formula references when rows/columns are either inserted or deleted jxl.noformulaadjust setFormulaAdjust(boolean)
    Disable reading of additional property sets - this will have the effect of not copying macros or other VB components of a workbook jxl.nopropertysets setPropertySets(boolean)
    The encoding to use for reading non unicode strings file.encoding setEncoding(String)
    Don't check for overlapping merged cells or ranges during the write phase jxl.nomergedcellchecks setMergedCellChecking(boolean)
    Two letter locale language identifier jxl.lang setEncoding(String)
    Two letter locale country identifier jxl.country setEncoding(String)
    The directory where any temporary files will be writting jxl.temporaryfileduringwritedirectory setTemporaryFileDuringWriteDirectory(File))
    Use a temporary file to write out the final file, rather than generate the whole thing in memory jxl.usetemporaryfileduringwrite setTemporaryFileDuringWrite(boolean)


    Back to ExcelRead home