Thursday, May 20, 2010

XML Namepool 2

Five different new NamePool implementations try to handle XML names most efficiently and with the smallest memory footprint possible.

Note, that these implementations are mostly prove-of-concept and have not yet been tested really!

Saxon 9.2
The current Saxon implementation of a name pool.

LivCos 0.7
The current, quick&practical name pool implementation for LivCos.

QualifedName Object
Strait forward approach, modeling the qualified name as simple Java objects.

LivCos2
An optimized implementation with a fingerprint array and hash map.

Saxon2
The optimized implementation for the Saxon interface.

Int-Wrapper for QualifiedName
Int index for an array of QualifiedNames to reduce DOM memory foot print on 64bit systems. Would allow to be integrated into Saxon.

String Array
Implementation of QualifiedName objects with a "simple" String array.

Evaluation
To evaluate the performance of these approaches, I've written a small, simple console application, stopping the time between tasks and roughly measuring memory usage. For the first run, the application generates 100'000 qualified names in 500 namespaces, each with 1 prefix (=> 200 local names per namespace) and adds them to the chosen namepool. Then it reads the different name parts, gets codes and compares parts of the name.
Protocol           SAXON           QNAME           COSMOS 
                 [ms]  [kB]      [ms]  [kB]      [ms]  [kB]
ALLOCATE         1644  2374       146 22713     13841 10178
READ_PREFIX      3365     0        53     0        35 -8593
READ_URI         3284     0        53     0        23     0
READ_LOCALNAME   3296     0        56     0        34     0
READ_QNAME       7615  6235       191  6235        22     0
READ_CLARKNAME  11545 -6235        53     0      1599  8437
GET_URI_CODE    53640     0       N/A   N/A     17262     0
GET_NS_CODE     95210     0       N/A   N/A       N/A   N/A
GET_NS_CODE2     3318     0       N/A   N/A       N/A   N/A
GET_PREFIX_CODE 32992     0       N/A   N/A       N/A   N/A
COMPARE_FP         14     0        71     0        26     0
COMPARE_NAME     3352     0        91     0        30     0
COMPARE_QNAME    3390     0        81     0        24     0
Every name is read and compared 100 times, the heap is on 1.5GB and it runs with the "-server" option.

Since Saxon handles "StandardNames" differently, a second run measures this set of standard names (~170).
Protocol           SAXON           STANDARD        SAXON2          STRARRAY
[ms]  [kB]      [ms]  [kB]      [ms]  [kB]      [ms]  [kB]
ALLOCATE          765    49       579     0       174   108       775 40049
READ_PREFIX        22     0        18   -20        35     0         9     0
READ_URI           24     0        18     0        26     0         9     0
READ_LOCALNAME     22     0        15     0        26     0         9     0
READ_QNAME        201    10       175    10       203    10        10    10
READ_CLARKNAME    328    10       321    10        29    10        10     0
GET_URI_CODE       71     0       N/A   N/A        57     0       N/A   N/A
GET_NS_CODE       125     0       N/A   N/A        74     0       N/A   N/A
GET_NS_CODE2       30     0       N/A   N/A        25     0       N/A   N/A
GET_PREFIX_CODE    58     0       N/A   N/A        36     0       N/A   N/A
COMPARE_FP         44     0        41     0        36     0        53     0
COMPARE_NAME      166     0        44     0        64     0        52     0
COMPARE_QNAME     182     0        56     0        81     0        50     0
In this run the names are "allocated" and read 10'000 times and compared 100'000 times.

The values for StandardNames differ less for the different implementations. Maybe there is a chance for Saxon to simplify, optimize the handling for StandardNames.

Optimize Saxon 9.2
The namepool in Saxon focuses on fingerprint comparison. I guess, it is important for processing XSLT efficiently. The optimized implementation for Saxon also uses a fingerprint bit mask on the name code. It uses some additional arrays and hash maps.
Protocol           SAXON           SAXON2          STRARRAY
[ms]  [kB]      [ms]  [kB]      [ms]  [kB]
ALLOCATE         1644  2374       240  2620       143 14822
READ_PREFIX      3365     0        94     0        25     0
READ_URI         3284     0        79     0        25     0
READ_LOCALNAME   3296     0        76     0        26     0
READ_QNAME       7615  6235       923  6235       119  6235
READ_CLARKNAME  11545 -6235       148  2671        27     0
GET_URI_CODE    53640     0       786     0       N/A   N/A
GET_NS_CODE     95210     0      1385     0       N/A   N/A
GET_NS_CODE2     3318     0        58     0       N/A   N/A
GET_PREFIX_CODE 32992     0       553     0       N/A   N/A
COMPARE_FP         14     0        14     0        34     0
COMPARE_NAME     3352     0        29     0        38     0
COMPARE_QNAME    3390     0        90     0        35     0
The StringArray uses the index as the name code. Still the compares seem to be really fast. Theoretically it would also allow to replace the Saxon namepool (int name codes), but it needs double the time to compare fingerprints (how big of a drop in performance for Saxon would that be?).

To try the Saxon2 implementation with the Saxon processor you also need to replace the StandardNames class with this code.

QualifiedName Objects
The major drawback for the direct approach, modeling the names as individual objects, is the need for every DOM node to hold an object reference (64bit systems). Another disadvantage is the difficulty to integrate into the existing Saxon processor. Both could be worked around by putting the objects into an array and working with the array indexes as the name codes.
Protocol           QNAME           QNAMEINT        STRARRAY
[ms]  [kB]      [ms]  [kB]      [ms]  [kB]
ALLOCATE          146 22713        82 10890       143 14822
READ_PREFIX        53     0        46     0        25     0
READ_URI           53     0        45     0        25     0
READ_LOCALNAME     56     0        45     0        26     0
READ_QNAME        191  6235       113  6235       119  6235
READ_CLARKNAME     53     0       118  8906        27     0
GET_URI_CODE      N/A   N/A       N/A   N/A       N/A   N/A
GET_NS_CODE       N/A   N/A       N/A   N/A       N/A   N/A
GET_NS_CODE2      N/A   N/A       N/A   N/A       N/A   N/A
GET_PREFIX_CODE   N/A   N/A       N/A   N/A       N/A   N/A
COMPARE_FP         71     0        68     0        34     0
COMPARE_NAME       91     0        66     0        38     0
COMPARE_QNAME      81     0        66     0        35     0
While the String Array is not really stores QualifiedName objects, it similarly compounds the five strings of a name in an block within the array.

As expected the large number of objects use a lot of memory, but the large java.util.HashMaps are somehow missing in the measurement...

Optimize LivCos
In the context of LivCos we have one big namepool, we don't need to compare fingerprint a lot and there can be many local names in the Cosmos namespace. For the SAX interface the QName needs to be read a lot.
Protocol           COSMOS          COSMOS3         STRARRAY        SAXON2       
[ms]  [kB]      [ms]  [kB]      [ms]  [kB]      [ms]  [kB]     
ALLOCATE        13841 10178       272  2578       143 14822       240  2620     
READ_PREFIX        35 -8593        75     0        25     0        94     0     
READ_URI           23     0        35     0        25     0        79     0     
READ_LOCALNAME     34     0        34     0        26     0        76     0     
READ_QNAME         22     0       793  6235       119  6235       923  6235     
READ_CLARKNAME   1599  8437       828  2671        27     0       148  2671     
GET_URI_CODE    17262     0       792     0       N/A   N/A       786     0     
GET_NS_CODE       N/A   N/A       N/A   N/A       N/A   N/A      1385     0     
GET_NS_CODE2      N/A   N/A       N/A   N/A       N/A   N/A        58     0     
GET_PREFIX_CODE   N/A   N/A       564     0       N/A   N/A       553     0     
COMPARE_FP         26     0        14     0        34     0        14     0     
COMPARE_NAME       30     0        29     0        38     0        29     0     
COMPARE_QNAME      24     0        40     0        35     0        90     0     
While COSMOS3 is an adapted SAXON2 copy, the StringArray seems to be the way to go. The key for the main hash map could be optimized and the clark name, since not really needed, could be removed.

Notes
- optimal fingerprint compare requires encoded name codes
- name code constants for StandardNames to be used in switch statements
- index or reference to the full qualified name allows simple caching of qNames and clarkNames
- self-made hash-to-int maps need improvement
- server HotSpot JVM optimizes the test noticeably

some values for the "-client" JVM
Protocol           SAXON           QNAME           STRARRAY        SAXON2          COSMOS3 
                 [ms]  [kB]      [ms]  [kB]      [ms]  [kB]      [ms]  [kB]      [ms]  [kB]
ALLOCATE         1756  2374        96 22713        68 14822       163  2620       164  2578
READ_PREFIX      3517     0       155     0        61     0       122     0       115     0
READ_URI         3445     0       153     0        64     0       102     0        75     0
READ_LOCALNAME   3421     0       158     0        59     0        75     0        68     0
READ_QNAME       8094  6235       241  6235       122  6235      1211  6235      1230  6235
READ_CLARKNAME  11983 -6235       150     0       138     0       157  2671      1372  2671
GET_URI_CODE    78621     0       N/A   N/A       N/A   N/A       967     0       968     0
GET_NS_CODE    106389     0       N/A   N/A       N/A   N/A      1486     0       N/A   N/A
GET_NS_CODE2     3457     0       N/A   N/A       N/A   N/A        89     0       N/A   N/A
GET_PREFIX_CODE 26850     0       N/A   N/A       N/A   N/A       390     0       392     0
COMPARE_FP         30     0       130     0        54     0        30     0        30     0
COMPARE_NAME     3515     0       130     0        53     0        50     0        52     0
COMPARE_QNAME    3485     0       139     0        93     0        91     0        91     0

No comments:

Post a Comment