Wednesday, April 14, 2010

Open XSLT Processing 2

Intro
To understand the context of the problem, please read the previous Open XSLT Processing post first. Mr. Kay gave me the hint to take the DOMSource out of the equation. Thanks!

New Tests
The new implementation adds a DOM-to-TinyTree conversion step before the actual runs. In addition, it now handles 3 namespaces. The source can still be found at
http://code.google.com/p/livcos/source/browse/proto/OpenXslt/src/proto/.

JVM options: -Xms1500M -Xmx1500M -XX:NewRatio=1

Here are the new results:
0 content elements, 100 runs
nodes |composed |  pre-compiled  |  process pipe  | extension
    1 |   281ms |    17ms (-94%) |    35ms (-88%) |    50ms (-83%)
    2 |   161ms |     8ms (-95%) |    18ms (-89%) |    48ms (-70%)
    4 |   146ms |     7ms (-95%) |    17ms (-89%) |    51ms (-65%)
    6 |   137ms |     9ms (-94%) |    21ms (-85%) |    57ms (-58%)
    8 |   118ms |    10ms (-92%) |    26ms (-78%) |    74ms (-38%)
   10 |   100ms |    11ms (-89%) |    29ms (-71%) |    85ms (-15%)
   12 |    99ms |    13ms (-87%) |    34ms (-66%) |   104ms (  5%)
   15 |   104ms |    16ms (-85%) |    41ms (-61%) |   128ms ( 22%)
   20 |   106ms |    20ms (-81%) |    53ms (-50%) |   151ms ( 42%)
   40 |   119ms |    36ms (-70%) |    97ms (-19%) |   281ms (135%)
   60 |   135ms |    53ms (-61%) |   145ms (  7%) |   426ms (215%)
   80 |   152ms |    68ms (-55%) |   188ms ( 23%) |   559ms (266%)
  100 |   165ms |    84ms (-50%) |   230ms ( 39%) |   710ms (330%)
  200 |   247ms |   169ms (-32%) |   457ms ( 84%) |  1384ms (458%)
  400 |   417ms |   331ms (-21%) |   908ms (117%) |  2791ms (568%)
  600 |   578ms |   498ms (-14%) |  1372ms (137%) |  4192ms (624%)
  800 |   752ms |   664ms (-12%) |  1815ms (141%) |  5573ms (640%)
 1000 |   911ms |   835ms ( -9%) |  2269ms (148%) |  6937ms (660%)
 2000 |  1746ms |  1652ms ( -6%) |  4518ms (158%) | 13865ms (693%)
 3000 |  2571ms |  2449ms ( -5%) |  6738ms (162%) | 20759ms (707%)
 4000 |  3405ms |  3301ms ( -4%) |  9068ms (166%) | 27701ms (713%)
 6000 |  5061ms |  4932ms ( -3%) | 13570ms (168%) | 41572ms (721%)
 8000 |  6755ms |  6597ms ( -3%) | 18067ms (167%) | 55494ms (721%)
 9999 |  8368ms |  8234ms ( -2%) | 22611ms (170%) | 68887ms (723%)

1 content element, 100 runs
nodes |composed |  pre-compiled  |  process pipe  | extension
    1 |   286ms |    16ms (-95%) |    35ms (-88%) |    51ms (-83%)
    2 |   162ms |     9ms (-95%) |    26ms (-84%) |    49ms (-70%)
    4 |   145ms |     9ms (-94%) |    24ms (-84%) |    52ms (-64%)
    6 |   138ms |    11ms (-92%) |    32ms (-77%) |    60ms (-57%)
    8 |   123ms |    12ms (-90%) |    40ms (-68%) |    77ms (-38%)
   10 |   103ms |    14ms (-87%) |    45ms (-56%) |    90ms (-13%)
   12 |   102ms |    16ms (-84%) |    53ms (-49%) |   111ms (  8%)
   15 |   107ms |    20ms (-82%) |    67ms (-38%) |   142ms ( 32%)
   20 |   110ms |    25ms (-78%) |    87ms (-21%) |   147ms ( 33%)
   40 |   128ms |    46ms (-64%) |   155ms ( 21%) |   296ms (131%)
   60 |   149ms |    67ms (-55%) |   228ms ( 52%) |   431ms (189%)
   80 |   165ms |    83ms (-50%) |   296ms ( 79%) |   576ms (249%)
  100 |   184ms |   104ms (-44%) |   366ms ( 98%) |   722ms (290%)
  200 |   286ms |   206ms (-29%) |   735ms (156%) |  1453ms (407%)
  400 |   496ms |   408ms (-18%) |  1465ms (195%) |  2898ms (484%)
  600 |   694ms |   611ms (-13%) |  2200ms (216%) |  4371ms (529%)
  800 |   899ms |   813ms (-10%) |  2903ms (222%) |  5767ms (541%)
 1000 |  1103ms |  1022ms ( -8%) |  3618ms (228%) |  7189ms (551%)
 2000 |  2118ms |  2035ms ( -4%) |  7262ms (242%) | 14417ms (580%)
 3000 |  3144ms |  3069ms ( -3%) | 10871ms (245%) | 21687ms (589%)
 4000 |  4186ms |  4085ms ( -3%) | 14523ms (246%) | 28788ms (587%)
 6000 |  6221ms |  6152ms ( -2%) | 21792ms (250%) | 43101ms (592%)
 8000 |  8293ms |  8174ms ( -2%) | 29106ms (250%) | 57651ms (595%)
 9999 | 10363ms | 10247ms ( -2%) | 36428ms (251%) | 71705ms (591%)

2 content elements, 100 runs
nodes |composed |  pre-compiled  |  process pipe  | extension
    1 |   284ms |    15ms (-95%) |    38ms (-87%) |    53ms (-82%)
    2 |   166ms |     9ms (-95%) |    26ms (-84%) |    50ms (-70%)
    4 |   145ms |     9ms (-94%) |    31ms (-79%) |    53ms (-64%)
    6 |   136ms |    12ms (-91%) |    44ms (-68%) |    60ms (-56%)
    8 |   122ms |    14ms (-88%) |    53ms (-57%) |    82ms (-33%)
   10 |   107ms |    17ms (-84%) |    63ms (-41%) |   105ms ( -2%)
   12 |   104ms |    18ms (-83%) |    71ms (-33%) |   110ms (  5%)
   15 |   108ms |    23ms (-79%) |    87ms (-20%) |   148ms ( 35%)
   20 |   110ms |    28ms (-74%) |   108ms ( -3%) |   153ms ( 38%)
   40 |   136ms |    54ms (-60%) |   209ms ( 53%) |   302ms (122%)
   60 |   161ms |    80ms (-51%) |   311ms ( 92%) |   448ms (177%)
   80 |   179ms |   100ms (-44%) |   406ms (126%) |   600ms (234%)
  100 |   203ms |   127ms (-38%) |   506ms (148%) |   745ms (266%)
  200 |   329ms |   246ms (-26%) |  1003ms (204%) |  1503ms (356%)
  400 |   567ms |   491ms (-14%) |  2005ms (253%) |  3010ms (430%)
  600 |   807ms |   732ms (-10%) |  3002ms (271%) |  4528ms (460%)
  800 |  1052ms |   972ms ( -8%) |  3974ms (277%) |  6034ms (473%)
 1000 |  1301ms |  1220ms ( -7%) |  5001ms (284%) |  7548ms (479%)
 2000 |  2510ms |  2434ms ( -4%) | 10007ms (298%) | 15280ms (508%)
 3000 |  3754ms |  3660ms ( -3%) | 14989ms (299%) | 22667ms (503%)
 4000 |  4968ms |  4874ms ( -2%) | 20279ms (308%) | 30199ms (507%)
 6000 |  7394ms |  7320ms ( -2%) | 30054ms (306%) | 45115ms (510%)
 8000 |  9827ms |  9713ms ( -2%) | 40095ms (307%) | 59919ms (509%)
 9999 | 12290ms | 12150ms ( -2%) | 49756ms (304%) | 74072ms (502%)

4 content elements, 100 runs
nodes |composed |  pre-compiled  |  process pipe  | extension
    1 |   289ms |    17ms (-95%) |    41ms (-86%) |    52ms (-82%)
    2 |   163ms |    11ms (-94%) |    35ms (-79%) |    54ms (-67%)
    4 |   146ms |    12ms (-92%) |    46ms (-69%) |    55ms (-62%)
    6 |   140ms |    14ms (-90%) |    64ms (-55%) |    69ms (-51%)
    8 |   129ms |    18ms (-86%) |    76ms (-41%) |    80ms (-38%)
   10 |   109ms |    20ms (-81%) |    94ms (-14%) |   105ms ( -4%)
   12 |   109ms |    24ms (-78%) |   106ms ( -4%) |   129ms ( 17%)
   15 |   130ms |    32ms (-75%) |   148ms ( 13%) |   149ms ( 14%)
   20 |   117ms |    36ms (-69%) |   167ms ( 42%) |   163ms ( 39%)
   40 |   154ms |    72ms (-54%) |   329ms (113%) |   323ms (109%)
   60 |   180ms |    96ms (-47%) |   479ms (166%) |   486ms (170%)
   80 |   208ms |   123ms (-41%) |   632ms (204%) |   630ms (202%)
  100 |   239ms |   151ms (-37%) |   770ms (221%) |   813ms (238%)
  200 |   394ms |   304ms (-23%) |  1538ms (290%) |  1586ms (302%)
  400 |   689ms |   609ms (-12%) |  3143ms (355%) |  3189ms (362%)
  600 |   994ms |   905ms ( -9%) |  4686ms (371%) |  4713ms (374%)
  800 |  1303ms |  1235ms ( -6%) |  6312ms (384%) |  6367ms (388%)
 1000 |  1627ms |  1463ms (-11%) |  7711ms (373%) |  8002ms (391%)
 2000 |  3188ms |  3005ms ( -6%) | 15561ms (388%) | 15803ms (395%)
 3000 |  4743ms |  4587ms ( -4%) | 23475ms (394%) | 23901ms (403%)
 4000 |  6169ms |  6157ms ( -1%) | 31316ms (407%) | 31437ms (409%)
 6000 |  9391ms |  9285ms ( -2%) | 46588ms (396%) | 47771ms (408%)
 8000 | 12467ms | 12348ms ( -1%) | 62862ms (404%) | 63416ms (408%)
 9999 | 15567ms | 15487ms ( -1%) | 78131ms (401%) | 78577ms (404%)

10 content elements, 100 runs
nodes |composed |  pre-compiled  |  process pipe  | extension
    1 |   297ms |    19ms (-94%) |    57ms (-81%) |    54ms (-82%)
    2 |   167ms |    15ms (-91%) |    60ms (-65%) |    56ms (-67%)
    4 |   157ms |    18ms (-89%) |    87ms (-45%) |    62ms (-61%)
    6 |   148ms |    23ms (-85%) |   121ms (-19%) |    75ms (-50%)
    8 |   135ms |    28ms (-80%) |   152ms ( 12%) |    95ms (-30%)
   10 |   127ms |    33ms (-74%) |   183ms ( 44%) |   116ms ( -9%)
   12 |   126ms |    38ms (-70%) |   220ms ( 74%) |   137ms (  8%)
   15 |   136ms |    46ms (-67%) |   258ms ( 89%) |   151ms ( 11%)
   20 |   145ms |    60ms (-59%) |   333ms (128%) |   187ms ( 28%)
   40 |   186ms |   102ms (-45%) |   644ms (245%) |   371ms ( 99%)
   60 |   233ms |   152ms (-35%) |   971ms (315%) |   553ms (136%)
   80 |   286ms |   204ms (-29%) |  1275ms (344%) |   725ms (153%)
  100 |   335ms |   248ms (-26%) |  1598ms (377%) |   913ms (172%)
  200 |   583ms |   495ms (-15%) |  3174ms (444%) |  1841ms (215%)
  400 |  1075ms |   984ms ( -9%) |  6344ms (489%) |  3670ms (241%)
  600 |  1563ms |  1479ms ( -6%) |  9543ms (510%) |  5498ms (251%)
  800 |  2076ms |  1974ms ( -5%) | 12732ms (513%) |  7333ms (253%)
 1000 |  2580ms |  2486ms ( -4%) | 15952ms (518%) |  9159ms (254%)
 2000 |  5115ms |  5029ms ( -2%) | 32031ms (526%) | 18285ms (257%)
 3000 |  7576ms |  7508ms ( -1%) | 47963ms (533%) | 27429ms (262%)
 4000 | 10128ms | 10075ms ( -1%) | 64225ms (534%) | 36588ms (261%)
 6000 | 15168ms | 15140ms ( -1%) | 96086ms (533%) | 54834ms (261%)
 8000 | 20246ms | 20219ms ( -1%) |129007ms (537%) | 73107ms (261%)
 9999 | 25382ms | 25264ms ( -1%) |161209ms (535%) | 91117ms (258%)

100 content elements, 100 runs
nodes |composed |  pre-compiled  |  process pipe  | extension
    1 |   337ms |    53ms (-85%) |   272ms (-20%) |    97ms (-72%)
    2 |   222ms |    70ms (-69%) |   393ms ( 76%) |   117ms (-48%)
    4 |   242ms |   105ms (-57%) |   635ms (162%) |   159ms (-35%)
    6 |   268ms |   140ms (-48%) |   970ms (262%) |   179ms (-34%)
    8 |   263ms |   159ms (-40%) |  1106ms (320%) |   232ms (-12%)
   10 |   276ms |   186ms (-33%) |  1338ms (384%) |   278ms (  0%)
   12 |   308ms |   221ms (-29%) |  1584ms (414%) |   328ms (  6%)
   15 |   350ms |   270ms (-23%) |  1962ms (460%) |   407ms ( 16%)
   20 |   438ms |   355ms (-20%) |  2603ms (493%) |   529ms ( 20%)
   40 |   751ms |   665ms (-12%) |  5112ms (580%) |  1039ms ( 38%)
   60 |  1094ms |  1000ms ( -9%) |  7612ms (595%) |  1552ms ( 41%)
   80 |  1440ms |  1323ms ( -9%) | 10147ms (604%) |  2085ms ( 44%)
  100 |  1798ms |  1718ms ( -5%) | 12636ms (602%) |  2585ms ( 43%)
  200 |  3539ms |  3418ms ( -4%) | 25423ms (618%) |  5249ms ( 48%)
  400 |  7085ms |  7086ms (  0%) | 51728ms (630%) | 10561ms ( 49%)
  600 | 10931ms | 10822ms ( -1%) | 77671ms (610%) | 15795ms ( 44%)
  800 | 14680ms | 14577ms ( -1%) |103519ms (605%) | 21147ms ( 44%)
 1000 | 18560ms | 18817ms (  1%) |129679ms (598%) | 26345ms ( 41%)
 2000 | 36419ms | 36735ms (  0%) |258503ms (609%) | 52402ms ( 43%)
 3000 | 55917ms | 55606ms ( -1%) |384785ms (588%) | 77508ms ( 38%)
 4000 | 73902ms | 73556ms ( -1%) |519067ms (602%) |103688ms ( 40%)
 6000 |108664ms |109314ms (  0%) |781070ms (618%) |152491ms ( 40%)
 8000 |145420ms |143874ms ( -2%) |1039416ms (614%) |208028ms ( 43%)
 9999 |184118ms |184185ms (  0%) |1340946ms (628%) |259675ms ( 41%)

nodes: Number of input test elements 1..9999 (per namespace => 2..19998 child nodes).
composed: Approach 1 with one XSLT importing 2 others. Scans the input for namespaces. This is the baseline (100%) for the percentages in the other columns.
pre-compiled: Run 1a with one static, pre-compiled XSLT for comparison. Still scans the input for namespaces.
process pipe: Approach 2 with 4-stage XSLTs. Scans the input for namespaces.
extension: Approach 3 with one XSLT for the root namespace and with a separate transformation for every test element.
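
The percentages in parentheses appear to be each variant's time relative to the composed time in the same row. A minimal Java sketch of that arithmetic, assuming this reading is correct (the class and method names are made up for illustration). Recomputing from the rounded millisecond values can differ from the published figure by a point or so in some rows, so the original numbers were probably derived from unrounded timings.

```java
/** Hypothetical helper that recomputes the percentage column of the result tables. */
final class RelativeDiff {

    /** Difference of a variant's time to the baseline (composed) time, in percent. */
    static double relativeDiffPercent(long baselineMs, long variantMs) {
        return (variantMs - baselineMs) * 100.0 / baselineMs;
    }

    public static void main(String[] args) {
        // Row "9999 nodes, 0 content elements": composed = 8368ms is the baseline.
        long composed = 8368;
        System.out.printf("pre-compiled: %d%%%n", Math.round(relativeDiffPercent(composed, 8234)));
        System.out.printf("process pipe: %d%%%n", Math.round(relativeDiffPercent(composed, 22611)));
        System.out.printf("extension:    %d%%%n", Math.round(relativeDiffPercent(composed, 68887)));
    }
}
```

For this row the rounded results are -2, 170 and 723, which match the table.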

Noticed
Using a DOMSource slows things down, especially with large documents. In the previous implementation the per-element transformation received its input element directly as a TinyTree, so approach 3 had an advantage over the other approaches, which had to deal with wrapped DOM nodes.

The results for large documents now make more sense. The per-element transformation no longer suddenly outperforms the one-stylesheet approach, and in all the tests the pre-compiled approach only pays off for smaller documents: from 1000 test elements on, its gain over the composed stylesheet is roughly 10% or less.

As expected, approach 3 benefits from a low foreign/root namespace ratio.

Conclusions
Approach 3 is now a good general solution for our problem. Although it performs worse than approach 1 on large documents, it is faster on small ones.
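
In the tables above, the extension column changes from faster to slower than the composed column between 10 and 12 test elements in every test series. A minimal Java sketch of a selection rule based on that observation follows. The class, the enum and the threshold constant are assumptions for illustration and not part of the test code; the crossover would have to be re-measured for other stylesheets and inputs.

```java
/** Hypothetical selection between approach 1 (composed) and approach 3 (extension). */
final class ApproachChooser {

    enum Approach { COMPOSED, EXTENSION }

    /** Assumed crossover, read off the result tables: test elements per namespace. */
    static final int CROSSOVER_NODES = 10;

    /** Picks the per-element approach for small inputs, the composed stylesheet otherwise. */
    static Approach choose(int testElementsPerNamespace) {
        return testElementsPerNamespace <= CROSSOVER_NODES
                ? Approach.EXTENSION
                : Approach.COMPOSED;
    }

    public static void main(String[] args) {
        System.out.println(choose(4));     // small document
        System.out.println(choose(1000));  // large document
    }
}
```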

The best solution is a combination of approach 1 (for large documents) and approach 3 (for smaller ones). Again, a cache of pre-compiled stylesheets for frequently used namespace sets further speeds up the processing of small documents.
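
A cache of compiled stylesheets keyed by namespace set could look roughly like the following Java sketch. It uses only the standard JAXP API (Templates, TransformerFactory) rather than the code from the test project. The class name, the composer function and the identity stylesheet used as a stand-in are assumptions for illustration; a real composer would generate the stylesheet that imports one XSLT per namespace.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical sketch: compiled stylesheets cached per set of input namespaces. */
final class NamespaceTemplatesCache {

    /** Stand-in stylesheet (identity copy) used instead of a real composed XSLT. */
    static final String IDENTITY_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='@*|node()'><xsl:copy>"
        + "<xsl:apply-templates select='@*|node()'/></xsl:copy></xsl:template>"
        + "</xsl:stylesheet>";

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<Set<String>, Templates> cache = new HashMap<>();
    private final Function<Set<String>, Source> composer;

    NamespaceTemplatesCache(Function<Set<String>, Source> composer) {
        this.composer = composer;
    }

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse input", e);
        }
    }

    /** Scans the node and its descendants for element namespace URIs (attributes are ignored). */
    static Set<String> scanNamespaces(Node node) {
        Set<String> found = new TreeSet<>();
        collect(node, found);
        return found;
    }

    private static void collect(Node node, Set<String> found) {
        if (node.getNodeType() == Node.ELEMENT_NODE && node.getNamespaceURI() != null) {
            found.add(node.getNamespaceURI());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, found);
        }
    }

    /** Returns the compiled stylesheet for this namespace set, composing it on first use only. */
    synchronized Templates templatesFor(Set<String> namespaces) {
        Set<String> key = Collections.unmodifiableSet(new TreeSet<>(namespaces));
        Templates compiled = cache.get(key);
        if (compiled == null) {
            try {
                compiled = factory.newTemplates(composer.apply(key));
            } catch (Exception e) {
                throw new IllegalStateException("cannot compile stylesheet for " + key, e);
            }
            cache.put(key, compiled);
        }
        return compiled;
    }

    public static void main(String[] args) {
        NamespaceTemplatesCache cache = new NamespaceTemplatesCache(
            ns -> new StreamSource(new StringReader(IDENTITY_XSLT)));
        Document doc = parse(
            "<r:root xmlns:r='urn:root' xmlns:a='urn:a'><a:item/><a:item/></r:root>");
        Set<String> namespaces = scanNamespaces(doc);
        Templates first = cache.templatesFor(namespaces);
        Templates second = cache.templatesFor(namespaces);
        System.out.println(namespaces + " reused: " + (first == second));
    }
}
```

Each run would then call newTransformer() on the cached Templates object. Templates instances are meant to be shared between threads, while the Transformer objects created from them are not.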
