Pramod Almeti - 3 months ago
C# Question

C# SaxonApi.Evaluate taking too long to run XQuery on a 500 MB XML file with 13 million lines

The application is compiled as 64-bit and runs on a machine with 4 CPUs and 16 GB of RAM. SaxonApi.Evaluate accounts for 47 minutes of the total run time (60 minutes) across 3 Evaluate calls on a 500 MB XML file with 13 million lines. Each Evaluate call runs an XQuery that returns 80,000 items, and each item has 20 nodes.

Is there anything we can do to improve the performance of the SaxonApi.Evaluate method?

Answer

Firstly, there are a few anomalies here.

  • The query won't compile because it uses a namespace prefix "x" that hasn't been declared. (But the source document doesn't appear to use a namespace)
  • The query refers to the top-level element as x:top but in the source document it is Top
  • A couple of variables are bound to false when surely false() is intended (Saxon gives a warning about this).
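
The first and third anomalies can be illustrated with a minimal sketch. The namespace URI and the variable name $isFinal are placeholders invented for illustration, since the query as posted shows neither a namespace declaration nor which variables are affected.

```xquery
xquery version "1.0";

(: Hypothetical URI: the real one, if any, must match the source document. :)
declare namespace x = "http://example.com/placeholder";

(: A bare name is a path step: it selects child elements called "false"
   of the context item, which normally yields an empty sequence. :)
(: let $isFinal := false :)

(: A function call is needed to obtain the boolean constant. :)
let $isFinal := false()
return $isFinal
```

If the source document really uses no namespace, the alternative is to drop the "x:" prefix from the path expressions rather than declare it.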

Secondly, there are many variables being declared that are not used. For example, $PartnersCapitalBOY and $genPartnersCapitalBOY. In principle it's easy enough for an optimizer to ignore variables that aren't used, but giving an optimizer unnecessary work to do isn't always a good idea, because it can distract it from finding the patterns where optimization can make a real difference.

Thirdly, I'm suspicious about the repeated use of the construct:

(for) $PartnerInformation at $currentPartnerInformationPos 
 in if(exists($Sch3K1/x:PartnerInformation)) 
    then $Sch3K1/x:PartnerInformation 
    else element{'PartnerInformation'} {''},

The problem here is that constructs that create a new element can't be moved out of a loop, because XQuery is very fussy about the fact that such a construct must create a different element every time it is executed. So (without actually examining what the optimizer does in detail) I would suspect that this construct inhibits the optimizations that are possible.

Fourth, the clauses:

let $prevSch3K1 := ./x:top/x:level1/x:Sch3K1[$currentSch3K1Pos+-1]
let $nextSch3K1 := ./x:top/x:level1/x:Sch3K1[$currentSch3K1Pos+1]

might be more efficient if ./x:top/x:level1/x:Sch3K1 is bound to a global variable.
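
As a rough sketch of that suggestion, the path can be evaluated once in the query prolog and reused inside the loop. This assumes the document is supplied as the initial context item and that the outer loop iterates over the same sequence; the variable name $allSch3K1 and the namespace URI are hypothetical, and the return clause is only there to make the fragment complete.

```xquery
xquery version "1.0";

declare namespace x = "http://example.com/placeholder"; (: hypothetical URI :)

(: Evaluated once against the initial context item (the source document). :)
declare variable $allSch3K1 := ./x:top/x:level1/x:Sch3K1;

for $Sch3K1 at $currentSch3K1Pos in $allSch3K1
(: Spaces around "-" matter: "$currentSch3K1Pos-1" would parse as one variable name. :)
let $prevSch3K1 := $allSch3K1[$currentSch3K1Pos - 1]
let $nextSch3K1 := $allSch3K1[$currentSch3K1Pos + 1]
return count(($prevSch3K1, $nextSch3K1))
```

Indexing into an already materialized sequence avoids walking the path from the document root again for every iteration.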

At first sight your query is absolutely horrendous, with 9 nested loops each iterating over 80K elements: a naive implementation would execute the innermost code around 10^44 times (80,000^9 is roughly 1.3 x 10^44), so if the innermost code takes one nanosecond to execute, the total query would take around 10^35 seconds, which, given that the age of the universe is less than 10^18 seconds, is rather a long time. So if this is running in an hour, the optimizer has done a pretty good job.

The only reason it is able to do such a good job is that so much of the query is obviously pointless.

Looking at the optimizer trace (-explain) I'm actually surprised how few optimizations are being done, and I suspect the main cause of this is the element constructors in the middle of the "for" clauses.

I would start by simplifying the query:

  1. Eliminate all the unused variables
  2. If you really need to create the dummy elements in order to achieve an outer join, create these dummy elements once as global variables rather than creating them repeatedly within the loop.
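
The second step might look something like the following sketch, which keeps the if/then/else from the original query but builds the dummy element once in the prolog. The variable name $dummyPartnerInformation and the namespace URI are hypothetical, and the return clause is a stand-in for whatever the real query does with each item.

```xquery
xquery version "1.0";

declare namespace x = "http://example.com/placeholder"; (: hypothetical URI :)

(: Built once, outside every loop. :)
declare variable $dummyPartnerInformation := element PartnerInformation { '' };

for $Sch3K1 in ./x:top/x:level1/x:Sch3K1
for $PartnerInformation at $currentPartnerInformationPos
    in if (exists($Sch3K1/x:PartnerInformation))
       then $Sch3K1/x:PartnerInformation
       else $dummyPartnerInformation
return string($PartnerInformation)
```

Every Sch3K1 without a PartnerInformation child now shares the same dummy node, which should be harmless unless the query depends on node identity.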

With these changes, the logic may become clearer. I think that in essence, it's actually a very simple query.