Parsing very small XML: beware of overheads

I recently needed to parse some very tiny chunks of XML, for a toy project related to OpenStreetMap data.

The documents ranged from:

  • a single node with 3 or 4 small attributes, to
  • 10-20 nodes with 50-80 attributes in total.

My first reaction was to just use the DOM API. Performance was terrible. I checked the usual suspects: the DocumentBuilderFactory was kept around, and the DocumentBuilder itself was correctly reused between XML documents.
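To be concrete, the setup looked like the sketch below: factory and builder created once and reused for every document. The OSM-like fragment and attribute names are illustrative, not my actual data.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DomReuse {
    public static void main(String[] args) throws Exception {
        // Factory and builder created once, outside the parsing loop
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Illustrative OSM-like fragment: one node, three attributes
        String xml = "<node id=\"1\" lat=\"48.85\" lon=\"2.35\"/>";

        for (int i = 0; i < 3; i++) {
            builder.reset(); // restore the builder before reusing it
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            System.out.println(doc.getDocumentElement().getAttribute("id"));
        }
    }
}
```

Even with this reuse in place, each parse() call on a tiny document was far slower than expected.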

Now, it is common knowledge that “SAX parsing is faster than DOM”. That is true to some extent: DOM definitely can’t handle very large XML, since everything is instantiated at once, but building the DOM itself is often quite quick compared to the actual XML parsing.
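For reference, the SAX route for a job like this is just a small handler; here is a sketch (the element and attribute names are again illustrative):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCount {
    public static void main(String[] args) throws Exception {
        // As with DOM: factory and parser created once, reused per document
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();

        // Illustrative OSM-like fragment
        String xml = "<node id=\"1\" lat=\"48.85\" lon=\"2.35\"/>";

        final int[] attrs = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes attributes) {
                // Count attributes as elements stream by
                attrs[0] += attributes.getLength();
            }
        };
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                     handler);
        System.out.println(attrs[0]); // 3 attributes on the single node
    }
}
```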

So, switching to SAX… no real improvement. Time to reach for the very first performance monitoring tool available: a thread dump, using kill -3:

    java.lang.Thread.State: RUNNABLE
        at java.lang.Throwable.fillInStackTrace(Native Method)
        - locked <0x00000007ad82ba98> (a
        at java.lang.Throwable.<init>(
        at java.lang.Exception.<init>(
        at java.lang.RuntimeException.<init>(
        at SAXParsersTest.timeOne(

And there, basically, is our answer. On very small XML documents, a huge part of the time is actually spent starting up the parser, not parsing. Each call to parse does some entity-management initialization which, somewhere along the way, throws an exception (probably because a property is not present). Throwing exceptions is not something the JIT can optimize away, since each throw captures a full stack trace, so you pay a large overhead on every call.
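The expensive part is fillInStackTrace, which walks the whole call stack every time an exception is constructed; it is the very frame sitting at the top of the thread dump above. A tiny sketch (my own illustration, not the parser’s code) shows the trick some libraries use to avoid that cost: override fillInStackTrace so the exception carries no trace at all.

```java
public class ExceptionCost {
    // An exception that skips the expensive native stack capture
    static class NoStackException extends RuntimeException {
        @Override
        public synchronized Throwable fillInStackTrace() {
            return this; // do not walk the stack
        }
    }

    public static void main(String[] args) {
        // A normal exception captures the whole call stack on construction...
        System.out.println(new RuntimeException("x").getStackTrace().length > 0);
        // ...while the "stackless" one carries no frames at all
        System.out.println(new NoStackException().getStackTrace().length);
    }
}
```

A parser that throws and swallows an internal exception on every parse() call pays for that full stack walk each time, whether or not the exception is ever seen.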

Fortunately, JAXP allows you to easily change the implementation. We can, for example, switch from the default one (Xerces) to the Piccolo implementation. Piccolo publishes benchmarks showing up to a 2x gain on “small” XML (500 bytes).
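Swapping implementations is a one-liner through the standard JAXP lookup: set the javax.xml.parsers.SAXParserFactory system property before the first newInstance() call. The factory class name below is Piccolo’s JAXP entry point as I understand it; verify it against the jar you actually ship.

```java
import javax.xml.parsers.SAXParserFactory;

public class PiccoloSwitch {
    public static void main(String[] args) {
        // Piccolo's JAXP factory class (check against your piccolo.jar version)
        System.setProperty("javax.xml.parsers.SAXParserFactory",
                "com.bluecast.xml.JaxpSAXParserFactory");

        // With piccolo.jar on the classpath, newInstance() now returns
        // Piccolo's factory instead of the JDK default (Xerces).
        System.out.println(System.getProperty("javax.xml.parsers.SAXParserFactory"));
    }
}
```

The same property can be passed on the command line with -D, which avoids touching the code at all.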

What happens on very tiny XML?

  • Times are in milliseconds for 100K loops
  • All loops were warmed up beforehand, to eliminate any JIT effect
  • I’ve added the old Crimson parser for comparison


                              Piccolo   Xerces   factor   Crimson
"empty"  (1 node, 1 attr)          80     1153     14.4       371
tiny     (a few nodes)             81     1161     14.3       368
lessTiny (a few dozen)            490     1654      3.4       958
big      (hundreds)              4015     4638      1.2      5590
veryBig  (several K)            39200    36834      0.9     52240

Yup: for parsing a large number of extremely small XML fragments with only a few nodes, Piccolo is 14 times faster than the default implementation! All taken into account, that led to a 5-10x speedup of my OSM parsing job.
On very big XML, the two implementations are almost head-to-head.

So, I would probably not recommend switching your default implementation across the board, as Xerces is more mature and likely supports more exotic features. But if you have lots of very small XML documents to parse, definitely give Piccolo a try. And beware of startup times and per-call overheads in your code, not only of raw sustained throughput.
