PNS PNS - 4 months ago 24
JSON Question

Dynamic Conversion of XML to JSON

Is there any way (with Java code examples, if possible) to convert, on the fly, an XML input to JSON, without any knowledge of the actual contents and the structure of the XML source (file, string, etc.)?

Say, for instance, that one has a very large XML dataset with unknown structure and multiple nesting levels, stored in a big text file. Reading everything into memory is not possible (for lack of space), and one wants to convert it into JSON directly, i.e., without having to write any code to detect and handle the individual StAX events (e.g., START_ELEMENT, CHARACTERS, END_ELEMENT).

The ideal solution would be to get a Reader or InputStream from the converter, so that, for instance, one supplies the XML file and the Reader or InputStream produces JSON, to be fed to a FileOutputStream, or even directly to a JSON parser like Jackson. If that is not possible, at least a way of progressively reading an XML file, converting to JSON and writing to another file would be an acceptable compromise.

Tools for converting between XML and JSON (e.g., StaxON, JSON-lib, Jettison, XStream) do not seem to do this; they appear to handle only the conversion of a known structure.

EDIT: Getting a Reader or InputStream from an OutputStream or a Writer (which would also cover the "conversion" I spoke of above) can be done in a number of ways, although for best results and "infinite" input sizes multithreading is involved. Solutions are described in this article on Ostermiller.org, and a similar implementation can be found in the IO-Tools library.
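One common way to do what the EDIT describes, using only the JDK, is a pair of piped streams with the producer running on its own thread. The sketch below is an illustration under stated assumptions, not the Ostermiller or IO-Tools implementation: the `PipedConversion` class, the `Producer` interface and the `asInputStream` helper are hypothetical names, and the error handling is deliberately minimal.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

class PipedConversion {

    /** Hypothetical callback: anything that writes its result to an OutputStream. */
    interface Producer {
        void writeTo(OutputStream out) throws Exception;
    }

    /** Runs the producer on a background thread and exposes its output as an InputStream. */
    static InputStream asInputStream(Producer producer) throws IOException {
        PipedInputStream in = new PipedInputStream(8192); // bounded buffer, so memory use stays flat
        PipedOutputStream out = new PipedOutputStream(in);
        Thread worker = new Thread(() -> {
            // Closing the output side is what signals end-of-stream to the reader.
            try (OutputStream o = out) {
                producer.writeTo(o);
            } catch (Exception e) {
                // Sketch only: a real implementation should report the failure to the reading side.
                e.printStackTrace();
            }
        }, "converter");
        worker.setDaemon(true);
        worker.start();
        return in;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in producer that writes a fixed JSON string; a converter would go here instead.
        try (InputStream json = asInputStream(
                out -> out.write("{\"root\":{}}".getBytes(StandardCharsets.UTF_8)))) {
            System.out.println(new String(json.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

The producer blocks whenever the pipe buffer is full, so the input size is limited only by how fast the consumer reads. The two ends must live on different threads, otherwise the pipe can deadlock.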

Answer

Here's a trivial example using Java's built-in StAX implementation to parse the XML and Jettison to produce JSON from it. XMLEventWriter has a convenient add(XMLEventReader) method for bridging a reader to a writer, which makes this very simple:

import org.codehaus.jettison.mapped.MappedXMLOutputFactory;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import java.io.StringReader;
import java.util.HashMap;

public class Main {
    public static void main(String[] args) throws Exception {
        String xml =
            "<root><foo>foo string</foo><bar><x>1</x><y>5</y></bar></root>";
        // Standard StAX reader: pulls XML events one at a time from the source.
        XMLEventReader reader = XMLInputFactory.newInstance()
            .createXMLEventReader(new StringReader(xml));
        // Jettison writer: emits JSON for the events it receives.
        // The map holds XML-namespace-to-JSON-prefix mappings; it is empty here.
        XMLEventWriter writer = new MappedXMLOutputFactory(new HashMap<String, String>())
            .createXMLEventWriter(System.out);
        // Feed every event from the reader into the writer.
        writer.add(reader);
        writer.close();
        reader.close();
    }
}

I've created a self-contained Maven project demonstrating this on GitHub.
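As documented for XMLEventWriter, add(XMLEventReader) is a convenience for looping over the reader and adding each event in turn, which is why nothing about the XML structure has to be known in advance. The sketch below spells that loop out. To stay runnable without Jettison it uses the JDK's own XMLOutputFactory as a stand-in, so it writes XML rather than JSON; the `EventCopy` class and its `copy` method are hypothetical names, and the assumption is that passing a writer created by MappedXMLOutputFactory, as in the example above, would yield JSON instead.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

class EventCopy {

    /** Copies XML events one by one from in to out and returns how many were copied. */
    static int copy(Reader in, Writer out) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        // Stand-in writer (assumption): replace with a Jettison writer to get JSON.
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        int events = 0;
        try {
            // The explicit form of writer.add(reader): no element names are inspected.
            while (reader.hasNext()) {
                writer.add(reader.nextEvent());
                events++;
            }
            writer.flush();
        } finally {
            writer.close();
            reader.close();
        }
        return events;
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        int n = copy(new StringReader("<root><foo>foo string</foo></root>"), out);
        System.out.println(n + " events copied: " + out);
    }
}
```

Because the loop takes a Reader and a Writer, the same method works with a FileReader and a FileWriter, processing one event at a time rather than loading the whole document.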