Thirlan - 3 months ago
Java Question

Jena 3.0.1 and 3.1.0, RDF/XML to JSON-LD missing prefixes

We recently upgraded from Jena 3.0.1 to 3.1.0 and found that Jena now writes JSON-LD differently when serializing a model to a string.

Below is what the JSON-LD looks like with Jena 3.0.1:

{
  "@graph" : [ {
    "@id" : "data:4d1a75b0-484f-4dfa-998f-4382f34e411f",
    "@type" : "assertion:assertion",
    "data:UUID" : "4d1a75b0-484f-4dfa-998f-4382f34e411f"
  }, {
    "@id" : "data:UUID",
    "@type" : "owl:DatatypeProperty",
    "rdfs:label" : {
      "@language" : "en",
      "@value" : "UUID"
    }
  }, {
    "@id" : "urn:example.data.1.0",
    "@type" : "owl:Ontology",
    "rdfs:comment" : {
      "@language" : "en",
      "@value" : "This is an OWL ontology to describe data."
    },
    "rdfs:label" : {
      "@language" : "en",
      "@value" : "Data ontology"
    },
    "owl:versionInfo" : "1.0"
  }, {
    "@id" : "assertion:assertion",
    "@type" : "owl:Class",
    "subClassOf" : "data:entity"
  } ],
  "@context" : {
    "comment" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#comment",
      "@type" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
    },
    "label" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#label",
      "@type" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
    },
    "versionInfo" : {
      "@id" : "http://www.w3.org/2002/07/owl#versionInfo",
      "@type" : "http://www.w3.org/2001/XMLSchema#string"
    },
    "UUID" : {
      "@id" : "urn:example.data#UUID",
      "@type" : "http://www.w3.org/2001/XMLSchema#string"
    },
    "subClassOf" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#subClassOf",
      "@type" : "@id"
    },
    "data" : "urn:example.data#",
    "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl" : "http://www.w3.org/2002/07/owl#",
    "xsd" : "http://www.w3.org/2001/XMLSchema#",
    "rdfs" : "http://www.w3.org/2000/01/rdf-schema#",
    "assertion" : "urn:example.data.assertion#"
  }
}


Below is what the JSON-LD looks like in Jena 3.1.0:

{
  "@graph" : [ {
    "@id" : "data:4d1a75b0-484f-4dfa-998f-4382f34e411f",
    "@type" : "assertion:assertion",
    "UUID" : "4d1a75b0-484f-4dfa-998f-4382f34e411f"
  }, {
    "@id" : "data:UUID",
    "@type" : "owl:DatatypeProperty",
    "label" : {
      "@language" : "en",
      "@value" : "UUID"
    }
  }, {
    "@id" : "urn:example.data.1.0",
    "@type" : "owl:Ontology",
    "comment" : {
      "@language" : "en",
      "@value" : "This is an OWL ontology to describe data."
    },
    "label" : {
      "@language" : "en",
      "@value" : "Data ontology"
    },
    "versionInfo" : "1.0"
  }, {
    "@id" : "assertion:assertion",
    "@type" : "owl:Class",
    "subClassOf" : "data:entity"
  } ],
  "@context" : {
    "comment" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#comment"
    },
    "label" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#label"
    },
    "versionInfo" : {
      "@id" : "http://www.w3.org/2002/07/owl#versionInfo"
    },
    "UUID" : {
      "@id" : "urn:example.data#UUID"
    },
    "subClassOf" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#subClassOf",
      "@type" : "@id"
    },
    "data" : "urn:example.data#",
    "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl" : "http://www.w3.org/2002/07/owl#",
    "xsd" : "http://www.w3.org/2001/XMLSchema#",
    "rdfs" : "http://www.w3.org/2000/01/rdf-schema#",
    "assertion" : "urn:example.data.assertion#"
  }
}


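The difference between the two is that the namespace prefixes data:, rdfs: and owl: no longer appear on property keys such as UUID, label, comment and versionInfo. The "@type" entries for those terms have also disappeared from the @context.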

The JSON-LD is valid according to Jena, but unfortunately we need to send it to a server that expects those prefixes to be there.
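To illustrate why both outputs count as equivalent JSON-LD, here is a minimal sketch of how a key is expanded against the @context. It is a deliberately simplified illustration using plain maps, not the algorithm Jena or jsonld-java actually run; the class name JsonLdKeyExpansion and the expand helper are hypothetical names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified illustration only: real JSON-LD expansion handles many more cases.
final class JsonLdKeyExpansion {

    // terms:    bare term -> full IRI (the "@id" of a term definition in @context)
    // prefixes: prefix    -> namespace IRI (the plain string entries in @context)
    static String expand(String key, Map<String, String> terms, Map<String, String> prefixes) {
        // A bare term such as "label" is looked up directly.
        if (terms.containsKey(key)) {
            return terms.get(key);
        }
        // A compact IRI such as "rdfs:label" is split at the first colon.
        int colon = key.indexOf(':');
        if (colon > 0) {
            String namespace = prefixes.get(key.substring(0, colon));
            if (namespace != null) {
                return namespace + key.substring(colon + 1);
            }
        }
        // Anything else is returned unchanged in this sketch.
        return key;
    }

    public static void main(String[] args) {
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("label", "http://www.w3.org/2000/01/rdf-schema#label");
        terms.put("UUID", "urn:example.data#UUID");

        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("rdfs", "http://www.w3.org/2000/01/rdf-schema#");
        prefixes.put("data", "urn:example.data#");

        // The 3.1.0 key and the 3.0.1 key expand to the same IRI.
        System.out.println(expand("label", terms, prefixes));
        System.out.println(expand("rdfs:label", terms, prefixes));
    }
}
```

Both calls print http://www.w3.org/2000/01/rdf-schema#label, which is why a JSON-LD aware consumer treats the two documents alike, while a server that matches on the literal key text does not.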

Is there anything we can do to control the output? We are not experts with Jena, so please handle us with kid gloves :(

Below is the original message in XML format:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:data="urn:example.data#"
    xmlns:assertion="urn:example.data.assertion#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="urn:example.data.1.0">
    <owl:versionInfo>1.0</owl:versionInfo>
    <rdfs:label xml:lang="en">Data ontology</rdfs:label>
    <rdfs:comment xml:lang="en">This is an OWL ontology to describe data.</rdfs:comment>
  </owl:Ontology>
  <owl:Class rdf:about="urn:example.data.assertion#assertion">
    <rdfs:subClassOf rdf:resource="urn:example.data#entity"/>
  </owl:Class>
  <owl:DatatypeProperty rdf:about="urn:example.data#UUID">
    <rdfs:label xml:lang="en">UUID</rdfs:label>
  </owl:DatatypeProperty>
  <assertion:assertion rdf:about="urn:example.data#4d1a75b0-484f-4dfa-998f-4382f34e411f">
    <data:UUID>4d1a75b0-484f-4dfa-998f-4382f34e411f</data:UUID>
  </assertion:assertion>
</rdf:RDF>


And below is the minimal code we are running:

InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("convert-xml-json-test/temp.xml");
String inputXml = IOUtils.toString(inputStream);

// Convert the XML to RDF model
StringReader stringReader = new StringReader(inputXml);
Model model = ModelFactory.createDefaultModel();
model.read(stringReader, null, RDFLanguages.RDFXML.getLabel());

// Convert the model to JSON String
ByteArrayOutputStream out = new ByteArrayOutputStream();
model.write(out, RDFLanguages.JSONLD.getLabel());
String outputJson = out.toString(StandardCharsets.UTF_8.name());


We are extremely confident that this is due to a change in Jena, since our minimal test project only includes Jena, as seen in the mvn dependency:tree output below:

+- org.apache.jena:jena-tdb:jar:3.1.0:compile
|  +- org.apache.jena:jena-arq:jar:3.1.0:compile
|  |  +- org.apache.jena:jena-core:jar:3.1.0:compile
|  |  |  +- org.apache.jena:jena-iri:jar:3.1.0:compile
|  |  |  +- xerces:xercesImpl:jar:2.11.0:compile
|  |  |  |  \- xml-apis:xml-apis:jar:1.4.01:compile
|  |  |  +- commons-cli:commons-cli:jar:1.3:compile
|  |  |  \- org.apache.jena:jena-base:jar:3.1.0:compile
|  |  |     \- com.github.andrewoma.dexx:collection:jar:0.6:compile
|  |  +- org.apache.jena:jena-shaded-guava:jar:3.1.0:compile
|  |  +- org.apache.httpcomponents:httpclient:jar:4.2.6:compile
|  |  |  +- org.apache.httpcomponents:httpcore:jar:4.2.5:compile
|  |  |  \- commons-codec:commons-codec:jar:1.6:compile
|  |  +- com.github.jsonld-java:jsonld-java:jar:0.7.0:compile
|  |  |  +- com.fasterxml.jackson.core:jackson-core:jar:2.3.3:compile
|  |  |  +- com.fasterxml.jackson.core:jackson-databind:jar:2.3.3:compile
|  |  |  |  \- com.fasterxml.jackson.core:jackson-annotations:jar:2.3.0:compile
|  |  |  \- commons-io:commons-io:jar:2.4:compile
|  |  +- org.apache.httpcomponents:httpclient-cache:jar:4.2.6:compile
|  |  +- org.apache.thrift:libthrift:jar:0.9.2:compile
|  |  +- org.slf4j:jcl-over-slf4j:jar:1.7.20:compile
|  |  +- org.apache.commons:commons-csv:jar:1.0:compile
|  |  \- org.apache.commons:commons-lang3:jar:3.3.2:compile
|  \- org.slf4j:slf4j-api:jar:1.7.20:compile
\- junit:junit:jar:4.11:test
   \- org.hamcrest:hamcrest-core:jar:1.3:test

Answer

This is the code we created to resolve our prefix issue. We've sent 100+ messages with no errors. The code works by first reading the @context section and building a mapping from each unprefixed term to its prefixed form (for example label to rdfs:label). It then traverses the @graph and renames the matching keys.

package utils.helper;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;


public class ReapplyJenaPrefixes {

    public String reapplyPrefixes(String jsonString) throws IOException {
        ObjectMapper jacksonParser = new ObjectMapper();
        JsonNode jsonNode = jacksonParser.readTree(jsonString);

        Map<String, String> prefixMap = buildPrefixedTagMap(jsonNode);
        applyPrefixes(jsonNode, prefixMap);

        return jacksonParser.writeValueAsString(jsonNode);
    }

    public void reapplyPrefixes(JsonNode node) {
        Map<String, String> prefixMap = buildPrefixedTagMap(node);
        applyPrefixes(node, prefixMap);
    }

    private Map<String, String> buildPrefixedTagMap(JsonNode node) {
        Map<String, Boolean> filteredWords = new HashMap<String, Boolean>();
        filteredWords.put("subClassOf", true);

        JsonNode contextNode = node.get("@context");
        List<Entry<String, String>> tagList = new ArrayList<Entry<String, String>>();
        Map<String, String> prefixMap = new HashMap<String, String>();
        Map<String, String> prefixedTagMap = new HashMap<String, String>();
        Iterator<Entry<String, JsonNode>> iterator = contextNode.fields();

        JsonNode currentNode;
        String currentNodeName;
        EntryImpl<String, String> tagEntry;
        while(iterator.hasNext()) {
            Entry<String, JsonNode> e = iterator.next();
            currentNode = e.getValue();
            currentNodeName = e.getKey();
            if(currentNode.isTextual()) {
                prefixMap.put(currentNode.textValue(), currentNodeName);

            } else if(!filteredWords.containsKey(currentNodeName)) {
                tagEntry = new EntryImpl<String, String>(currentNodeName, currentNode.get("@id").asText());
                tagList.add(tagEntry);
            }
        }

        String tagName;
        String namespace;
        String prefix;
        for(Entry<String, String> e : tagList) {
            tagName = e.getKey();
            namespace = e.getValue();

            // strip the tagName
            namespace = namespace.substring(0, namespace.length() - tagName.length());

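            // look up the prefix; skip terms whose namespace has no declared prefix,
            // otherwise the key would be renamed to "null:<tagName>"
            prefix = prefixMap.get(namespace);
            if(prefix == null) {
                continue;
            }

            prefixedTagMap.put(tagName, prefix+":"+tagName);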
        }

        return prefixedTagMap;
    }

    private void applyPrefixes(JsonNode node, Map<String, String> prefixMap) {
        JsonNode contextNode = node.get("@graph");

        ObjectNode currentNode = null;
        String prefixedTag = null;
        String fieldName = null;

        JsonNode topLevelFieldNode = null;
        Iterator<String> topLevelFieldNameIterator = null;
        List<String> topLevelFieldNameList;

        JsonNode subLevelFieldNode = null;
        Iterator<String> subLevelFieldNameIterator = null;
        List<String> subLevelFieldNameList;

        Iterator<JsonNode> arrayIterator = contextNode.elements();
        while(arrayIterator.hasNext()) {
            currentNode = (ObjectNode)arrayIterator.next();

            // Can't modify an iterator while iterating so store the field names in a list first
            topLevelFieldNameIterator = currentNode.fieldNames();
            topLevelFieldNameList = new ArrayList<String>();
            while(topLevelFieldNameIterator.hasNext()) {
                fieldName = topLevelFieldNameIterator.next();
                if(fieldName.charAt(0) != '@') {
                    topLevelFieldNameList.add(fieldName);
                }
            }

            for(String topLevelFieldName : topLevelFieldNameList) {

                topLevelFieldNode = currentNode.get(topLevelFieldName);

                prefixedTag = prefixMap.get(topLevelFieldName);
                if(prefixedTag != null

                        // Data tags don't seem to have prefixes on them
                        && (topLevelFieldNode.isTextual()
                        && !topLevelFieldNode.textValue().startsWith("data:"))) {
                    currentNode.remove(topLevelFieldName);
                    currentNode.set(prefixedTag, topLevelFieldNode);
                }

                if(topLevelFieldNode.isObject()) {
                    // Can't modify an iterator while iterating so store the field names in a list first
                    subLevelFieldNameIterator = topLevelFieldNode.fieldNames();
                    subLevelFieldNameList = new ArrayList<String>();
                    while(subLevelFieldNameIterator.hasNext()) {
                        fieldName = subLevelFieldNameIterator.next();
                        if(fieldName.charAt(0) != '@') {
                            subLevelFieldNameList.add(fieldName);
                        }
                    }

                    for(String subLevelFieldName : subLevelFieldNameList) {
                        subLevelFieldNode = topLevelFieldNode.get(subLevelFieldName);

                        // look up the nested field's own name, not its parent's
                        prefixedTag = prefixMap.get(subLevelFieldName);
                        if(prefixedTag != null) {
                            ((ObjectNode)topLevelFieldNode).remove(subLevelFieldName);
                            ((ObjectNode)topLevelFieldNode).set(prefixedTag, subLevelFieldNode);
                        }
                    }
                }
            }
        }
    }

    private static class EntryImpl<K, V> implements Entry<K, V> {

        private K k;
        private V v;

        public EntryImpl(K k, V v) {
            this.k = k;
            this.v = v;
        }

        @Override
        public K getKey() {
            return k;
        }

        @Override
        public V getValue() {
            return v;
        }

        @Override
        public V setValue(V value) {
            V oldV = v;
            v = value;
            return oldV;
        }

    }
}
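
The mapping step described at the top of this answer can be tried in isolation, without Jackson. The following is a minimal sketch under that assumption: the @context is represented as two plain maps rather than a JsonNode, and the class name PrefixedTermSketch and method buildPrefixedTerms are hypothetical names used only for this illustration. Unlike a direct lookup, it skips any term whose namespace has no declared prefix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of the "unprefixed term -> prefixed term" mapping step only.
final class PrefixedTermSketch {

    // termToIri:         bare term -> full IRI, e.g. "label" -> ".../rdf-schema#label"
    // prefixToNamespace: prefix    -> namespace, e.g. "rdfs" -> ".../rdf-schema#"
    static Map<String, String> buildPrefixedTerms(Map<String, String> termToIri,
                                                  Map<String, String> prefixToNamespace) {
        // Invert prefix -> namespace so a namespace can be looked up directly.
        Map<String, String> namespaceToPrefix = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : prefixToNamespace.entrySet()) {
            namespaceToPrefix.put(e.getValue(), e.getKey());
        }

        Map<String, String> prefixed = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : termToIri.entrySet()) {
            String term = e.getKey();
            String iri = e.getValue();
            // Only handle IRIs of the form <namespace><term>.
            if (!iri.endsWith(term)) {
                continue;
            }
            String namespace = iri.substring(0, iri.length() - term.length());
            String prefix = namespaceToPrefix.get(namespace);
            // Skip terms whose namespace has no declared prefix.
            if (prefix != null) {
                prefixed.put(term, prefix + ":" + term);
            }
        }
        return prefixed;
    }

    public static void main(String[] args) {
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("label", "http://www.w3.org/2000/01/rdf-schema#label");
        terms.put("versionInfo", "http://www.w3.org/2002/07/owl#versionInfo");
        terms.put("UUID", "urn:example.data#UUID");

        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("rdfs", "http://www.w3.org/2000/01/rdf-schema#");
        prefixes.put("owl", "http://www.w3.org/2002/07/owl#");
        prefixes.put("data", "urn:example.data#");

        System.out.println(buildPrefixedTerms(terms, prefixes));
    }
}
```

With the sample context from the question, this maps label to rdfs:label, versionInfo to owl:versionInfo and UUID to data:UUID.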