R Question

Parsing XML to a data frame using R

I am sure this question has been answered many times; however, I think my data is a bit unusual. The following is part of my dataset.

<?xml version="1.0" encoding="UTF-8"?>
<IOTModellerLog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" DeviceID="7430180" ClientID="12324" FileCreationDate="2017-03-01T22:40:03" FileVersion="2" EventClassID="65535" IOTLogCreationDate="2017-03-01T12:29:54" SampleID="1" xsi:noNamespaceSchemaLocation="/opt/nds/ams_proxy/webapps/ams_proxy/WEB-INF/amsXmlSchema.xsd">
<Event EventTime="2017-02-27T18:33:58">
<IOTEvent State="PowerOn" />
<Event EventTime="2017-02-28T08:59:03">
<Event EventTime="2017-02-28T08:59:13">

I am trying to convert this into a data frame with the first column being the EventTime. The expected format is as follows:

EventTime            Model  DataType  DataValue
2017-02-28T08:59:13  1      1         0401
2017-02-28T08:59:15  1      5         070707

I have tried the following:

library(XML)
result <- xmlParse("demoxml.xml")
# xmlAttrsToDataFrame is an unexported helper, hence the triple colon
XML:::xmlAttrsToDataFrame(result["//Event"])  # This prints only the EventTime


I am not sure how to get the DataEvent values along with the EventTime into a data.frame.

Can someone help?

Answer Source

I used something like this:

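library(XML)
library(data.table)  # provides rbindlist
result <- xmlParse('text_XML.xml')
result_nodes <- XML::getNodeSet(result, "//IOTModellerLog/Event")
# Flatten each Event node into a one-row data.frame, then stack the rows; fill = TRUE pads missing columns with NA
rbindlist(lapply(result_nodes, function(x) data.frame(as.list(unlist(xmlToList(x))))), use.names = TRUE, fill = TRUE)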

The result looks like this:

   IOTEvent.State    .attrs.EventTime DataEvent.Model DataEvent.DataType DataEvent.DataValue
1:        PowerOn 2017-02-27T18:33:58              NA                 NA                  NA
2:             NA 2017-02-28T08:59:03               1                  1                0301
3:             NA 2017-02-28T08:59:13               1                  1                0401

I assume this is something you can work with :)
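
If you want exactly the four columns from the question, another option is to select the DataEvent nodes directly and read EventTime from each node's parent. This is only a sketch: the inline XML below is a made-up, closed version of the truncated sample, and it assumes each DataEvent is a child of Event carrying Model, DataType and DataValue as attributes (which is what the column names in the output above suggest). The names xml_text, data_nodes, rows and events are just placeholders.

```r
library(XML)

# Hypothetical, well-formed stand-in for the truncated sample in the question
xml_text <- '<IOTModellerLog DeviceID="7430180">
  <Event EventTime="2017-02-27T18:33:58">
    <IOTEvent State="PowerOn" />
  </Event>
  <Event EventTime="2017-02-28T08:59:03">
    <DataEvent Model="1" DataType="1" DataValue="0301" />
  </Event>
  <Event EventTime="2017-02-28T08:59:13">
    <DataEvent Model="1" DataType="1" DataValue="0401" />
  </Event>
</IOTModellerLog>'

doc <- xmlParse(xml_text, asText = TRUE)

# Select only the DataEvent nodes, so Events without one are skipped
data_nodes <- getNodeSet(doc, "//Event/DataEvent")

# One row per DataEvent; EventTime comes from the parent Event node
rows <- lapply(data_nodes, function(node) {
  data.frame(
    EventTime = xmlGetAttr(xmlParent(node), "EventTime"),
    Model     = xmlGetAttr(node, "Model"),
    DataType  = xmlGetAttr(node, "DataType"),
    DataValue = xmlGetAttr(node, "DataValue"),
    stringsAsFactors = FALSE
  )
})

events <- do.call(rbind, rows)
print(events)
```

Everything is kept as character here so that values such as 0401 keep their leading zero; convert Model and DataType with as.integer() afterwards if you need numbers. With a real file you would call xmlParse("demoxml.xml") instead of passing the text.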

