Defcon Defcon - 4 months ago 62
Java Question

Xml parsing on Apache Kafka

I am using Apache Kafka to read in multiple XML files. I want to convert the XML files into a flat file (a CSV or text file). Example data is shown below.

I think parsing the XML into a DOM is one solution. Would the Jackson XML data converter be another?

Can anyone comment on the best solution to achieve this? Thanks!

Input 1:

<?xml version="1.0" encoding="UTF-8"?>
<phoneNumbers type="work">555-1111</phoneNumbers>

Input 2:

<?xml version="1.0" encoding="UTF-8"?>
<phoneNumbers type="work">555-1111</phoneNumbers>


<?xml version="1.0" encoding="UTF-8"?><customer><id>123</id><firstName>Jane</firstName><phoneNumbers type="work">555-1234</phoneNumbers></customer>

<?xml version="1.0" encoding="UTF-8"?><customer><id>1234</id><firstName>Bob</firstName><phoneNumbers type="work">555-1111</phoneNumbers></customer>


Good question. One way to do it is with a bash script that uses xmllint; see below.


# Extract id, firstName and phoneNumbers from each customer file
# and append them to combined.csv as one comma-separated row.
for xml in *.xml; do
  echo "Processing $xml"
  id=$(xmllint --xpath "string(//customer/id)" "$xml")
  firstname=$(xmllint --xpath "string(//customer/firstName)" "$xml")
  phonenumber=$(xmllint --xpath "string(//customer/phoneNumbers)" "$xml")
  printf '%s,%s,%s\n' "$id" "$firstname" "$phonenumber" >> combined.csv
done
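
Since the question is tagged Java, here is a minimal sketch of the same extraction using the DOM and XPath APIs that ship with the JDK. The class name CustomerXmlToCsv and the method toCsvLine are made up for illustration, and the sketch assumes each Kafka record value arrives as one complete <customer> document in a String. It does no CSV escaping, so a value containing a comma would need quoting.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: names are illustrative, not from the original post.
final class CustomerXmlToCsv {

    // Parses one <customer> document and returns "id,firstName,phoneNumbers".
    static String toCsvLine(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, Object) returns the string value of the first match,
            // or an empty string when the element is missing.
            String id = xpath.evaluate("/customer/id", doc);
            String firstName = xpath.evaluate("/customer/firstName", doc);
            String phone = xpath.evaluate("/customer/phoneNumbers", doc);
            return String.join(",", id, firstName, phone);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse customer XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample record taken from the question.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><customer><id>123</id>"
                + "<firstName>Jane</firstName>"
                + "<phoneNumbers type=\"work\">555-1234</phoneNumbers></customer>";
        System.out.println(toCsvLine(xml)); // prints 123,Jane,555-1234
    }
}
```

In a consumer loop you would call toCsvLine on each record value and append the result to your output file. That wiring depends on your Kafka setup, so it is left out here.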