Defcon Defcon - 26 days ago 18
Java Question

Xml parsing on Apache Kafka

I am using Apache Kafka to read in multiple XML files. I want to convert the XML files into a flat file (a CSV file or a text file). Example inputs and the output I want are below.

I think parsing the XML into a DOM is one option. Would the Jackson XML data converter be a better choice?

Can anyone comment on the best solution to achieve this? Thanks!

Input 1:

<?xml version="1.0" encoding="UTF-8"?>
<customer>
<id>123</id>
<firstName>Jane</firstName>
<phoneNumbers type="work">555-1111</phoneNumbers>
</customer>


Input 2:

<?xml version="1.0" encoding="UTF-8"?>
<customer>
<id>1234</id>
<firstName>Bob</firstName>
<phoneNumbers type="work">555-1111</phoneNumbers>
</customer>


Output:

<?xml version="1.0" encoding="UTF-8"?><customer><id>123</id><firstName>Jane</firstName><phoneNumbers type="work">555-1111</phoneNumbers></customer>

<?xml version="1.0" encoding="UTF-8"?><customer><id>1234</id><firstName>Bob</firstName><phoneNumbers type="work">555-1111</phoneNumbers></customer>

Answer

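Good question. One way to do it is with a bash script: it loops over the XML files in the current directory, uses xmllint to pull each field out by XPath, and appends one comma-separated line per file to combined.csv. Note that it works on files on disk, not on messages read from Kafka.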

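#!/bin/bash
# Build combined.csv with one "id,firstName,phoneNumber" line per XML file.

# Start with an empty output file.
: > combined.csv

for xml in *.xml
do
  echo "Processing $xml"
  # string(...) returns the text content of the first matching node.
  id=$(xmllint --xpath "string(//customer/id)" "$xml")
  firstname=$(xmllint --xpath "string(//customer/firstName)" "$xml")
  phonenumber=$(xmllint --xpath "string(//customer/phoneNumbers)" "$xml")
  # Pass the values as printf arguments so a "%" or "\" in the data is not read as format syntax.
  printf '%s,%s,%s\n' "$id" "$firstname" "$phonenumber" >> combined.csv
done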
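
Since the question is tagged Java and mentions DOM, here is a rough Java sketch of the same idea using the JDK's built-in DOM parser. It assumes each message is one complete <customer> document already available as a String (for example the value of a Kafka record, which is not shown here). The class name CustomerXmlToCsv and the helpers text and toCsvLine are made up for illustration, and the values are joined without CSV quoting, so a field that contains a comma would need extra handling.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class: turns one <customer> XML document into one CSV line.
class CustomerXmlToCsv {

    // Returns the trimmed text of the first <tag> element under root, or "" if there is none.
    static String text(Element root, String tag) {
        NodeList nodes = root.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Parses the XML string and returns "id,firstName,phoneNumbers" (no CSV quoting).
    static String toCsvLine(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations, because the XML arrives from an external source.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element customer = doc.getDocumentElement();
        return String.join(",",
                text(customer, "id"),
                text(customer, "firstName"),
                text(customer, "phoneNumbers"));
    }

    public static void main(String[] args) throws Exception {
        // Input 1 from the question, as a single string.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<customer>\n"
                + "<id>123</id>\n"
                + "<firstName>Jane</firstName>\n"
                + "<phoneNumbers type=\"work\">555-1111</phoneNumbers>\n"
                + "</customer>\n";
        System.out.println(toCsvLine(xml)); // prints 123,Jane,555-1111
    }
}
```

In a consumer loop you would call toCsvLine on each record value and append the result to the output file. DOM is fine for small documents like these. For very large messages a streaming parser would likely use less memory.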