syndy1989 syndy1989 -3 years ago 155
Groovy Question

How to parse XML using Groovy

I'm new to Groovy XML parsing. I'm trying to parse the XML file below:

<font face=Tahoma size=2>
Team,<br/><br/> Please find below the test summary details for the 'Test' execution.<br/><br/><b><U>Transaction Summary Table:</U></b><br/><br/>
<table border=1 CELLPADDING =3 style='font-family:Tahoma;font-size:12'>
<tr>
<b>
<th bgcolor=#C0C0C0> TransactionName </th>
<th bgcolor=#C0C0C0> AverageLatency </th>
<th bgcolor=#C0C0C0> MinimumLatency </th>
<th bgcolor=#C0C0C0> MaximumLatency </th>
<th bgcolor=#C0C0C0> AverageElapsedTime </th>
<th bgcolor=#C0C0C0> MinimumElapsedTime </th>
<th bgcolor=#C0C0C0> MaximumElapsedTime </th>
<th bgcolor=#C0C0C0> TotalCount </th>
<th bgcolor=#C0C0C0> PassPercentage </th>
</b>
</tr>
<tr>
<td>1 /aumentum/</td>
<td>
<center>1648.0</center>
</td>
<td>
<center>1240</center>
</td>
<td>
<center>2900</center>
</td>
<td>
<center>1907.0</center>
</td>
<td>
<center>1495</center>
</td>
<td>
<center>3140</center>
</td>
<td>
<center>45</center>
</td>
<td>
<center>100.0</center>
</td>
</tr>
<tr>
<td>T01_Aumentum_Home</td>
<td>
<center>6.0</center>
</td>
<td>
<center>1</center>
</td>
<td>
<center>10</center>
</td>
<td>
<center>1956.0</center>
</td>
<td>
<center>1490</center>
</td>
<td>
<center>3806</center>
</td>
<td>
<center>213</center>
</td>
<td>
<center>0.0</center>
</td>
</tr>
</tbody>
</table>
<br/><br/>Thanks,<br/>Performance Team.
</font>
<br/><br/>


Expected Result:

[{
"transaction name":"1 /aumentum/",
"AverageLatency ":"1648.0",
"Minimum latency":"1240",
"MaximumLatency ":"2900",
"AverageElapsedTime":"1907.0",
"MinimumElapsedTime":"1495",
"MaximumElapsedTime":"3140",
"TotalCount":"45",
"PassPercentage":"100.0"
},
{
"transaction name": "1 /aumentum/",
"AverageLatency ":"1648.0",
"Minimum latency":"1240",
"MaximumLatency ":"2900",
"AverageElapsedTime":"1907.0",
"MinimumElapsedTime":"1495",
"MaximumElapsedTime":"3140",
"TotalCount":"45",
"PassPercentage":"100.0"

}]


I got the children of the first row using:
docParser.getElementsByTag("tr").first()


Here is the error I get:

Exception thrown
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at org.jsoup.select.Elements.get(Elements.java:519)
at org.jsoup.nodes.Element.child(Element.java:174)
at org.jsoup.nodes.Element$child$0.call(Unknown Source)
at CommonUtils.parseLRHTMLReport(jmeteragent.groovy:304)
at CommonUtils$parseLRHTMLReport.call(Unknown Source)


Here is what I have done so far:

def transactiondetails12 = null
def iterator12 = 0
int count1 = 0
def violcounts = 0
def violations = null;

tmpElement = docParser.getElementsByTag("tr").first()
println tmpElement.children()
// tmpElement= tmpElement.child(0)
// println "#########tmpElement#########:" +tmpElement


for (element in tmpElement.children()) {
if (iterator12 == 0) {
// transactiondetails1 = "<table border=1 CELLPADDING =3 style='font-family:Tahoma;font-size:12'><tr><b><th bgcolor=#C0C0C0>" +
element.child(0).text().trim() + "</th><th bgcolor=#C0C0C0>" + element.child(2).text().trim() + "</th><th bgcolor=#C0C0C0>" +
element.child(3).text().trim() + "</th><th bgcolor=#C0C0C0>" + element.child(4).text().trim() + "</th></b></tr>"
iterator12 = 1;
count1++;
// println "nqwlieufrh 2938ry `9p23dhWCDNJ p3fu89 Q2390RUD"+transactiondetails1
} else {
count1++;
if (count1 <= 5) {

// println "iterator1iterator1iterator1iterator1"+iterator1++
transactiondetails12 = transactiondetails12 + "<tr><td>" + element.child(0).text().trim() + "</td><td><center>" +
element.child(2).text().trim() + "</center></td><td><center>" +
element.child(3).text().trim() + "</center></td><td><center>" +
element.child(4).text().trim()
println "transactiondetails12" + transactiondetails12
// println "3215463654156436212315465123011482145634217225445622341"+element.child(4).text().trim()
String violation1 = element.child(1).text()
// violation=Integer.valueOf(violation1)
// violation=Integer.parseInt(violation1)

// if(violation1>=0)
if (violation1.length() > 0) {
violcounts++
}


}
}

}


I have no idea how to map the tmpElement.children() values. Any advice on this would be helpful. Thanks in advance.

Answer Source

The sample you provided uses the jsoup library, which is designed for HTML DOM manipulation. The solution to your problem is to use the right selectors to extract the data.

Consider the following example:

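import groovy.json.JsonOutput
import org.jsoup.nodes.Element

// docParser is the parsed jsoup document from the question
// Header texts, in column order
def headers = docParser.select("tr > th").collect { it.text() }
def result = []

// Visit only the rows that contain data cells
docParser.select("tr:has(td)").each { tr ->
    def obj = [:]
    tr.select("td").eachWithIndex { Element td, int i ->
        obj[headers[i]] = td.text()   // key each cell by its column header
    }
    result << obj
}

println JsonOutput.prettyPrint(JsonOutput.toJson(result))
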
  • docParser.select("tr > th").collect { it.text() } collects the table headers into an ordered List<String>
  • docParser.select("tr:has(td)") selects every row that contains data cells, which excludes the header row
  • tr.select("td").eachWithIndex iterates over the cells of each row and associates each value with its header by index i
  • the last line prints the desired output to the console

Output:

[
    {
        "TransactionName": "1 /aumentum/",
        "AverageLatency": "1648.0",
        "MinimumLatency": "1240",
        "MaximumLatency": "2900",
        "AverageElapsedTime": "1907.0",
        "MinimumElapsedTime": "1495",
        "MaximumElapsedTime": "3140",
        "TotalCount": "45",
        "PassPercentage": "100.0"
    },
    {
        "TransactionName": "T01_Aumentum_Home",
        "AverageLatency": "6.0",
        "MinimumLatency": "1",
        "MaximumLatency": "10",
        "AverageElapsedTime": "1956.0",
        "MinimumElapsedTime": "1490",
        "MaximumElapsedTime": "3806",
        "TotalCount": "213",
        "PassPercentage": "0.0"
    }
]
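
The index-based pairing of headers and cells used above can also be tried without jsoup. Below is a minimal sketch, assuming the header and cell texts have already been extracted into plain lists. The names `headers`, `rows`, and `toRecord` are hypothetical, and the values are a three-column subset of the sample table:

```groovy
import groovy.json.JsonOutput

// Header texts, in column order (a subset of the sample table's headers).
def headers = ['TransactionName', 'AverageLatency', 'TotalCount']

// Cell texts for each data row, in the same column order.
def rows = [
    ['1 /aumentum/', '1648.0', '45'],
    ['T01_Aumentum_Home', '6.0', '213']
]

// Hypothetical helper: pair each cell with the header at the same index.
def toRecord = { List<String> names, List<String> cells ->
    def record = [:]   // a map literal is a LinkedHashMap, so column order is kept
    cells.eachWithIndex { String cell, int i ->
        record[names[i]] = cell
    }
    record
}

def result = rows.collect { toRecord(headers, it) }

println JsonOutput.prettyPrint(JsonOutput.toJson(result))
```

In this sketch a row with more cells than headers would end up with a null key, so real input may call for a size check before pairing.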

Here you can find the full Groovy script I used to experiment with your example: https://gist.github.com/wololock/651a536dff4e104ebba0eef69d4ac3ea

I hope it helps.
