user1163234 - 5 months ago
Java Question

How to parse a page with multiple tables

Any idea how to scrape a web page that contains multiple tables?
I am connecting to the web page.

Below is one table, but the same web page contains several of them.

I also can't figure out how to read the table. The markup for one table and my attempt so far follow.

HTML:

<p><a href="/fantasy_news/feature/?ID=49818"><strong>Top 300 Overall Fantasy Rankings</strong></a></p>
<div class="storyStats">
<table>
<thead>
<tr>
<th>RANK</th>
<th>CENTRES</th>
<th>TEAM</th>
<th>POS</th>
<th>GP</th>
<th>G</th>
<th>A</th>
<th>PTS</th>
<th>+/-</th>
<th>PIM</th>
<th>PPP</th>
</tr>
</thead>
<tbody>
<tr class="bg1">
<td>1.</td>
<td><a href="/nhl/teams/players/?name=steven+stamkos">Steven&nbsp;Stamkos</a></td>

<td>Tampa Bay</td>
<td>C</td>
<td align="right">81</td>
<td align="right">50</td>
<td align="right">51</td>
<td align="right">101</td>
<td align="right">-2</td>
<td align="right">56</td>
<td align="right">38</td>
</tr>


    Iterator<Element> trSIter = doc.select("table").iterator();
    while (trSIter.hasNext()) {
        Element trEl = trSIter.next().child(0);
        Elements tdEls = trEl.children();
        Iterator<Element> tdIter = tdEls.select("tr").iterator();
        System.out.println("><1><><" + tdIter);
        boolean firstRow = true;
        while (tdIter.hasNext()) {
            Element tr = (Element) tdIter.next();

            while (tdIter.hasNext()) {
                int tdCount = 1;
                Element tdEl = tdIter.next();
                //name = tdEl.getElementsByClass("playertablePlayerName").get(0).text();

                Elements tdsEls = tdEl.select("td");
                System.out.println("><2><><" + tdsEls);
                Iterator<Element> columnIt = tdsEls.iterator();

                while (columnIt.hasNext()) {
                    Element column = columnIt.next();
                    switch (tdCount++) {
                        case 1:
                            name = column.select("a").first().text();
                            break;
                        case 2:
                            stat2 = Double.parseDouble(column.text());
                            break;
                        case 3:
                            stat3 = Double.parseDouble(column.text());
                            break;
                        case 4:
                            stat4 = Double.parseDouble(column.text());
                            break;
                        case 5:
                            stat5 = Double.parseDouble(column.text());
                            break;
                        case 6:
                            stat6 = Double.parseDouble(column.text());
                            break;
                        case 7:
                            stat7 = Double.parseDouble(column.text());
                            break;
                        case 8:
                            stat8 = Double.parseDouble(column.text());
                            break;

Answer

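This should get you started. Each table has a blank record you will have to account for; the size check in the loop skips rows that have no usable <td> cells, such as the header row, which contains only <th> cells. You will also have to figure out which stats you want and where they are in the table. You read a stat with tds.get(index), where the index is zero-based: in the sample table from the question, index 1 is the player name and index 5 is G. Let me know how it works for you.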

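    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    Document doc = Jsoup.connect("http://www.tsn.ca/fantasy_news/feature/?ID=49815").get();

    // Visit every table nested inside a <div class="storyStats">.
    for (Element table : doc.select("div.storyStats").select("table")) {
        for (Element row : table.select("tr")) {
            Elements tds = row.select("td");
            // Header rows have no <td> cells; skip any row too short to index safely.
            if (tds.size() > 5) {
                // Index 1 is the player name, index 5 is G (zero-based).
                System.out.println(tds.get(1).text() + ":" + tds.get(5).text());
            }
        }
    }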
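
If you want typed values instead of printed strings, the switch over a column counter in the question can be replaced by reading cells by index. The following is only a sketch under assumptions: RowParser, PlayerStats and parseRow are hypothetical names, and the column positions are taken from the single sample table in the question, so other tables on the page may need different indexes. It uses only the standard library and takes the cell texts as a plain list.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: turns the cell texts of one table row into typed values. */
final class RowParser {

    /** Immutable holder for the few columns picked out in this sketch. */
    static final class PlayerStats {
        final String name;
        final String team;
        final int goals;
        final int points;

        PlayerStats(String name, String team, int goals, int points) {
            this.name = name;
            this.team = team;
            this.goals = goals;
            this.points = points;
        }
    }

    // Zero-based positions, assumed from the sample table in the question:
    // 0 RANK, 1 name, 2 TEAM, 3 POS, 4 GP, 5 G, 6 A, 7 PTS, 8 +/-, 9 PIM, 10 PPP
    private static final int NAME = 1;
    private static final int TEAM = 2;
    private static final int GOALS = 5;
    private static final int POINTS = 7;

    /**
     * Returns an empty Optional for rows that cannot be a player record:
     * rows with too few cells (for example a header row) or a blank name.
     * A non-numeric stat cell still raises NumberFormatException.
     */
    static Optional<PlayerStats> parseRow(List<String> cells) {
        if (cells.size() <= POINTS) {
            return Optional.empty();
        }
        // The markup uses &nbsp; inside names, so normalise it defensively.
        String name = cells.get(NAME).replace('\u00a0', ' ').trim();
        if (name.isEmpty()) {
            return Optional.empty();
        }
        int goals = Integer.parseInt(cells.get(GOALS).trim());
        int points = Integer.parseInt(cells.get(POINTS).trim());
        return Optional.of(new PlayerStats(name, cells.get(TEAM).trim(), goals, points));
    }

    public static void main(String[] args) {
        List<String> sample = List.of("1.", "Steven Stamkos", "Tampa Bay", "C",
                "81", "50", "51", "101", "-2", "56", "38");
        parseRow(sample).ifPresent(p ->
                System.out.println(p.name + " (" + p.team + "): " + p.goals + " G, " + p.points + " PTS"));
    }
}
```

Inside the row loop above you would build the list by calling text() on each element of tds and then pass it to parseRow. Returning an empty Optional keeps the header and blank rows out of your results without a separate firstRow flag.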