User124235 - 3 months ago
HTML Question

Jsoup Java get a specific td

I have the following code:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.ArrayList;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class da {

    public static void main(String[] args) {
        // try-with-resources closes the output file even if a request fails
        try (PrintStream ps = new PrintStream(new FileOutputStream("io"))) {
            Document doc = Jsoup.connect("http://www.vremea.net/").get();
            Elements e = doc.select(".homeContent ul li a");

            // Keep the last link whose href mentions "Arad"
            String rezultat = "";
            for (int i = 0; i < e.size(); i++)
                if (e.get(i).attr("href").contains("Arad"))
                    rezultat = e.get(i).attr("href");

            System.out.println(rezultat);

            Document doc1 = Jsoup.connect(rezultat).get();
            Elements row = doc1.select(".tableforecast tr");
            Elements nume = doc1.select("h1");
            ArrayList<String> date = new ArrayList<String>();
            ArrayList<String> numedate = new ArrayList<String>();

            // Day names: h1 headings containing "Vremea in" (the first h1 is skipped)
            for (int q = 1; q < nume.size(); q++)
                if (nume.get(q).text().contains("Vremea in"))
                    numedate.add(nume.get(q).text());

            // For each row, join the text of its "cell large" columns
            for (int i = 0; i < row.size(); i++) {
                Elements col = row.get(i).select("td");
                String sir = "";
                int vr = 0;
                for (int j = 0; j < col.size(); j++)
                    if (col.get(j).className().equals("cell large")) {
                        sir = sir + " " + col.get(j).text();
                        vr = 1;
                    }
                if (vr == 1)
                    date.add(sir);
            }

            for (int i = 0; i < numedate.size(); i++) {
                // j + 1 < date.size() avoids an IndexOutOfBoundsException
                // when date holds an odd number of rows
                for (int j = 0; j + 1 < date.size(); j = j + 2)
                    ps.println(numedate.get(i) + "\n" + date.get(j) + "\n" + date.get(j + 1));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}


This code reads a table and, for each row, collects the columns whose class matches a given string. I wonder whether I can select those columns directly, instead of fetching all the columns and then filtering them with contains to get the ones I need. If this is possible, what would the select look like?

numedate is the name of the day, and date holds the temperature and the hour.

Answer

You may try it this way:

Go directly to the page you want to extract data from (in your case, "Arad"):

"http://www.vremea.net/Vremea-in-Arad-judetul-Arad/prognoza-meteo-pe-7-zile". Look at the other pages: they seem to share some kind of structure, such as /some text-in-place name-some text/some text.

You can select the td elements that have both the class cell and the class large directly, as follows:

public static void main(String[] args) throws IOException {
    Document doc = Jsoup.connect("http://www.vremea.net/Vremea-in-Arad-judetul-Arad/prognoza-meteo-pe-7-zile").get();
    // td.cell.large matches td elements that carry both classes
    Elements tds = doc.select("table.tableforecast tbody tr td.cell.large");
    for (Element e : tds) {
        System.out.println(e.text());
    }
}
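To try the selector without hitting the site, here is a minimal offline sketch that parses an inline HTML string. The markup in SAMPLE_HTML, the class name SelectorSketch and the helper largeCells are made up for illustration; the real page may be structured differently. It assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorSketch {

    // Hypothetical markup, invented for this example; not copied from vremea.net.
    static final String SAMPLE_HTML =
        "<table class=\"tableforecast\">"
      + "<tr><td class=\"cell\">Ora</td><td class=\"cell large\">15:00</td></tr>"
      + "<tr><td class=\"cell small\">Temp</td><td class=\"large cell\">21 C</td></tr>"
      + "</table>";

    // Returns the text of every td that has both the "cell" and "large" classes.
    static List<String> largeCells(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element td : doc.select("table.tableforecast td.cell.large")) {
            texts.add(td.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        for (String text : largeCells(SAMPLE_HTML)) {
            System.out.println(text);
        }
    }
}
```

Note one difference from the className().equals("cell large") check in the question: the selector td.cell.large matches regardless of class order and also when the td carries extra classes, so it may match slightly more cells than the exact string comparison.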