Tim The Learner - 1 year ago
Java Question

Specific element scraping with jsoup

What I would like to do is grab a link, in this case to a webm file, and store it in a String. The page I'm scraping is http://www.hearthpwn.com/cards/503-ragnaros-the-firelord and the link I want is on line 1010 when viewing the page source. I'd like this method to work on different pages, so I don't want to scrape by line number. If someone could give me a small example to get started on how to scrape only the link associated with "data-animationurl=", that would be great. Thanks.

Answer

You can get more information about jsoup in the jsoup documentation.

You'll want to wrap this in an AsyncTask (or otherwise run it off the main thread) so your app doesn't hang, but this should give you a good start:

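// Needs org.jsoup.Connection, org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, java.util.Map, java.io.IOException
try {
    // Connect to the URL and set a user agent so we don't get blocked
    Connection connect = Jsoup.connect("http://www.hearthpwn.com/cards/503-ragnaros-the-firelord")
            .userAgent("Mozilla/5.0");

    // Get the HTML and select the first <video class="hscard-video" ...>
    Document doc = connect.get();
    Element video = doc.select("video.hscard-video").first();

    // first() returns null when nothing matches, so check before using it
    if (video != null) {
        // Grab all the data-* attributes as a map (e.g. data-href, data-usegold...)
        Map<String, String> dataSet = video.dataset();

        // If data-animationurl exists, print it (here you can store it as a String instead).
        // Keys in dataset() drop the "data-" prefix.
        String animationUrl = dataSet.get("animationurl");
        if (animationUrl != null) {
            System.out.println(animationUrl);
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}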
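
On the "don't let your app hang" point: AsyncTask is Android-specific, so here is a rough sketch of the same idea using only java.util.concurrent. The class name BackgroundScrape and the method runAsync are made up for illustration, and the lambda in main is just a placeholder standing in for the jsoup lookup, so treat this as one possible shape rather than the way to do it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not part of jsoup or Android.
class BackgroundScrape {

    // Runs a blocking task on the given executor and exposes its result
    // (or its exception) through a CompletableFuture.
    static <T> CompletableFuture<T> runAsync(Callable<T> task, Executor executor) {
        CompletableFuture<T> result = new CompletableFuture<>();
        executor.execute(() -> {
            try {
                result.complete(task.call());
            } catch (Exception e) {
                // Checked exceptions such as IOException end up here
                result.completeExceptionally(e);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        ExecutorService background = Executors.newSingleThreadExecutor();

        // Placeholder task: a real app would do the jsoup connect/select here
        // and return the data-animationurl value.
        BackgroundScrape.<String>runAsync(() -> "placeholder.webm", background)
                .thenAccept(url -> System.out.println("Animation URL: " + url))
                .join(); // only for this demo, so main waits before shutting down

        background.shutdown();
    }
}
```

In a UI app you would drop the join() and update the view from the callback instead, hopping back to the UI thread in whatever way your framework expects.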