Meghla Khan - 1 year ago
HTML Question

Parsing HTML href attribute

I'm working on a project where I need to parse HTML to extract data from a webpage. I'm using Jsoup in Java. I need to extract data from the following markup:

<td><small><a href=";p1=248" target="_blank">2016/08/21 21:00</a></small></td>
<td><small><a href="">AtCoder Grand Contest 003</a></small></td>


I can already get the contest name and the start time, but how do I extract the URL? I want the contest URL, i.e. the value of the href attribute on the second link.

Here's my code:

private void getAC() throws IOException {
    Document doc = Jsoup.connect("").userAgent(Desktop.getDesktop().toString()).get();
    // The contest list is the second element with class "table-responsive"
    Element table = doc.getElementsByClass("table-responsive").get(1);
    Elements contestStartTime = table.getElementsByTag("td");
    int cnt = 1;
    for (Element i : contestStartTime) {
        // Prints the inner HTML of each cell, numbered from 1
        System.out.println(cnt + ". " + i.html());
        cnt++;
    }
}


Answer Source

Jsoup has a rich API for DOM processing. Take a look at these methods:

Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
  String linkHref = link.attr("href"); // value of the href attribute
  String linkText = link.text();       // visible text of the link
}

You can also select the links directly with a CSS selector:

Elements links = doc.select("table a[href]");
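
To make the selector approach concrete, here is a minimal, self-contained sketch that parses an inline HTML string instead of a live page. The class name ContestLinks, the method extractLinks, the base URI https://example.com/ and the href values in the sample markup are all hypothetical placeholders, since the real URLs are not shown in the question. It uses absUrl("href") rather than attr("href") so that relative links come back as absolute URLs.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class ContestLinks {

    // Collects the absolute URL of every <a href> found inside a table cell.
    // baseUri is used by Jsoup to resolve relative href values.
    static List<String> extractLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("td a[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Sample markup shaped like the question's rows; the hrefs are made up.
        String html =
            "<table><tr>"
            + "<td><small><a href=\"/time?p1=248\" target=\"_blank\">2016/08/21 21:00</a></small></td>"
            + "<td><small><a href=\"/contests/sample\">AtCoder Grand Contest 003</a></small></td>"
            + "</tr></table>";

        for (String url : extractLinks(html, "https://example.com/")) {
            System.out.println(url);
        }
    }
}
```

With a live page, the same loop works on the Document returned by Jsoup.connect(...).get(), which already carries the page's base URI. If you only want the contest link and not the time link, narrow the selector, for example by taking the second cell of each row.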