Kojer Defor - 1 year ago
Java Question

Get concrete URL with Jsoup

I'm trying to figure out how to separate the useless information from a link with Jsoup.

Here is the code I use to parse the page:


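import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class TestSoup {
    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("https://vk.com/smcat").get();
        // Earlier attempt: doc.select("div > a > img");
        Elements links = doc.select("[data-src_big]"); // every element carrying the attribute
        System.out.println(links); // print the matched elements
    }
}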


My output now:

<img src="https://pp.vk.me/c636126/v636126727/35e1b/ludjlj7T4i8.jpg" class="ph_img" data-id="-23530818_436648332" data-src_big="https://pp.vk.me/c636126/v636126727/35e1c/a1IyGrtjzUQ.jpg|600|448">

Can someone explain how I can extract the second link from my output? Many thanks.

Answer

data-src_big is an attribute, and each element can have its own value for it.
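
In the output shown in the question, that value is a URL followed by |600|448. The sketch below assumes this pipe-separated layout holds for every element, which is a guess based on that single sample. The class and method names (BigSrcValue, urlPart) are hypothetical, and the helper uses plain String methods only, no Jsoup.

```java
class BigSrcValue {
    // Returns the text before the first '|', or the whole value when no '|' is present.
    // Assumption: the URL always comes first in the attribute value.
    static String urlPart(String value) {
        int pipe = value.indexOf('|');
        return pipe < 0 ? value : value.substring(0, pipe);
    }

    public static void main(String[] args) {
        // Attribute value copied from the output in the question.
        String value = "https://pp.vk.me/c636126/v636126727/35e1c/a1IyGrtjzUQ.jpg|600|448";
        System.out.println(urlPart(value));
    }
}
```

Cutting at the first '|' rather than calling split avoids regex escaping and leaves a value without any '|' unchanged.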

To iterate over the link elements, you can use

for (Element el : links) { /* handle each element here */ }

To get the value of a specified attribute from an element, you can use el.attr("data-src_big").
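
As a rough sketch of the iteration and attribute lookup described above, the following parses an HTML string instead of fetching the live page, so it runs offline. The class and method names (AttrExample, bigSources) are hypothetical, and the markup is a shortened copy of the element from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

class AttrExample {
    // Collects the raw data-src_big value of every element that has that attribute.
    static List<String> bigSources(String html) {
        Document doc = Jsoup.parse(html);
        Elements links = doc.select("[data-src_big]");
        List<String> values = new ArrayList<>();
        for (Element el : links) {
            values.add(el.attr("data-src_big"));
        }
        return values;
    }

    public static void main(String[] args) {
        // Shortened copy of the <img> element from the question.
        String html = "<img src=\"https://pp.vk.me/c636126/v636126727/35e1b/ludjlj7T4i8.jpg\" "
                + "data-src_big=\"https://pp.vk.me/c636126/v636126727/35e1c/a1IyGrtjzUQ.jpg|600|448\">";
        for (String value : bigSources(html)) {
            System.out.println(value);
        }
    }
}
```

The values come back exactly as written in the markup, so any trailing size information after the URL is still attached at this point.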


If the value of the attribute is a URL written as a relative path like ./foo/bar.jpg, but you want it as an absolute path like http://server.com/foo/bar.jpg, you can use el.absUrl("data-src_big").
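
A minimal sketch of resolving a relative attribute value, assuming the document was parsed with a base URI. Here http://server.com/ is passed to Jsoup.parse by hand; a document fetched with Jsoup.connect(...).get() normally carries the page address as its base URI. The names AbsUrlExample and absoluteBigSrc are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class AbsUrlExample {
    // Resolves the data-src_big value of the first matching element against baseUri.
    // Returns an empty string when no element carries the attribute.
    static String absoluteBigSrc(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Element el = doc.select("[data-src_big]").first();
        return el == null ? "" : el.absUrl("data-src_big");
    }

    public static void main(String[] args) {
        String html = "<img data-src_big=\"./foo/bar.jpg\">";
        System.out.println(absoluteBigSrc(html, "http://server.com/"));
    }
}
```

Jsoup also accepts the abs: prefix on the attribute key, so el.attr("abs:data-src_big") is an equivalent way to request the resolved form.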
