Sindu_ - 7 months ago
HTML Question

Extract text in order using jsoup

I want to extract the text inside each element with the class "jobtitle" and the text inside each element with the class "summary". Many elements on the page share these class names, so I want the job title of the first posting followed by its summary, then the job title of the next posting followed by its summary, and so on, in that order.

The following code works, but it prints all the titles first and then the text of all the summary elements. I want the first job title with the first summary, then the second job title with the second summary, and so on. How do I modify the code to do this?

<div class=" row result" id="p_64c5268586001bd2" data-jk="64c5268586001bd2" itemscope="" itemtype="http://schema.org/JobPosting" data-tn-component="organicJob">
<h2 id="jl_64c5268586001bd2" class="jobtitle">
<a rel="nofollow" href="/rc/clk?jk=64c5268586001bd2" target="_blank" onmousedown="return rclk(this,jobmap[0],0);" onclick="return rclk(this,jobmap[0],true,0);" itemprop="title" title="Fashion Assistant" class="turnstileLink" data-tn-element="jobTitle"><b>Fashion</b> Assistant</a>
</h2>
<span class="company" itemprop="hiringOrganization" itemtype="http://schema.org/Organization">
<span itemprop="name">
<a href="/cmp/Itv?from=SERP&amp;campaignid=serp-linkcompanyname&amp;fromjk=64c5268586001bd2&amp;jcid=3bf3e8a57da58ff5" target="_blank">
ITV Jobs</a></span>
</span>

<a data-tn-element="reviewStars" data-tn-variant="cmplinktst2" class="turnstileLink " href="/cmp/Itv/reviews?jcid=3bf3e8a57da58ff5" title="Itv Jobs reviews" onmousedown="this.href = appendParamsOnce(this.href, '?campaignid=cmplinktst2&amp;from=SERP&amp;jt=Fashion+Assistant&amp;fromjk=64c5268586001bd2');" target="_blank">
<span class="ratings"><span class="rating" style="width:49.5px;"><!-- --> </span></span><span class="slNoUnderline">28 reviews</span></a>
<span itemprop="jobLocation" itemscope="" itemtype="http://schema.org/Place"> <span class="location" itemprop="address" itemscope="" itemtype="http://schema.org/Postaladdress"><span itemprop="addressLocality">London</span></span></span>
<table cellpadding="0" cellspacing="0" border="0">
<tbody><tr>
<td class="snip">
<div>
<span class="summary" itemprop="description">
Do you have a passion for <b>Fashion</b>? You will be responsible for running our <b>fashion</b> cupboard, managing a team of interns and liaising with press officers to...</span>
</div>





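Document doc = Jsoup.connect("http://www.indeed.co.uk/jobs?q=fashion&l=England").timeout(5000).get();
Elements f = doc.select(".jobtitle");
Elements e = doc.select(".summary");
// Elements.text() combines the text of every matched element,
// so all titles are printed together, followed by all summaries
System.out.println("Title: " + f.text());
System.out.println("Details: " + e.text());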

Answer

Iterate over the titles and, for each one, look up the summary within the same posting:

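for (Element title : doc.select(".jobtitle")) {
    // The title's parent is the enclosing result <div>, so this search stays within one posting
    Element summary = title.parent().select(".summary").first();
    // first() returns null when the posting has no ".summary" element
    String summaryText = (summary != null) ? summary.text() : "";

    System.out.format("Title: %s. Summary: %s%n", title.text(), summaryText);
}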
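
A variation on the same idea is to iterate over the result containers instead of the titles, and pull both pieces out of each container. The sketch below is a minimal, self-contained version that parses an inline HTML string rather than the live page. The class name `JobPairs`, the method `extract`, and the sample markup are made up for illustration, and it assumes each posting is wrapped in a `div` carrying the `result` class, as in the snippet from the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JobPairs {

    // Hypothetical helper: returns {title, summary} pairs in document order.
    static List<String[]> extract(String html) {
        Document doc = Jsoup.parse(html);
        List<String[]> pairs = new ArrayList<>();

        // Assumption: every posting sits inside its own <div class="... result">.
        for (Element result : doc.select("div.result")) {
            Element title = result.select(".jobtitle").first();
            Element summary = result.select(".summary").first();

            // Skip containers that have no title at all.
            if (title == null) {
                continue;
            }
            // A missing summary becomes an empty string instead of a NullPointerException.
            String summaryText = (summary != null) ? summary.text() : "";
            pairs.add(new String[] { title.text(), summaryText });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up sample markup, shaped like the HTML in the question.
        String html =
              "<div class=\"row result\">"
            + "<h2 class=\"jobtitle\"><a><b>Fashion</b> Assistant</a></h2>"
            + "<div><span class=\"summary\">Run the fashion cupboard.</span></div>"
            + "</div>"
            + "<div class=\"row result\">"
            + "<h2 class=\"jobtitle\"><a>Stylist</a></h2>"
            + "<div><span class=\"summary\">Style photo shoots.</span></div>"
            + "</div>";

        for (String[] pair : extract(html)) {
            System.out.format("Title: %s. Summary: %s%n", pair[0], pair[1]);
        }
    }
}
```

Starting from the container avoids relying on the title being a direct child of the result `div`. If the real page nests the title deeper or uses a different wrapper class, the `div.result` selector would need adjusting.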