firedude144 - 1 month ago
Ruby Question

Extracting url with capybara

I have a page with multiple student entries, each with a URL that leads to that student's chart. The HTML looks like this:

<div class="student_name">
<a target="_blank" data-tn-element="grade-result-link[]" data-tn-link
href="/johndoe/b89db3308ddaaed2?sp=0" rel="nofollow" class="student_link"
itemprop="url">John Doe</a>
<span class="graduated"> - Graduated 2013</span>
</div>


I want to create a list with only the URLs of each student on the page, but all I end up with is the name of the student. I'm using Capybara with webkit and my code resembles this:

results = page.all('div.student_name').map do |item|
puts(item.text)
end


How do I phrase this so that I extract only the embedded (relative) URL in the href?

Ed

Answer
# Select the anchor inside each student div and read its href attribute
urls = page.all('div.student_name a', minimum: 1).map do |link|
  link[:href]
end

should get you the URLs. The minimum: 1 option just makes all wait until at least one matching element is on the page, and it may not be needed in your particular case. Depending on the driver you're using, the values may come back as fully normalized (absolute) URLs, but stripping the domain off them isn't hard if you really need relative paths.
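If the driver does hand back absolute URLs, one way to get back to the relative form is Ruby's standard URI library. This is only a sketch: relative_href is a hypothetical helper name, and the example.com host is a made-up stand-in for whatever link[:href] actually returns on your page.

```ruby
require 'uri'

# Hypothetical helper (not part of Capybara): reduce a URL to its path
# plus query string, dropping scheme and host if they are present.
def relative_href(href)
  uri = URI.parse(href)
  uri.query ? "#{uri.path}?#{uri.query}" : uri.path
end

# Assumed sample value; the host is invented for illustration.
absolute = 'http://example.com/johndoe/b89db3308ddaaed2?sp=0'
puts relative_href(absolute)

# With a Capybara session this could be combined with the answer's code:
#   urls = page.all('div.student_name a', minimum: 1).map do |link|
#     relative_href(link[:href])
#   end
```

The helper leaves an already relative href unchanged, so it should be safe to apply regardless of which form the driver returns.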