Jackson Cunningham - 1 year ago
Ruby Question

How to get all elements via CSS class

I am trying to scrape this page using Nokogiri to get all the elements with the class name "teaser".

If I check the page with jQuery, I can see there are 25 elements:

$(".teaser").length => 25

However, when using Nokogiri, I only get the first teaser:

teasers = doc.css('.teaser')
teasers.count # => 1

Where am I going wrong? How do I get all the teasers?

Answer Source

That document appears to contain a number of null bytes for some reason, and this causes Nokogiri/libxml2 to assume the document has ended partway through, so only the markup before the first null byte gets parsed.

You should be able to fix it by preprocessing the contents to remove the nulls. If page contains the text of the webpage:

page.gsub!(/\x00/, '')

Then use Nokogiri on page as before.
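
For reference, here is a minimal sketch of the whole flow, assuming page is a String holding the raw HTML. The helper names strip_null_bytes, null_byte_count and count_by_class are made up for illustration and are not part of Nokogiri. It uses String#delete rather than gsub!, so the original string is left untouched.

```ruby
# Count the NUL bytes in the raw HTML, to confirm they are the problem.
def null_byte_count(html)
  html.count("\x00")
end

# Return a copy of the HTML with every NUL byte removed,
# so that libxml2 does not stop parsing early.
def strip_null_bytes(html)
  html.delete("\x00")
end

# Parse the cleaned HTML and count the elements with the given class.
# Nokogiri is required here, only when this method is called.
def count_by_class(html, css_class)
  require 'nokogiri'
  doc = Nokogiri::HTML(strip_null_bytes(html))
  doc.css(".#{css_class}").count
end
```

With page loaded, null_byte_count(page) tells you whether there are any nulls at all, and count_by_class(page, 'teaser') should then agree with what jQuery reports, provided the null bytes were the only thing cutting the parse short.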
