Melanie Shebel - 4 years ago
Ruby Question

Pulling a URL from a feed using Nokogiri

Let's say I have this in a document:

<entry>
<link rel="replies" type="application/atom+xml" href="http://www.url.com/feeds/1/comments/default" title="Comments"/>
<link rel="alternate" type="text/html" href="http://www.url.com/a_blog_post.html" title="A Blog Post"/>
</entry>

<entry>
<link rel="replies" type="application/atom+xml" href="http://www.url.com/feeds/2/comments/default" title="Comments"/>
<link rel="alternate" type="text/html" href="http://www.url.com/another_blog_post.html" title="Another Blog Post"/>
</entry>


I am trying to use Nokogiri to pull the URL for each of the blog posts, but I am apparently going about it all wrong (I'm new to programming and having trouble understanding Nokogiri).

Here's what I have:

require 'nokogiri'
require 'open-uri'

def get_posts(url)
  posts = []
  doc = Nokogiri::HTML(open(url))
  doc.css('entry.alternate').each do |e|
    puts e['href']
    posts << e['href']
  end
  return posts
end

puts "Enter feed url:"
url = gets.chomp
posts = get_posts(url)
puts posts.to_s


Any help would be great! I started this little project to get better at programming, but I'm stuck. My output is currently:
[]

Answer

Your CSS selector is wrong: entry.alternate selects all entry elements that have the class alternate (that is, something like <entry class="alternate" />).

I suppose you want to select all link elements whose rel attribute has the value alternate. The CSS selector for this is link[rel=alternate], so change your code like this:

doc.css('link[rel=alternate]').each do |e|
  puts e['href']
  posts << e['href']
end

You can read more about CSS selectors here: http://www.w3.org/TR/CSS2/selector.html.
