n_x_l - 1 year ago
Ruby Question

How do I access HTML elements that are rendered in JavaScript using XPath?

How do I get a <td> with a specific class name using XPath and Nokogiri? The tables are nested, and some of them don't have IDs or classes, so I can't nest stuff like this:


Here is what I have so far:

require "nokogiri"
require "open-uri"
doc = Nokogiri::HTML(URI.open("http://www.goalzz.com/default.aspx?c=8358"))
doc.xpath('//td[@class="m_g"]').each do |node|
  pp node.to_s
end

Any ideas? There are a few <td>s with that class name, and I want to get all of them.

Answer Source

The gem "capybara-webkit" is a viable way to work with this website as it looks after its JavaScript has run.

Here is a rough example of what a capybara-webkit script might look like:

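#!/usr/bin/env ruby
require "rubygems"
require "pp"
require "bundler/setup"
require "capybara"
require "capybara/dsl"
require "capybara-webkit"

# The target is a remote site, so don't boot a local Rack server.
Capybara.run_server = false
Capybara.current_driver = :webkit
Capybara.app_host = "http://www.goalzz.com"

module Test
  class Goalzz
    include Capybara::DSL

    # Load the page from the question in the WebKit driver, then print the
    # text of every <td class="m_g"> found in the rendered page.
    def get_results
      visit("/default.aspx?c=8358")
      all(:xpath, '//td[@class="m_g"]').each { |node| pp node.text }
    end
  end
end

spider = Test::Goalzz.new
spider.get_results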

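Because the page is created dynamically, the example XPath only matches once the page's scripts have run, and that requires a fully functional JavaScript-capable web driver. Nokogiri on its own parses only the HTML the server returns and never executes scripts.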
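To make the difference concrete, here is a minimal sketch of what a parser that does not run JavaScript sees. It uses REXML, which ships with Ruby, in place of Nokogiri so that it runs without extra gems; the XPath expression is the one from the question. The markup, the `scores` id, and the `buildRows()` function are all made up for illustration and are not taken from the real page.

```ruby
require "rexml/document"

# Hypothetical markup: what a server might send before any script runs.
# One m_g cell already sits inside a nested table; further cells would only
# be created in a browser by the script, which a plain parser never executes.
STATIC_HTML = <<~HTML
  <html>
    <body>
      <table>
        <tr>
          <td>
            <table>
              <tr><td class="m_g">2 - 1</td></tr>
            </table>
          </td>
        </tr>
      </table>
      <div id="scores"></div>
      <script>
        document.getElementById("scores").innerHTML = buildRows();
      </script>
    </body>
  </html>
HTML

# Returns the text of every td whose class attribute is exactly "m_g",
# at any nesting depth, as seen without running JavaScript.
def static_cells(markup)
  doc = REXML::Document.new(markup)
  REXML::XPath.match(doc, '//td[@class="m_g"]').map { |td| td.text.strip }
end

# True when the div that the script would fill has no child elements
# in the parsed document.
def scores_div_empty?(markup)
  doc = REXML::Document.new(markup)
  REXML::XPath.first(doc, '//div[@id="scores"]').elements.empty?
end
```

In this sketch, `//td[@class="m_g"]` finds the nested cell without any IDs on the surrounding tables, so nesting is not the obstacle. The cells the script would add never show up, which is why a driver that actually runs the page's JavaScript is needed.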
