Javier - 9 months ago
Ruby Question

How to put a group of <p> inside a <div>

I'd like to produce the HTML shown further below, starting from the following Ruby code and Nokogiri:

require 'rubygems'
require 'nokogiri'

value = Nokogiri::HTML.parse(<<-HTML_END)
<html>
<body>
<p id='1'>A</p>
<p id='2'>B</p>
<h1>Bla</h1>
<p id='3'>C</p>
<p id='4'>D</p>
<p id='5'>E</p>
</body>
</html>
HTML_END

# The selected-array is given by the application.
# It consists of a sorted array with all ids of
# <p> that need to be enclosed by the <div>
selected = ["2","3","4"]
first_p = selected.first
last_p = selected.last

#
# WHAT RUBY CODE DO I NEED TO INSERT HERE TO GET
# THE RESULTING HTML AS SEEN BELOW?
#


The resulting HTML should look like this (please note the inserted <div id='XYZ'>):

<html>
<body>
<p id='1'>A</p>
<div id='XYZ'>
<p id='2'>B</p>
<h1>Bla</h1>
<p id='3'>C</p>
<p id='4'>D</p>
</div>
<p id='5'>E</p>
</body>
</html>

Answer Source

This is the working solution I've implemented in my project (thanks to Vlad@SO and Whitelist@irc#rubyonrails for your help and inspiration):

require 'rubygems'
require 'nokogiri'

value = Nokogiri::HTML.parse(<<-HTML_END)
  <html>
    <body>
      <p id='1'>A</p>
      <p id='2'>B</p>
      <h1>Bla</h1>
      <p id='3'>C</p>
      <p id='4'>D</p>
      <p id='5'>E</p>
    </body>
  </html>
HTML_END

# The selected array is supplied by the application.
# It is a sorted array of the ids of all <p> elements
# that need to be enclosed by the <div>.
selected = ["2","3","4"]

# We want single elements, not node sets:
# .first returns a Nokogiri::XML::Element instead of a Nokogiri::XML::NodeSet
first_p = value.css("p##{selected.first}").first
last_p = value.css("p##{selected.last}").first
parent = value.css('body').first

# build the new div_node and give it the id requested in the question
div_node = Nokogiri::XML::Node.new('div', value)
div_node['id'] = 'XYZ'

# add div_node before first_p
first_p.add_previous_sibling(div_node)

selected_node = false

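parent.children.each do |tag|
  # start collecting once we reach a selected <p> (the first one is first_p)
  selected_node = true if selected.include?(tag['id'])
  # move every node from first_p through last_p into the div,
  # including non-selected nodes in between such as the <h1>
  div_node.add_child(tag) if selected_node
  # stop collecting once last_p has been moved
  selected_node = false if selected.last == tag['id']
end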

puts value.to_html