ollaollu ollaollu - 1 month ago 6
Ruby Question

How to extract text from commented HTML tag

I have a page I've parsed with Nokogiri but I need to get the text from a commented tag. The HTML is below:

<div class="parent">
<div class="child">
<span class="visible"> hello </span>
<!-- <span class="commented"> hi </span> -->
</div>
</div>


Assuming I have the page as a Nokogiri page object, this is what I've tried, but it gives me 0:

page.xpath("//div[@class='parent']/div[@class='child']/comment()").each {|comment| comment.text }


Running only:

page.xpath("//div[@class='parent']/div[@class='child']/comment()")


gives:

[#<Nokogiri::XML::Comment:0x3fe466d8d634 " <span class=\"commented\">hi </span> ">]


I'm out of ideas on how to fetch the hi text.

Answer

I'm not a Nokogiri expert, but something like this seems to work. The comment's content is just a string of HTML, so it can be parsed again with Nokogiri:

comment_node = Nokogiri::HTML(page.at("//div[@class='parent']/div[@class='child']/comment()").text)
comment_node.text.strip
 => "hi" 