ollaollu ollaollu - 1 month ago 9
Ruby Question

How to extract text from a commented-out HTML tag with Nokogiri

I have a page I've parsed with Nokogiri, but I need to get the text from a commented-out tag. The HTML is below:

<div class="parent">
<div class="child">
<span class="visible"> hello </span>
<!-- <span class="commented"> hi </span> -->
</div>
</div>


Assuming I have the page as a Nokogiri page object, this is what I've tried, but it gives me 0:

page.xpath("//div[@class='parent']/div[@class='child']/comment()").each {|comment| comment.text }


Running only page.xpath("//div[@class='parent']/div[@class='child']/comment()") gives:
[#<Nokogiri::XML::Comment:0x3fe466d8d634 " <span class=\"commented\">hi </span> ">]


I'm out of ideas on how to fetch the hi text.

Answer

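I'm not a Nokogiri expert, but something like this seems to work: the comment's text is just a string of markup, so parse it as HTML a second time and read the text from the result.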

comment_node = Nokogiri::HTML(page.at("//div[@class='parent']/div[@class='child']/comment()").text)
comment_node.text.strip
 => "hi"