Why does the CSS selector return the correct info, but the XPath does not?
source = "<hgroup class='page-header channel post-head' data-channel='tech' data-section='sec0=tech&sec1=index&sec2='><h2>Tech</h2></hgroup>"
doc = Nokogiri::HTML(source)
=> [#<Nokogiri::XML::Element:0x6c2b824 name="h2" children=[#<Nokogiri::XML::Text:0x6c2b554 "Tech">]>]
If case_insensitive_equals does what its name suggests, the XPath fails because the
class attribute isn’t equal to post-head (case-insensitively or not); it only contains it.
XPath treats class attributes as plain strings. It doesn’t split them and handle the
classes individually, as CSS does.
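To make the distinction concrete, here is a minimal sketch in plain Ruby (no Nokogiri involved) that models the three kinds of comparison on the class string from the question. The method names are illustrative only, and the whitespace split is a rough stand-in for what a CSS class selector does, not Nokogiri's actual implementation.

```ruby
# The class attribute exactly as it appears in the question's markup.
CLASS_ATTR = "page-header channel post-head"

# Equality: the attribute is compared as one whole string.
def equals_whole?(class_attr, name)
  class_attr == name
end

# Substring: true if the name occurs anywhere in the string.
def contains_substring?(class_attr, name)
  class_attr.include?(name)
end

# Token match: split on whitespace, then look for an exact class name.
def has_token?(class_attr, name)
  class_attr.split.include?(name)
end

puts equals_whole?(CLASS_ATTR, "post-head")             # false
puts contains_substring?(CLASS_ATTR, "post-head")       # true
puts has_token?(CLASS_ATTR, "post-head")                # true
puts contains_substring?("not-post-head", "post-head")  # true (a false positive)
puts has_token?("not-post-head", "post-head")           # false
```

The equality test is the one that fails on the question's markup; the substring test passes but is too loose, and only the token test behaves like the CSS selector.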
A simple XPath that would work would be:
doc.xpath('//hgroup[contains(@class, "post-head")]//h2')
(I’ve removed the custom function; you will need to write your own to do this case-insensitively.)
This isn’t quite the same though, as it will also match classes such as
not-post-head. A more complete XPath would be something like this:
doc.xpath('//hgroup[contains(concat(" ", normalize-space(@class), " "), " post-head ")]//h2')