user2950593 - 7 months ago
Ruby Question

How to get text from <li> elements

I have:

<ul>
<li>text1</li>
<li>text2 </li>
</ul>


Right now I get the text from the <li> elements like this:

result = page.css(' ul li').text


The problem is that the result is a single string with nothing between the items, like

text1text2


I want the items to be separated with <br>, like text1<br>text2<br>.

How do I do this?

Answer

From "Searching a XML/HTML Document":

methods xpath and css actually return a NodeSet, which acts very much like an array, and contains matching nodes from the document.

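Calling text on the NodeSet itself concatenates the text of every matched node with no separator, which is why you get text1text2. So, if you want to join the text of each <li> element with a separator, work with the result of css as a collection: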

page.css('ul li') # selects all <li> elements and returns a NodeSet of nodes
    .map(&:text) # maps each <li> node to its text, giving an array of strings
    .join('<br>') # joins the strings into one, with <br> between items (none after the last)

See: http://ruby.bastardsbook.com/chapters/html-parsing/