lclankyo - 7 months ago
Python Question

Python BeautifulSoup find element that contains text

<div class="info">
<h3> Height:
<span>1.1</span>
</h3>
</div>

<div class="info">
<h3> Number:
<span>111111111</span>
</h3>
</div>


This is part of the page. Ultimately, I want to extract the 111111111. I know I can do

soup.find_all("div", {"class": "info"})

to get a list of both divs; however, I would prefer not to loop over them to check which one contains the text "Number".

Is there a more elegant way to extract "111111111", so that the call still does soup.find_all("div", {"class": "info"}) but only matches a div that contains "Number"?

I also tried

numberSoup = soup.find('h3', text='Number')

but it returns None.

Answer

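Use the XPath contains() function. BeautifulSoup itself does not support XPath, so this needs lxml, with root being the parsed lxml tree:

root.xpath('//div/h3[contains(text(), "Number")]/span/text()')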
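For completeness, here is a minimal sketch of how that expression could be run end to end. It assumes lxml is installed and that the page looks like the snippet in the question. The PAGE string and the variable names are only illustrative, and the html/body wrapper is added here so the snippet parses as a full document.

```python
from lxml import html

# Sample markup copied from the question; wrapped in <html><body> for parsing.
PAGE = """
<html><body>
<div class="info">
<h3> Height:
<span>1.1</span>
</h3>
</div>

<div class="info">
<h3> Number:
<span>111111111</span>
</h3>
</div>
</body></html>
"""

# Parse the markup into an lxml element tree.
root = html.fromstring(PAGE)

# Select the <span> text under any <h3> whose first text node contains "Number".
values = root.xpath('//div/h3[contains(text(), "Number")]/span/text()')

# xpath() returns a list of matches; take the first one if there is any.
number = values[0] if values else None
print(number)
```

As for why the attempt in the question returns None: in BeautifulSoup the text argument is compared against the tag's .string, and .string is None for an h3 that holds both text and a child span, so that call cannot match here.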