
BeautifulSoup - get h2 text without class

My HTML:

<div id="title">
<h2>My title <span class="subtitle">My Subtitle</span></h2></div>

If I use this code:

title = soup.find('div', id="title").h2.text
print title
>> My title My Subtitle

It returns all of the text inside the h2, including the span. I want to get "My title" and "My Subtitle" as two separate values:

print title
>> My title
print subtitle
>> My Subtitle

Any help?

Answer

One way to do it without using the class attribute is:

h2 = soup.find('div', id="title").h2
subtitle = h2.span.text
title = str(h2.contents[0])

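Here h2.contents[0] is a NavigableString object holding the text that comes before the span. It prints the same way an ordinary string does, so if you only use it with the print statement, the str() call isn't necessary. Note that it still carries the trailing space that precedes the <span> ("My title "), so call .strip() on it if you need the bare text.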
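For reference, here is the same approach as a small self-contained sketch. It assumes the bs4 package with the built-in "html.parser" backend and uses Python 3 print() calls. The helper name split_heading is made up for illustration, and the markup is the snippet from the question with its opening <h2> tag in place.

```python
from bs4 import BeautifulSoup

# Markup from the question (with the opening <h2> tag in place).
html = '<div id="title"><h2>My title <span class="subtitle">My Subtitle</span></h2></div>'


def split_heading(markup):
    """Return (title, subtitle) from the h2 inside <div id="title">.

    split_heading is a hypothetical helper name, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(markup, "html.parser")
    h2 = soup.find("div", id="title").h2

    # Text of the nested <span>, selected by tag name rather than by class.
    subtitle = h2.span.text.strip()

    # First child of the h2 is the text node before the <span>;
    # strip() removes the trailing space it carries.
    title = str(h2.contents[0]).strip()
    return title, subtitle


title, subtitle = split_heading(html)
print(title)     # My title
print(subtitle)  # My Subtitle
```

This relies on the title text being the first child of the h2. If the span came first, h2.contents[0] would be the span itself, so the index would need adjusting.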

