I have HTML as follows:
text data here
continued text data
text & data I want to omit
textdata = soup.find('div', class_='maindiv').get_text()
textdata = soup.find('div', class_='maindiv').get_text(recursive=False)
This is not that far from your subtraction method, but one way to do it (at least in Python 3) is to discard all child divs before extracting the text.
s = soup.find('div', class_='maindiv')
for child in s.find_all("div"):
    child.decompose()
print(s.get_text())
Would print something like:
text data here continued text data
That might be a bit more efficient and flexible than subtracting the strings, though it still has to iterate over the child divs first, and decompose() removes them from the parsed tree in place.
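If you would rather leave the tree untouched, another option is to collect only the strings that are direct children of the main div. The following is a minimal sketch, not a definitive answer: the markup is hypothetical (the inner class name `otherdiv` and the `<br>` are assumptions about what the question's HTML looked like), and it assumes a Beautiful Soup version recent enough to accept the `string` argument to `find_all`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to resemble the structure in the question.
html = """
<div class="maindiv">
text data here
<br>
continued text data
<div class="otherdiv">text &amp; data I want to omit</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
main = soup.find("div", class_="maindiv")

# Collect only the strings that are direct children of the main div;
# recursive=False stops find_all from descending into nested tags.
direct_strings = main.find_all(string=True, recursive=False)

# Strip surrounding whitespace and drop the pieces that are only line breaks.
textdata = " ".join(s.strip() for s in direct_strings if s.strip())
print(textdata)  # text data here continued text data
```

Unlike the decompose() loop, this does not modify `soup`, so the nested div is still available afterwards if you need it. It does skip text inside any nested tag, not just divs, so it only fits if the text you want sits directly inside the main div.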