Kunal Malik Kunal Malik - 2 months ago 9
Python Question

find the CSS path (ancestor tags) in HTML using python

I want to get all the ancestor div tags where I match a text. So for example if the html looks like HTML snippet

And i'm searching for "Earl E. Byrd". I wanna get a list which contains {"buyer-info","buyer-name"}

This is what i did

r=requests.get(self.url,verify='/path/to/certfile')
soup = BeautifulSoup(r.text,"lxml")
divTags = soup.find_all('div')


How should I proceed ?

Answer

If you want to search for the div by text and get all the previous divs that have title attributes, first find the div using the text, then use find_all_previous setting title=True

soup = BeautifulSoup(r.text,"lxml")
div = soup.find('div', text="Earl E. Byrd")

print([div["title"]] + [d["title"] for d in div.find_all_previous("div", title=True)])