345243lkj - 29 days ago
Python Question

How to extract certain strings when they occur adjacently with BeautifulSoup

I'm parsing an HTML page with BeautifulSoup, and the part I'm interested in looks like this:

<i class="fa fa-circle align-middle font-80" style="color: #45C414; margin-right: 15px"></i>Departure for <a href="/en/ais/details/ports/17787/port_name:TEKIRDAG/_:3525d580eade08cfdb72083b248185a9" title="View details for: TEKIRDAG">TEKIRDAG</a> </td>

I'm interested in extracting the port name, TEKIRDAG. However, there are many port names that are marked up identically. Is there a way to extract the name only if it occurs right after the string 'Departure for'?


You can locate the text node and get the next sibling:

In [1]: from bs4 import BeautifulSoup

In [2]: data = """<i class="fa fa-circle align-middle font-80" style="color: #45C414; margin-right: 15px"></i>Departure for <a href="/en/ais/details/ports/17787/port_name:TEKIRDAG/_:3525d580eade08cfdb72083b248185a9" title="View details for: TEKIRDAG">TEKIRDAG</a> </td>"""

In [3]: soup = BeautifulSoup(data, "html.parser")

In [4]: soup.find(text="Departure for ").next_sibling.get_text()
Out[4]: u'TEKIRDAG'
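
If the page contains several such cells, the same idea can probably be extended to collect every port name that directly follows a "Departure for" text node. The snippet below is only a sketch under a few assumptions: the sample HTML (including the "Arrival at" cell and the shortened `/ports/...` links) is made up for illustration, `departure_ports` is a hypothetical helper name, and it uses the `string=` argument, which newer BeautifulSoup releases (4.4.0 and later) accept in place of `text=`. A regular expression is used so that the match does not depend on the exact trailing whitespace.

```python
import re

from bs4 import BeautifulSoup, Tag

# Hypothetical sample: two cells whose port links are marked up identically.
html = (
    '<table><tr>'
    '<td><i class="fa fa-circle"></i>Departure for '
    '<a href="/ports/1" title="View details for: TEKIRDAG">TEKIRDAG</a> </td>'
    '<td><i class="fa fa-circle"></i>Arrival at '
    '<a href="/ports/2" title="View details for: ISTANBUL">ISTANBUL</a> </td>'
    '</tr></table>'
)

soup = BeautifulSoup(html, "html.parser")


def departure_ports(soup):
    """Return the text of every <a> that directly follows a 'Departure for' text node."""
    ports = []
    # string= matches text nodes only, so attribute values such as title are ignored.
    for node in soup.find_all(string=re.compile(r"Departure for")):
        link = node.next_sibling
        # Skip matches whose next sibling is missing or is not a link.
        if isinstance(link, Tag) and link.name == "a":
            ports.append(link.get_text(strip=True))
    return ports


print(departure_ports(soup))  # ['TEKIRDAG']
```

The `isinstance` check keeps the loop from failing with an `AttributeError` when a matching text node happens to be the last child of its parent.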