Daniel - 2 months ago
Python Question

Nested For Loop with Unequal Entities

I would like to scrape the contents of a website with a similar structure to

https://www.wellstar.org/locations/pages/default.aspx

Using the provided website as a framework, I would like to extract the location's name and the heading associated with that location. I want to be able to produce the following:

WellStar Hospitals

WELLSTAR ATLANTA MEDICAL CENTER

WellStar Hospitals

WELLSTAR ATLANTA MEDICAL CENTER SOUTH

...

WellStar Health Parks

ACWORTH HEALTH PARK

...

Thus far I have attempted a nested for loop:

for type in soup.find_all("h3", class_="WebFont SpotBodyGreen"):
    for name in soup.find_all("div", class_="PurpleBackgroundHeading"):
        print(type.text, name.text)


The above for loop returns duplicates because each name is paired with every type, regardless of how they are presented on the website. Any help, whether in the form of code or recommended resources for dealing with this task, would be greatly appreciated.
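
To make the duplication concrete, here is a minimal sketch that uses plain lists as stand-ins for the two find_all() results. The list names and the three sample values are illustrative only; they are not scraped from the page.

```python
# Plain lists standing in for the two find_all() results (sample values only).
headings = ["WellStar Hospitals", "WellStar Health Parks"]
names = [
    "WellStar Atlanta Medical Center",
    "WellStar Atlanta Medical Center South",
    "Acworth Health Park",
]

# The inner loop restarts over ALL names for every heading, so the result is
# the Cartesian product rather than one pair per location.
pairs = [(heading, name) for heading in headings for name in names]

print(len(pairs))  # 6 pairs for only 3 locations
```

Because both searches run against the whole document, nothing ties a name to the heading it appears under.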

Answer

You need a way to group the locations by their heading. To do this, iterate over each block, read its title, and collect that block's locations into a dictionary:

from pprint import pprint

import requests
from bs4 import BeautifulSoup

url = "https://www.wellstar.org/locations/pages/default.aspx"
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")

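d = {}
# The selector treats each table row as one block: a heading plus its locations.
for row in soup.select(".WS_Content > .WS_LeftContent > table > tr"):
    title = row.h3.get_text(strip=True)

    # Search inside this row only, so each name stays with its own heading.
    d[title] = [item.get_text(strip=True) for item in row.select(".PurpleBackgroundHeading a")]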

pprint(d)

Prints (pretty-printed with pprint()):

{'WellStar Community Hospice': ['Tranquility at Cobb Hospital',
                                'Tranquility at Kennesaw Mountain'],
 'WellStar Health Parks': ['Acworth Health Park', 'East Cobb Health Park'],
 'WellStar Hospitals': ['WellStar Atlanta Medical Center',
                        'WellStar Atlanta Medical Center South',
                        'WellStar Cobb Hospital',
                        'WellStar Douglas Hospital',
                        'WellStar Kennestone Hospital',
                        'WellStar North Fulton Hospital',
                        'WellStar Paulding Hospital',
                        'WellStar Spalding Regional Hospital',
                        'WellStar Sylvan Grove Hospital',
                        'WellStar West Georgia Medical Center',
                        'WellStar Windy Hill Hospital'],
 'WellStar Urgent Care Centers': ['WellStar Urgent Care in Acworth',
                                  'WellStar Urgent Care in Kennesaw',
                                  'WellStar Urgent Care in Marietta - Delk '
                                  'Road',
                                  'WellStar Urgent Care in Marietta - East '
                                  'Cobb',
                                  'WellStar Urgent Care in Marietta - '
                                  'Kennestone',
                                  'WellStar Urgent Care in Marietta - Sandy '
                                  'Plains Road',
                                  'WellStar Urgent Care in Smyrna',
                                  'WellStar Urgent Care in Woodstock']}
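
If you want the flat heading/name pairs from the question rather than a dictionary, one possible follow-up is to flatten the grouped result. This is only a sketch: `grouped` is a small hand-copied sample of the output above, and `flatten` is a hypothetical helper name, not part of Beautiful Soup.

```python
# Small sample of the grouped dictionary, copied from the output above.
grouped = {
    "WellStar Hospitals": [
        "WellStar Atlanta Medical Center",
        "WellStar Atlanta Medical Center South",
    ],
    "WellStar Health Parks": ["Acworth Health Park"],
}


def flatten(groups):
    """Return one (heading, name) pair per location, keeping dict order."""
    return [(heading, name) for heading, names in groups.items() for name in names]


pairs = flatten(grouped)
for heading, name in pairs:
    # upper() mirrors the capitalised names shown in the question.
    print(heading, name.upper())
```

Each name is paired only with the heading it was collected under, so no duplicates appear.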