Peter Cui - 4 years ago
Python Question

Python and Selenium: how to pull data from web page text that has no id or class?

I have a website to pull information from.
For example, http://www.worldhospitaldirectory.com/alaska-native-medical-center/info/8500

I need to pull the information and save it into a CSV file. For example,

Category: General Hospitals

Name: Alaska Native Medical Center

Address: 4315 Diplomacy Drive

Phone: (907) 563-2662

City: Anchorage

State: Alaska

The problem is that I cannot locate this information, because the text is not wrapped in any element with an id or class.
The page source looks like this:

<b>Category:</b>
General Hospitals
<br>
<b>Address:</b>
4315 Diplomacy Drive
<br>
<b>Subcontinent and Continent:</b>
North America, America
<br>


Please give me some suggestions or code to help me get this text.

Answer Source
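import requests
import bs4

r = requests.get('http://www.worldhospitaldirectory.com/alaska-native-medical-center/info/8500')
soup = bs4.BeautifulSoup(r.text, 'lxml')
# Use the first <em> element as the starting point; the <b> labels are its later siblings.
start = soup.find('em')

for b in start.find_next_siblings('b'):
    # The value is the bare text node that immediately follows each <b> label.
    print(b.text, b.next_sibling.strip())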

Output:

Category: General Hospitals
Address: 4315 Diplomacy Drive
Subcontinent and Continent: North America            , 
            America
Country: United States
Phone (907) 563-2662
Website: 
City:  
State:  
Email: 
Latitude: 61.1827
Longitude: -149.80009
Zip Code: 99508
Contact Address: 4315 Diplomacy Dr, Anchorage, AK 99508, United States
Latitude in Degree, Minute, Second [Direction]: 61° 10' 57" N
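In the output above, the "Subcontinent and Continent" value keeps the page's internal whitespace, and fields such as "Website" come out empty, probably because the value sits inside a nested tag rather than in the bare text node. A minimal sketch of one way to handle both, run offline on the fragment from the question: the helper name extract_fields is hypothetical, and it assumes every value ends at the next <br> or <b> tag, which may not hold for the whole page.

```python
import bs4

# Fragment copied from the question; the live page is assumed to follow the same pattern.
HTML = """
<b>Category:</b>
General Hospitals
<br>
<b>Address:</b>
4315 Diplomacy Drive
<br>
<b>Subcontinent and Continent:</b>
North America, America
<br>
"""


def extract_fields(html):
    """Map each <b> label to the text that follows it, up to the next <br> or <b>."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    fields = {}
    for b in soup.find_all('b'):
        label = b.get_text(strip=True).rstrip(':')
        parts = []
        for node in b.next_siblings:
            # Assumption: a value never extends past the next <br> or <b> tag.
            if isinstance(node, bs4.Tag) and node.name in ('br', 'b'):
                break
            # Nested tags (for example a link) contribute their text; strings are used as-is.
            parts.append(node.get_text() if isinstance(node, bs4.Tag) else str(node))
        # split() + join() collapses newlines and runs of spaces into single spaces.
        fields[label] = ' '.join(''.join(parts).split())
    return fields


if __name__ == '__main__':
    for label, value in extract_fields(HTML).items():
        print(label + ':', value)
```

Returning a dict instead of printing makes it easier to pick out only the fields you need (Category, Address, Phone, and so on) before saving them.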
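The question also asks to save the result into a CSV file, which the answer does not cover. A minimal sketch using the standard csv module, assuming each hospital has already been collected into a dict of label/value pairs: the write_rows helper, the COLUMNS list, and the hospitals.csv filename are illustrative choices, and the sample record only reuses values quoted in the question.

```python
import csv
import io

# Columns requested in the question; extra scraped fields are ignored.
COLUMNS = ['Category', 'Name', 'Address', 'Phone', 'City', 'State']

# Hypothetical record built from the values listed in the question.
record = {
    'Category': 'General Hospitals',
    'Name': 'Alaska Native Medical Center',
    'Address': '4315 Diplomacy Drive',
    'Phone': '(907) 563-2662',
    'City': 'Anchorage',
    'State': 'Alaska',
    'Latitude': '61.1827',  # not in COLUMNS, so it is dropped on write
}


def write_rows(rows, out, columns=COLUMNS):
    """Write a header plus one CSV line per dict to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=columns,
                            extrasaction='ignore',  # skip keys outside columns
                            restval='')             # blank cell for missing keys
    writer.writeheader()
    writer.writerows(rows)


if __name__ == '__main__':
    # In-memory demo; for a real file use:
    # with open('hospitals.csv', 'w', newline='', encoding='utf-8') as f:
    #     write_rows([record], f)
    buf = io.StringIO()
    write_rows([record], buf)
    print(buf.getvalue())
```

Opening the file with newline='' is what the csv module documentation recommends, so that the writer controls line endings itself.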