KingPey - 7 days ago
Python Question

Parsing options from selection

I am trying to get the following:

<select name="Detect" id="313" class="select" style="display: none;">
<option value="650" maxmad="15" maxpad="2" status="TRUE" context="24"> 5 </option>

<option value="660" maxmad="16" maxpad="2" status="TRUE" context="25"> 6 </option>


I want to scrape the 650 from "value" and the 15 from "maxmad", and print them like this (there are a lot of these options, and I want to print all of them):

650: 15
660: 16
670: 17
etc
etc


Here is what I've tried:

driver = webdriver.PhantomJS()
window = driver.set_window_size(1120, 550)
site = driver.get("www.website.com")
soup = BeautifulSoup(site, "html.parser")
for option in soup.find_all("option"):
    print('id: {}, maxmad: {}'.format(option['id'], option.text))

Answer

There are multiple errors here, and they are harder to find and test because you download the page on every run. You should first make a minimal HTML file that contains the relevant markup and then use that to test the parsing code.
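One way to set up such an offline test, as a minimal sketch: the file name sample.html and the helper parse_file are made up for illustration, and the markup is the snippet from the question with a closing </select> added.

```python
import os
import tempfile

from bs4 import BeautifulSoup

SAMPLE_HTML = """
<select name="Detect" id="313" class="select" style="display: none;">
<option value="650" maxmad="15" maxpad="2" status="TRUE" context="24"> 5 </option>
<option value="660" maxmad="16" maxpad="2" status="TRUE" context="25"> 6 </option>
</select>
"""


def parse_file(path):
    """Parse a saved HTML file and return (value, maxmad) pairs for each option."""
    with open(path, encoding="utf-8") as handle:
        soup = BeautifulSoup(handle.read(), "html.parser")
    return [(opt["value"], opt["maxmad"]) for opt in soup.find_all("option")]


# Save the sample once, then run the parser against the local copy
# instead of downloading the page on every attempt.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.html")  # hypothetical file name
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE_HTML)
    pairs = parse_file(sample_path)

print(pairs)
```

Once the parsing works on the local copy, the same function body can be pointed at the HTML of the real page.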

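The first thing you'll notice is that driver.get() returns nothing, so site is None and BeautifulSoup has no HTML to parse; pass driver.page_source instead and call find_all on soup. After fixing that you'll find that option has no id attribute: the id belongs to select, so search for select to get it, and read value and maxmad from each option: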

from bs4 import BeautifulSoup

html_str = """
<html>
<body>
<select name="Detect" id="313" class="select" style="display: none;">
<option value="650" maxmad="15" maxpad="2" status="TRUE" context="24"> 5 </option>
<option value="660" maxmad="16" maxpad="2" status="TRUE" context="25"> 6 </option>
</select>
</body>
</html>
"""

soup = BeautifulSoup(html_str, "html.parser")
for select in soup.select("select"):
    ident = select['id']  # the id lives on <select>, not on <option>
    for option in select.find_all("option"):
        print('value: {}, maxmad: {}'.format(option['value'], option['maxmad']))

which gives:

value: 650, maxmad: 15
value: 660, maxmad: 16
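
To get exactly the "650: 15" layout asked for in the question, the same loop can be wrapped in a small helper. This is only a sketch: option_lines is a hypothetical name, and with Selenium the HTML string would come from driver.page_source after calling driver.get(...), since driver.get() itself returns nothing.

```python
from bs4 import BeautifulSoup


def option_lines(html):
    """Return one 'value: maxmad' string per <option> that carries both attributes."""
    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for option in soup.find_all("option"):
        value = option.get("value")
        maxmad = option.get("maxmad")
        # Skip options that lack either attribute instead of raising KeyError.
        if value is not None and maxmad is not None:
            lines.append("{}: {}".format(value, maxmad))
    return lines


html_str = """
<select name="Detect" id="313" class="select" style="display: none;">
<option value="650" maxmad="15" maxpad="2" status="TRUE" context="24"> 5 </option>
<option value="660" maxmad="16" maxpad="2" status="TRUE" context="25"> 6 </option>
</select>
"""

# With Selenium, pass driver.page_source here instead of html_str.
for line in option_lines(html_str):
    print(line)
```

Using option.get(...) rather than option[...] means an option without a maxmad attribute is skipped quietly; whether that is the right behaviour depends on the real page.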