Hyperion - 2 years ago
Python Question

python/beautifulsoup - find multiple anchors inside class

I have a page structured like this:

<div class="multiple_links">
<a href="http://www.example.org/link1"> link1 </a>
<a href="http://www.example.org/link2"> link2 </a>
<a href="http://www.example.org/link3"> link3 </a>
</div>

<div class="multiple_links">
<a href="http://www.example.org/link4"> link4 </a>
<a href="http://www.example.org/link5"> link5 </a>
<a href="http://www.example.org/link6"> link6 </a>
</div>

I want to extract the third link from every div with this class. I've tried this:

urls = soup.findAll('div', {'class': 'multiple_links'})
for element in urls:
    # find() returns only the first matching anchor inside this div
    url = element.find('a', href=True)
    print(url['href'])
>> http://www.example.org/link1
>> http://www.example.org/link4

But this finds only the first anchor in each div. The output I need is:

>> http://www.example.org/link3
>> http://www.example.org/link6

Any ideas?

Answer Source

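Inside the loop, use anchors = element.findAll('a', href=True) instead of element.find(...), just as you used findAll to locate the divs. find returns only the first match, while findAll returns a list of all matches. (A separate name such as anchors avoids reusing urls, which already holds the divs.)

anchors will then contain 3 elements on each iteration, so the third link is anchors[2]['href'].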
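
A minimal sketch of the whole thing, assuming BeautifulSoup 4 (the bs4 package) and the built-in html.parser backend. The helper name third_links, its parameters, and the SAMPLE_HTML string are made up for illustration; find_all is the bs4 spelling of findAll, which still works as an older alias.

```python
from bs4 import BeautifulSoup

# Sample markup mirroring the question (hypothetical test data).
SAMPLE_HTML = """
<div class="multiple_links">
<a href="http://www.example.org/link1"> link1 </a>
<a href="http://www.example.org/link2"> link2 </a>
<a href="http://www.example.org/link3"> link3 </a>
</div>
<div class="multiple_links">
<a href="http://www.example.org/link4"> link4 </a>
<a href="http://www.example.org/link5"> link5 </a>
<a href="http://www.example.org/link6"> link6 </a>
</div>
"""


def third_links(markup, css_class="multiple_links", position=2):
    """Return the href of the anchor at `position` (0-based) in each matching div."""
    soup = BeautifulSoup(markup, "html.parser")
    hrefs = []
    for div in soup.find_all("div", class_=css_class):
        # find_all returns every anchor with an href, not just the first one
        anchors = div.find_all("a", href=True)
        # Skip divs that have fewer anchors than expected instead of raising IndexError
        if len(anchors) > position:
            hrefs.append(anchors[position]["href"])
    return hrefs


for href in third_links(SAMPLE_HTML):
    print(href)
```

With the sample markup above this prints the link3 and link6 URLs. The length check is a design choice for pages where some divs hold fewer than three links; drop it if you would rather get an error in that case.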
