st4rgut - 10 months ago
Python Question

How to find links within a specified class with Beautiful Soup

I'm using Beautiful Soup 4 to parse a news site for links contained in the body text. I was able to find all the paragraphs that contain the links, but paragraph.get('href') returned None for each link. I'm using Python 3.5.1. Any help is really appreciated.

from bs4 import BeautifulSoup
import urllib.request
import re

soup = BeautifulSoup("", "html.parser")

for paragraph in soup.find_all("div", class_="zn-body__paragraph"):
    print(paragraph.get('href'))  # prints None for every <div>

Answer Source

Do you really want this?

for paragraph in soup.find_all("div", class_="zn-body__paragraph"):
    for a in paragraph("a"):
        print(a.get('href'))

Note that paragraph.get('href') looks for an href attribute on the <div> tag you found. Since a <div> has no such attribute, it returns None. Most probably you need to find all <a> tags that are descendants of your <div> (this can be done with paragraph("a"), which is a shortcut for paragraph.find_all("a")) and then look at the href attribute of each <a> element.
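
To make the difference concrete, here is a minimal, self-contained sketch. The HTML string is a made-up stand-in for the news page (only the class name zn-body__paragraph comes from the question), and the variable names are illustrative. It shows that get('href') on the <div> yields None, while iterating over the nested <a> tags yields the links.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page; only the
# class name "zn-body__paragraph" is taken from the question.
html = """
<div class="zn-body__paragraph">Read <a href="https://example.com/a">this</a>
and <a href="/b">that</a>.</div>
<div class="zn-body__paragraph">No links here.</div>
<div class="other"><a href="/ignored">skip</a></div>
"""

soup = BeautifulSoup(html, "html.parser")
paragraphs = soup.find_all("div", class_="zn-body__paragraph")

# The <div> itself carries no href attribute, so get() returns None.
div_hrefs = [div.get("href") for div in paragraphs]

# Look at the <a> descendants of each matching <div> instead.
# href=True skips anchors that have no href attribute at all.
links = [a["href"] for div in paragraphs for a in div.find_all("a", href=True)]

# The same query written as a single CSS selector.
links_css = [a["href"] for a in soup.select("div.zn-body__paragraph a[href]")]

print(div_hrefs)
print(links)
```

Passing href=True is optional, but it avoids a KeyError on anchors without an href if you index with a["href"] rather than calling a.get("href").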