Albert Albert - 9 months ago 44
HTML Question

How to get the URL and the title from the <a> tags with BeautifulSoup

I'm writing a script that collects all the links inside the divs with class="pntc-txt". From each <a> tag I want the href attribute and also the text between the tags (the "Something" in <a>Something</a>), so that I can then insert the URL and the text into a database. Here is the code I have so far:

import urllib.request
from bs4 import *

sock = urllib.request.urlopen("")
htmlSource = sock.read()

soup = BeautifulSoup(htmlSource)

for div in soup.findAll('div', {'class': 'pntc-txt'}):
    a = div.findAll('a')
    print(a)

Answer

Try this:

import requests
from bs4 import BeautifulSoup

srcCode = requests.get("")
plainText = srcCode.text

# name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(plainText, "html.parser")

for div in soup.find_all('div', {'class': 'pntc-txt'}):
    for each in div.find_all('a'):   # get all 'a' elements in this div
        href = each.get('href')      # None if the tag has no href
        print(href)                  # print the href
        print(each.string)           # print the text inside the tag
        print(each)                  # print the whole tag

Note: I also replaced the urllib code that read the HTML page with the requests package.
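
Since the question also mentions inserting the URL and text into a database, here is a rough sketch of how the loop above could collect (href, text) pairs and store them. The question does not say which database is used, so sqlite3, the links table, the function names, and the sample markup are all assumptions made only for illustration. It uses get_text() rather than .string, because .string returns None when the link text contains nested tags.

```python
import sqlite3
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<div class="pntc-txt"><a href="http://example.com/one">First link</a></div>
<div class="pntc-txt"><a href="http://example.com/two">Second <b>link</b></a><a>No href</a></div>
<div class="other"><a href="http://example.com/skip">Ignored</a></div>
"""


def extract_links(html):
    """Return (href, text) pairs for every <a href=...> inside div.pntc-txt."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for div in soup.find_all("div", class_="pntc-txt"):
        # href=True skips anchors that have no href attribute
        for a in div.find_all("a", href=True):
            # get_text() still works when the link text contains nested tags
            links.append((a["href"], a.get_text(" ", strip=True)))
    return links


def save_links(conn, links):
    """Insert the pairs into a table (table and column names are assumed)."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (url TEXT, title TEXT)")
    # parameterized query: the scraped values are never pasted into the SQL
    conn.executemany("INSERT INTO links (url, title) VALUES (?, ?)", links)
    conn.commit()


links = extract_links(SAMPLE_HTML)
conn = sqlite3.connect(":memory:")  # in-memory database, just for the demo
save_links(conn, links)
for url, title in conn.execute("SELECT url, title FROM links"):
    print(url, title)
```

To run this against the real page, pass plainText from the answer to extract_links() instead of SAMPLE_HTML, and replace the connection with the one for your own database.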