J.C. Diaz - 1 year ago
HTML Question

Scraping IMDb.com with BeautifulSoup in Python, but can't get the href from a movie link

I'm trying to get the href link for a movie (for example, from a search for Iron Man on IMDb), but I keep getting "None" when I run the code. If I remove .get('href'), the code returns the entire line of HTML, including the link I want. I appreciate any help with this. Thanks!

from bs4 import BeautifulSoup
import requests
from urllib.parse import urljoin # For joining next page url with base url

search_terms = input("What movie do you want to know about?\n> ").split()

url = "http://www.imdb.com/find?ref_=nv_sr_fn&q=" + '+'.join(search_terms) + '&s=all'

def scrape_find_next_page(url):
    headers = {'User-Agent': 'Mozilla/5.0'}
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")

    next_page = soup.find('td', 'result_text').get('href')

    return next_page

next_page_url = scrape_find_next_page(url)

Answer

You are calling .get('href') on the td element, which has no href attribute, so .get() returns None. The href belongs to the a tag nested inside that td, so you need to find that tag first:

next_page = soup.find('td', 'result_text').find('a').get('href')
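To see the difference without hitting the network, here is a minimal sketch that runs the same lookup against a small inline HTML snippet. The markup, the placeholder title ID, and the helper name first_result_url are assumptions made up for illustration; the real IMDb page may be structured differently. The sketch also guards against a missing td or a tag, and uses urljoin (already imported in the question) to turn the relative href into a full URL.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup imitating one search-result row; the real page may differ.
SAMPLE_HTML = """
<table class="findList">
  <tr>
    <td class="primary_photo"><a href="/title/tt0000001/"><img src="poster.jpg"></a></td>
    <td class="result_text"><a href="/title/tt0000001/?ref_=fn_al_tt_1">Example Movie</a> (2008)</td>
  </tr>
</table>
"""

BASE_URL = "http://www.imdb.com/"


def first_result_url(html, base_url=BASE_URL):
    """Return the absolute URL of the first result link, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # A string as the second argument to find() is matched against the class.
    cell = soup.find("td", "result_text")
    if cell is None:
        return None
    # The href lives on the <a> inside the <td>, not on the <td> itself.
    link = cell.find("a", href=True)
    if link is None:
        return None
    # The href is relative, so join it with the site's base URL.
    return urljoin(base_url, link["href"])


td = BeautifulSoup(SAMPLE_HTML, "html.parser").find("td", "result_text")
print(td.get("href"))                         # the <td> has no href attribute
print(first_result_url(SAMPLE_HTML))          # absolute URL of the nested link
print(first_result_url("<p>no results</p>"))  # no matching <td> at all
```

Checking for None before chaining .find() avoids an AttributeError when a search returns no results, which the one-line version above would raise.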
