Adam Warner - 6 months ago
Python Question

Web Scrape in Python

I am trying to scrape the first table on https://en.wikipedia.org/wiki/FIFA_World_Rankings, but it does not work and I get the error 'NoneType' object is not callable.

Here is my code:

from bs4 import BeautifulSoup
import urllib2

soup = BeautifulSoup(urllib2.urlopen("https://en.wikipedia.org/wiki/FIFA_World_Rankings").read())

for row in soup('table', {'class': 'wikitable'})[0].tbody('tr'):
    tds = row('td')
    print tds[0].string, tds[1].string


I don't know much about HTML and I know very little about web scraping.

Answer

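This should work. The error means the code tried to call None: an attribute lookup such as .tbody returns None when the parsed table has no matching child tag, and None('tr') then fails. Use find_all('tr') on the table itself to look for the rows. Also, in the Wiki article, the team ranks are in table rows 3-22, hence the if condition.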

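from bs4 import BeautifulSoup
import urllib2

# Download the page and parse it with an explicitly named parser
html = urllib2.urlopen("https://en.wikipedia.org/wiki/FIFA_World_Rankings").read()
soup = BeautifulSoup(html, "html.parser")

# Search the first wikitable directly for rows instead of going through .tbody
table = soup('table', {'class': 'wikitable'})[0]
for i, row in enumerate(table.find_all('tr')):
    if 2 < i < 23:  # rows 3-22 hold the team ranks
        data = row.find_all('td')
        print i, data[0].text, data[1].text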
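
To see why the original attempt failed without hitting the network, here is a minimal Python 3 sketch on an inline snippet. SAMPLE_HTML and extract_rows are made up for illustration and are not part of the question or the Wikipedia page. It assumes the markup has no tbody element and that html.parser is used, since that parser does not add one. Instead of hard-coding row numbers, it skips any row with fewer than two td cells, which is a different filter from the one in the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the Wikipedia table (assumption: no <tbody> in the markup).
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Rank</th><th>Team</th></tr>
  <tr><td>1</td><td>Team A</td></tr>
  <tr><td>2</td><td>Team B</td></tr>
</table>
"""


def extract_rows(html):
    """Return (first cell, second cell) text pairs from the first wikitable."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "wikitable"})
    rows = []
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) < 2:  # header rows hold <th> cells, so they are skipped
            continue
        rows.append((cells[0].get_text(strip=True), cells[1].get_text(strip=True)))
    return rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# No <tbody> child exists, so the attribute lookup gives None;
# calling it, as in .tbody('tr'), would raise the TypeError from the question.
print(soup.find("table").tbody)
print(extract_rows(SAMPLE_HTML))
```

For the real page, the same function could be fed the downloaded HTML, though the column positions there may differ from this toy table.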