Simon Breton - 1 month ago
Python Question

error handling with BeautifulSoup when scraped url doesn't respond

I'm a total noob with Python, so please forgive my mistakes and lack of vocabulary. I'm trying to scrape some URLs with BeautifulSoup. The URLs come from a Google Analytics API call, and some of them don't respond.

How do I build my script so that BeautifulSoup ignores the URLs that don't return anything?

Here is my code:

if results:
    for row in results.get('rows'):
        rawdata.append(row[0])
else:
    print 'No results found'

urllist = [mystring + x for x in rawdata]

for row in urllist[4:8]:
    page = urllib2.urlopen(row)
    soup = BeautifulSoup(page, 'html.parser')

    name_box = soup.find(attrs={'class': 'nb-shares'})
    share = name_box.text.strip()

    # save the data in tuple
    sharelist.append((row, share))

print(sharelist)


I tried to use this:

except Exception:
    pass


but I don't know where to put it and got a syntax error. I've looked at other questions but cannot find an answer that works for me.

Answer

You may check the value of the name_box variable first; it will be None if nothing was found:

for row in urllist[4:8]:  
    page = urllib2.urlopen(row)
    soup = BeautifulSoup(page, 'html.parser')

    name_box = soup.find(attrs={'class': 'nb-shares'})
    if name_box is None:
        continue

    # ...
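
To also skip the URLs that don't respond at all, you can wrap the urlopen() call in a try/except as well; urllib2.urlopen() raises urllib2.URLError (or its HTTPError subclass) when a request fails. A rough sketch combining both checks, assuming urllist is built the same way as in your code:

import urllib2
from bs4 import BeautifulSoup

sharelist = []

for row in urllist[4:8]:
    try:
        page = urllib2.urlopen(row)
    except urllib2.URLError:
        # covers dead hosts, timeouts and HTTP error codes (HTTPError)
        continue

    soup = BeautifulSoup(page, 'html.parser')

    name_box = soup.find(attrs={'class': 'nb-shares'})
    if name_box is None:
        continue  # page has no nb-shares element

    sharelist.append((row, name_box.text.strip()))

print(sharelist)

Whether you catch urllib2.URLError specifically or a broader Exception is up to you; the narrower one avoids silently hiding real bugs elsewhere in the loop.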