Giangio Giangio - 10 months ago 78
Python Question

Scraping: cannot access information from web

I am scraping some information from this url:

Everything worked until I tried to scrape the description.
I have tried repeatedly, but so far without success.
It seems I can't reach that information. Here is my code:

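import urllib
from bs4 import BeautifulSoup
html = urllib.urlopen("")  # Python 2 API; in Python 3 this is urllib.request.urlopen
tree = BeautifulSoup(html, "lxml")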

Does anyone have a suggestion?


The description does not appear to be in the initial HTML, so you need to make an additional request to get it, sending the CSRF token from the first response. Here is a complete working example using requests + BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url = ""
with requests.Session() as session:
    session.headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
    }

    # get the token
    response = session.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    token = soup.find("meta", {"name": "csrf-token"})["content"]

    # get the description
    description_url = url + "description"
    response = session.get(description_url, headers={"X-CSRF-Token": token, "X-Requested-With": "XMLHttpRequest"})

    soup = BeautifulSoup(response.content, "html.parser")
    description = soup.find('div', {'id':'description_section', 'class': 'description-section'})
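Note that `description` above is a BeautifulSoup tag (or `None` if nothing matched), not plain text, and that the token lookup raises a `TypeError` if the meta tag is missing. As a rough sketch only, the two parsing steps could be wrapped so they fail gracefully. The helper names `extract_csrf_token` and `extract_description` and the sample markup are made up for illustration; the real page structure may differ.

```python
from bs4 import BeautifulSoup


def extract_csrf_token(html):
    """Return the content of the csrf-token meta tag, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    meta = soup.find("meta", {"name": "csrf-token"})
    return meta.get("content") if meta else None


def extract_description(html):
    """Return the description text, or None if the section is absent."""
    soup = BeautifulSoup(html, "html.parser")
    section = soup.find("div", {"id": "description_section"})
    # get_text joins the text of all nested tags, separated by a single space
    return section.get_text(" ", strip=True) if section else None


# Hypothetical sample markup, used only to exercise the helpers offline.
sample_page = '<html><head><meta name="csrf-token" content="abc123"></head></html>'
sample_fragment = (
    '<div id="description_section" class="description-section">'
    "<p>Hello</p><p>world</p></div>"
)

print(extract_csrf_token(sample_page))
print(extract_description(sample_fragment))
```

In the session code above, you would pass `response.content` to these helpers and check for `None` before sending the second request or using the text.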