
Web scraping a specific page with Python

Recently I've been learning web scraping with Python and Beautiful Soup. However, I've hit a snag when trying to scrape the following page:


The data I want from the page is the book's tags, but I can't find any way to get them despite spending a lot of time trawling the internet.

I tried following a few guides online, but none of them seemed to work. I also tried converting the page to XML and JSON, but I still couldn't find the data.

I'm pretty stumped at the moment and would appreciate any help.


Answer

Analyzing the page's HTML and scripts shows that the tags are not in the initial page source; they are loaded afterwards through AJAX. Requesting the AJAX URL directly makes life easy. Here is the Python script.

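import requests
from bs4 import BeautifulSoup

# AJAX endpoint that the page calls to load its tag cloud
url = "http://www.librarything.com/ajax_work_makeworkCloud.php?work=3203347&check=2801929225"
content = requests.get(url).text
# Name the parser explicitly so Beautiful Soup does not have to guess one
soup = BeautifulSoup(content, "html.parser")

# Each tag in the cloud is an <a> element; print its text
for tag in soup.find_all('a'):
    print(tag.get_text())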
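
If you want to check the parsing step without hitting the network, here is a minimal sketch that pulls the text of every `<a>` element out of an HTML fragment. It uses the standard library's `html.parser` in place of Beautiful Soup so that it has no dependencies. The names `AnchorTextParser`, `extract_tags` and `SAMPLE` are made up for illustration, and the sample markup is an assumption about what the AJAX response roughly looks like, not a copy of it.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collect the visible text of every <a> element in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self._depth = 0    # greater than 0 while inside an <a> element
        self._parts = []   # text pieces of the anchor currently being read
        self.tags = []     # finished anchor texts, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                text = "".join(self._parts).strip()
                if text:
                    self.tags.append(text)
                self._parts = []

    def handle_data(self, data):
        # Keep text only while we are inside an anchor
        if self._depth:
            self._parts.append(data)


def extract_tags(html):
    """Return the text of each <a> element found in the given HTML string."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical fragment, invented for illustration only
SAMPLE = (
    '<span class="tag"><a href="/tag/fantasy">fantasy</a></span> '
    '<span class="tag"><a href="/tag/fiction">fiction</a></span>'
)

print(extract_tags(SAMPLE))  # ['fantasy', 'fiction']
```

To use it on the real response, pass the text returned by `requests.get(...)` to `extract_tags` instead of `SAMPLE`.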
