Recently I've been learning web scraping with Python and Beautiful Soup. However, I've hit a bit of a bump when trying to scrape the following page:
The data I want from the page is the tags for the book, but I can't find any way to get it despite spending a lot of time trawling the internet.
I tried following a few guides online but none of them seemed to work. I tried converting the page to XML and JSON but I still couldn't find the data.
Pretty stumped at the moment and I'd appreciate any help.
After analyzing the HTML and scripts, I found that the tags are loaded through AJAX, so requesting the AJAX URL directly makes life easy. Here is the Python script:
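import requests
from bs4 import BeautifulSoup

# Fetch the AJAX endpoint that returns the tag cloud.
url = "http://www.librarything.com/ajax_work_makeworkCloud.php?work=3203347&check=2801929225"
content = requests.get(url).text

# Name the parser explicitly to avoid bs4's "no parser specified" warning.
soup = BeautifulSoup(content, "html.parser")
for tag in soup.find_all('a'):
    print(tag)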
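If you only want the tag names rather than the whole `<a>` elements, something along these lines might work. This is a minimal sketch: `sample_html` is a made-up fragment standing in for the AJAX response (the real markup may well differ), and `extract_tag_names` is a hypothetical helper name, not part of any library.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the AJAX response; the real markup may differ.
sample_html = """
<span class="tag"><a href="/tag/fantasy">fantasy</a></span>
<span class="tag"><a href="/tag/fiction">fiction</a></span>
<span class="tag"><a href="/tag/magic">magic</a></span>
"""


def extract_tag_names(html):
    """Return the visible text of every <a> element in the fragment."""
    soup = BeautifulSoup(html, "html.parser")
    # get_text(strip=True) drops the markup and surrounding whitespace.
    return [a.get_text(strip=True) for a in soup.find_all("a")]


tag_names = extract_tag_names(sample_html)
print(tag_names)
```

With the live page you would pass the text returned by `requests.get(...)` instead of `sample_html`. If the response contains links that are not tags, you would probably need to narrow the `find_all` call, for example by a class or `href` pattern taken from the actual markup.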