I'm writing code that is attempting to extract the text from the Library of Babel.
The site splits its library of randomly generated text into Hexes, Walls, Shelves, Volumes and Pages. Here is an example: https://libraryofbabel.info/book.cgi?2-w1-s2-v22:1
Here we have Hex: 2, Wall: 1, Shelf: 2, Volume: 22, Page: 1.
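To make that mapping explicit, here is a minimal sketch of how those five values line up with the example URL. The helper name build_book_url is hypothetical and only mirrors the pattern visible in the link above; it is not an API provided by the site.

```python
# Base address taken from the example link above.
BASE_URL = "https://libraryofbabel.info/book.cgi"


def build_book_url(hex_id, wall, shelf, volume, page):
    """Assemble a page URL in the form <hex>-w<wall>-s<shelf>-v<volume>:<page>.

    Hypothetical helper: it only reproduces the query-string layout
    seen in the example URL and does not contact the site.
    """
    return f"{BASE_URL}?{hex_id}-w{wall}-s{shelf}-v{volume}:{page}"


# Rebuilds the example: Hex 2, Wall 1, Shelf 2, Volume 22, Page 1.
print(build_book_url(2, 1, 2, 22, 1))
# → https://libraryofbabel.info/book.cgi?2-w1-s2-v22:1
```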
Ideally, I would like to pick a random page across all of these variables and extract its text, but I am not getting the output I expect.
Here is my code:
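import random  # needed for random.randint below
from urllib.parse import urlparse  # Python 3 location of urlparse

import requests  # needed for requests.get below
from bs4 import BeautifulSoup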
hex = str(random.randint(0, 6))
wall = str(random.randint(1, 4))
shelf = str(random.randint(1, 5))
vol = str(random.randint(1, 32))
page = str(random.randint(1, 410))
print("Fetching: " + " Hex: " + hex + ", Wall: " + wall + ", Shelf: " + shelf + ", Vol: " + vol + ", Page: " + page)
babel_url = str("https://libraryofbabel.info/browse.cgi?" + hex + "-w" + wall + "-s" + shelf + "-v" + vol + ":" + page)
r = requests.get(babel_url)
soup = BeautifulSoup(r.text, "html.parser")  # name the parser explicitly
Put on your glasses:
You are requesting
browse.cgi instead of