TheMetaHorde - 11 months ago
Python Question

How to use Python to interpret a URL

I'm writing code that is attempting to extract the text from the Library of Babel.

They basically use a system of Hexes, Walls, Shelves, Volumes and Pages to split up their library of randomly generated text files. Here is an example (
Here we have Hex: 2, Wall: 1, Shelf: 2, Volume: 22, Page: 1.

I would ideally like to pick a random page across all these variables and extract its text; however, I am not getting the output I expect.

Here is my code:

import random

import requests
from bs4 import BeautifulSoup

# Pick a random location; "hex_name" avoids shadowing the built-in hex().
hex_name = str(random.randint(0, 6))
wall = str(random.randint(1, 4))
shelf = str(random.randint(1, 5))
vol = str(random.randint(1, 32))
page = str(random.randint(1, 410))

print("Fetching: Hex: " + hex_name + ", Wall: " + wall + ", Shelf: " + shelf + ", Vol: " + vol + ", Page: " + page)
babel_url = "" + hex_name + "-w" + wall + "-s" + shelf + "-v" + vol + ":" + page
r = requests.get(babel_url)
# Name the parser explicitly so BeautifulSoup does not have to guess one.
soup = BeautifulSoup(r.text, "html.parser")

My output is identical to what I get when I hard-code the URL instead of generating it. print(babel_url) shows me that the URL is written the way I intended, but something isn't interpreting it the way I want.

I've found that just pasting the generated URL into Chrome does not drop me at the page I asked for. But if I first navigate to a page (any page), I can move between pages at will.

The only thing I get in the output worth mentioning is:

It appears your browser has javascript disabled. Follow this link to browse without javascript.

Answer Source

Put on your glasses:
You are requesting browse.cgi instead of book.cgi.
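
To make the difference concrete, here is a minimal sketch that builds both URLs from the same coordinates. The base address `https://example.invalid` is a hypothetical placeholder (the real address is not shown in this thread), `build_babel_url` and `random_location` are names made up for illustration, and putting the location string after a `?` is an assumption about how the script is called. Only the `hex-wN-sN-vN:page` layout and the random ranges come from the question.

```python
import random

# Hypothetical placeholder; the real site address is not given in this thread.
BASE_URL = "https://example.invalid"


def build_babel_url(script, hex_name, wall, shelf, vol, page, base=BASE_URL):
    """Join the coordinates in the hex-wN-sN-vN:page layout from the question."""
    location = "{}-w{}-s{}-v{}:{}".format(hex_name, wall, shelf, vol, page)
    # Assumption: the location is passed to the script as a query string.
    return "{}/{}?{}".format(base, script, location)


def random_location():
    """Pick random coordinates using the same ranges as the question's code."""
    return {
        "hex_name": random.randint(0, 6),
        "wall": random.randint(1, 4),
        "shelf": random.randint(1, 5),
        "vol": random.randint(1, 32),
        "page": random.randint(1, 410),
    }


if __name__ == "__main__":
    loc = random_location()
    print(build_babel_url("browse.cgi", **loc))  # the script the question requested
    print(build_babel_url("book.cgi", **loc))    # the script this answer points to
```

Keeping the script name as a parameter means the two requests differ in exactly one place, which makes this kind of mix-up easier to spot when you print the URL.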