forest forest - 6 months ago 15
Python Question

Python: What is returned when I use requests.get('url') and print r.text?

I'm trying to scrape this webpage. This code works:

import requests

# Send a browser-like User-Agent header with the request.
header = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0',
}
r = requests.get('http://www.machinefinder.com/ww/en-US/categories/used-drawn-planters', headers=header)
print(r.text)  # the response body, decoded to text


but I'm not sure what the text it returns really is. I wish it were JSON so that I could adapt other examples I've found that parse JSON.

Note: my work security blocks the webpage and says "Illegal Web Browser" when I use

header = {
    'Content-Type': 'application/json;charset=UTF-8',
}


which is why I'm sending a Firefox user-agent header instead.

Answer
>>> type(r.text)
<type 'unicode'>

It looks to be the HTML for the page, returned as a unicode string rather than JSON. You could use Beautiful Soup to parse it: https://www.crummy.com/software/BeautifulSoup/bs3/documentation.html
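
To give a rough idea of what parsing that HTML string looks like, here is a minimal sketch. It uses the standard library's HTMLParser instead of Beautiful Soup so that it runs without installing anything. The `LinkCollector` class, the `extract_links` helper and the `sample_html` string are made up for illustration; the real page's markup will differ, so treat this as a starting point, not as the way to scrape that site.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical stand-in for r.text; not the real page content.
sample_html = """
<html><body>
  <ul>
    <li><a href="/machines/1">Planter A</a></li>
    <li><a href="/machines/2">Planter B</a></li>
  </ul>
</body></html>
"""

links = extract_links(sample_html)
for href, text in links:
    print(href + ' ' + text)
```

With the real response you would call `extract_links(r.text)`. If you do install Beautiful Soup (the current package is bs4), the same idea is shorter: `BeautifulSoup(r.text, 'html.parser').find_all('a')` returns the link tags directly.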
