I am trying to get some information from Google Finance, but I keep getting this error:
AttributeError: 'HTTPResponse' object has no attribute 'split'
Here is my Python code:
import urllib.request
from bs4 import BeautifulSoup

symbolsfile = open("Stocklist.txt")
symbolslist = symbolsfile.read()
thesymbolslist = symbolslist.split("\n")
i = 0
while i < len(thesymbolslist):
    theurl = "http://www.google.com/finance/getprices?q=" + thesymbolslist[i] + "&i=10&p=25m&f=c"
    thepage = urllib.request.urlopen(theurl)
    # the next line is the one that raises the AttributeError
    print(thesymbolslist[i] + " price is " + thepage.split()[len(thepage.split())-1])
    i += 1
This is because urllib.request.urlopen(theurl) returns an HTTPResponse object representing the connection, not a string, so it has no split() method.
To read the data from this connection, call read() on it:
thepage = urllib.request.urlopen(theurl).read()
In Python 3, read() returns a bytes object rather than a regular string. split() works on it, but concatenating the result with a regular string in print() raises a TypeError until the bytes have been decoded.
The right approach to dealing with that is to find the correct character encoding and decode the bytestring into a regular string using it, as seen in this question:
thepage = urllib.request.urlopen(theurl)
# read the character encoding from the `Content-Type` response header
charset_encoding = thepage.info().get_content_charset()
# decode the bytes using that encoding
thepage = thepage.read().decode(charset_encoding)
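One caveat with the snippet above: get_content_charset() returns None when the server declares no charset, and decode(None) raises a TypeError. Here is a minimal sketch of a fallback. The helper name decode_body and the utf-8 default are assumptions made for this example, and a hand-built Message object stands in for thepage.info() so it runs without a network connection:

```python
from email.message import Message


def decode_body(raw, headers, default="utf-8"):
    """Decode raw response bytes using the charset declared in the headers.

    `headers` is whatever response.info() returns; it exposes
    get_content_charset(), which yields None when no charset is declared.
    """
    charset = headers.get_content_charset() or default
    return raw.decode(charset)


# Offline demonstration: a hand-built header object plays the role of
# response.info(), and the body is encoded as Latin-1 to match it.
headers = Message()
headers["Content-Type"] = "text/plain; charset=ISO-8859-1"
text = decode_body("caf\u00e9 12.5".encode("latin-1"), headers)
print(text.split()[-1])
```

With a live response this would be called as decode_body(thepage.read(), thepage.info()).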
It is sometimes safe to assume that the character encoding is utf-8, in which case
thepage = urllib.request.urlopen(theurl).read().decode('utf-8')
works more often than not. It's a statistically good guess, if nothing else.
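If you go with the utf-8 guess, passing errors="replace" keeps a single malformed byte from raising UnicodeDecodeError. Here is a rough sketch that puts the pieces together. The function names last_price and fetch_last_price are made up for illustration, and fetch_last_price assumes the endpoint from the question still responds:

```python
import urllib.request


def last_price(raw):
    """Return the last whitespace-separated token of a response body."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    text = raw.decode("utf-8", errors="replace")
    return text.split()[-1]


def fetch_last_price(symbol):
    """Fetch the page for one symbol and return its last token (needs network)."""
    url = ("http://www.google.com/finance/getprices?q=" + symbol
           + "&i=10&p=25m&f=c")
    with urllib.request.urlopen(url) as response:
        return last_price(response.read())


# Offline check on sample bytes; 0xff is not valid UTF-8 on its own.
print(last_price(b"COLUMNS=CLOSE\n\xff\n101.5\n"))
```

Note that last_price raises an IndexError on an empty body, so a real script would want to check for that before indexing.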