windboy windboy - 10 months ago 42
JSON Question

Getting list of keywords from JSON

I have run into a problem and I don't understand why my output looks the way it does.

Below is my code (please forgive the bad formatting, I am new to programming). It opens a text file that contains a list of keywords, one per line.

import urllib2
import json

f1 = open('CatList.text')
lines = f1.readlines()

for line in lines:

    url =''+line+'&cmlimit=100'

    json_obj = urllib2.urlopen(url)
    data = json.load(json_obj)

    #to write the result
    f2 = open('SubList.text', 'w')

    for item in data['query']:

        for i in data['query']['categorymembers']:


I get the error:

Traceback (most recent call last):
  File "", line 16, in <module>
    json_obj = urllib2.urlopen(url)
  File "/usr/lib/python2.7/", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/", line 402, in open
    req = meth(req)
  File "/usr/lib/python2.7/", line 1113, in do_request_
    raise URLError('no host given')
urllib2.URLError: <urlopen error no host given>

I am not sure what this error means, so I tried printing the URL instead:

import urllib2
import json

f1 = open('CatList.text')
f2 = open('SubList.text', 'w')
lines = f1.readlines()

for line in lines:

    url =''+line+'&cmlimit=100'
    print url


The results I got were strange (below is part of the output):

of geography
&cmlimit=100 by place
&cmlimit=100 awards and competitions
&cmlimit=100 conferences
&cmlimit=100 education
&cmlimit=100 studies
&cmlimit=100 zones
&cmlimit=100 corridors
&cmlimit=100 of geography
&cmlimit=100 systems
&cmlimit=100 lists

Notice that each URL is broken across two lines: one line ends with lists and the next line starts with &cmlimit=100, instead of a single line ending in lists&cmlimit=100.

My first question is: how can I fix this?

Secondly, is this what is causing the error?

My CatList.text is as follows:

Category:Branches of geography
Category:Geography by place
Category:Geography awards and competitions
Category:Geography conferences
Category:Geography education
Category:Environmental studies
Category:Geographical zones
Category:Geopolitical corridors
Category:History of geography
Category:Land systems
Category:Geography-related lists
Category:Lists of countries by geography
Category:Geography organizations
Category:Geographical regions
Category:Geographical technology
Category:Geography terminology
Category:Works about geography
Category:Geographic images
Category:Geography stubs

Sorry for the long post. I really appreciate your help. Thank you.


Change the following line

url =''+line+'&cmlimit=100'

to

url =''+line.strip()+'&cmlimit=100'

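Each line returned by readlines() still ends with its line-feed (\n) character, so the newline lands in the middle of the URL, right before &cmlimit=100. Calling .strip() removes whitespace, including that newline, from both ends of the string.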
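
To see the difference without making any network request, here is a minimal sketch. The base URL in the post is cut off, so BASE_URL below is a made-up placeholder, and build_url is just a helper name chosen for this illustration. It uses repr() so that the hidden newline becomes visible.

```python
# BASE_URL is a hypothetical placeholder; the real endpoint is not shown in the post.
BASE_URL = 'http://example.org/api?cmtitle='


def build_url(line):
    """Build the request URL for one line read from the category file."""
    # strip() drops the trailing '\n' (and any other surrounding whitespace).
    return BASE_URL + line.strip() + '&cmlimit=100'


# This is what readlines() yields for one line of the file: note the '\n'.
raw_line = 'Category:Land systems\n'

broken = BASE_URL + raw_line + '&cmlimit=100'  # newline ends up inside the URL
fixed = build_url(raw_line)                    # newline removed first

# repr() shows the '\n' explicitly instead of printing a line break.
print(repr(broken))
print(repr(fixed))
```

The first printed value contains \n just before &cmlimit=100, which is why the printed URL appeared split over two lines. The second one is a single unbroken string.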