Billjk Billjk - 3 months ago 26
Python Question

How To Capture Output of Curl from Python script

I want to find the info about a webpage using curl, but in Python, so far I have this:

os.system("curl --head www.google.com")


If I run that, it prints out:

HTTP/1.1 200 OK
Date: Sun, 15 Apr 2012 00:50:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=3e39ad65c9fa03f3:FF=0:TM=1334451013:LM=1334451013:S=IyFnmKZh0Ck4xfJ4; expires=Tue, 15-Apr-2014 00:50:13 GMT; path=/; domain=.google.com
Set-Cookie: NID=58=Giz8e5-6p4cDNmx9j9QLwCbqhRksc907LDDO6WYeeV-hRbugTLTLvyjswf6Vk1xd6FPAGi8VOPaJVXm14TBm-0Seu1_331zS6gPHfFp4u4rRkXtSR9Un0hg-smEqByZO; expires=Mon, 15-Oct-2012 00:50:13 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked


What I want to do, is be able to match the 200 in it using a regex (i don't need help with that), but, I can't find a way to convert all the text above into a string. How do I do that?
I tried:
info = os.system("curl --head www.google.com")
but
info
was just
0
.

Answer

Try this:

import httplib
conn = httplib.HTTPConnection("www.python.org")
conn.request("GET", "/index.html")
r1 = conn.getresponse()
print r1.status, r1.reason
Comments