w. wells, 4 years ago
HTTP Question

Python asyncio connection gets incomplete HTTP response

I was trying to get the content of a website with Python asyncio.

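import asyncio
import urllib.parse

@asyncio.coroutine  # Python 3.4-style generator-based coroutine
def get(url):
    url = urllib.parse.urlsplit(url)
    connect = asyncio.open_connection(url.hostname, 80)
    reader, writer = yield from connect
    request = ('HEAD {path} HTTP/1.1\r\n'
               'Host: {hostname}\r\n'
               '\r\n').format(path=url.path or '/', hostname=url.hostname)
    writer.write(request.encode('latin-1'))  # send the request
    response = yield from reader.read()      # read until the server closes
    print(response)
    writer.close()

url = 'http://www.example.com'
loop = asyncio.get_event_loop()
tasks = asyncio.ensure_future(get(url))
loop.run_until_complete(tasks)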

It gets only the header, but no content!

b'HTTP/1.1 200 OK\r\nAccept-Ranges: bytes\r\nCache-Control: max-age=604800\r\nContent-Type: text/html\r\nDate: Sat, 25 Feb 2017 11:44:26 GMT\r\nEtag: "359670651+ident"\r\nExpires: Sat, 04 Mar 2017 11:44:26 GMT\r\nLast-Modified: Fri, 09 Aug 2013 23:54:35 GMT\r\nServer: ECS (rhv/818F)\r\nX-Cache: HIT\r\nContent-Length: 1270\r\n\r\n'

Answer

As one of the comments points out, you are sending a HEAD request instead of a GET request. The response to a HEAD request carries only the status line and headers, never a body, which is why the headers are all you receive.
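You can confirm this from the response you posted. The sketch below uses only the standard library; `parse_response` is a hypothetical helper written for illustration, and `raw` is an abridged copy of your output with some headers left out. It shows that the server announces a 1270-byte body while zero body bytes arrive, which is what a HEAD response looks like.

```python
def parse_response(raw):
    """Split a raw HTTP response into (status line, headers dict, body bytes)."""
    # The header block ends at the first blank line (CRLF CRLF).
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('latin-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Abridged copy of the response from the question (some headers omitted).
raw = (b'HTTP/1.1 200 OK\r\n'
       b'Content-Type: text/html\r\n'
       b'Content-Length: 1270\r\n'
       b'\r\n')

status_line, headers, body = parse_response(raw)
print(status_line)                # HTTP/1.1 200 OK
print(headers['content-length'])  # 1270
print(len(body))                  # 0
```

With a GET request, the same check should report a non-empty body.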

I tested your code with GET instead of HEAD, and it works as you expected. As a suggestion, though, I would move to aiohttp: your entire code reduces to the snippet below, which is not only easier to read but should also be faster:

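import asyncio
import aiohttp

async def get(url):
    # A ClientSession manages the underlying connections.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            # Decode the body as latin-1 and hand it back to the caller.
            return await response.text(encoding='latin-1')

url = 'http://www.example.com'
loop = asyncio.get_event_loop()
html = loop.run_until_complete(get(url))
print(html)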

NOTE: This is Python 3.5+ async/await style, but it can easily be translated to Python 3.4 with @asyncio.coroutine and yield from. Let me know if you have any issues doing so.
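If you would rather stay with the standard library, the GET fix mentioned above might look roughly like the sketch below, written in async/await style. Two things here are my assumptions rather than part of your code: the `build_request` helper is a name I made up, and the `Connection: close` header is added so that the server closes the connection and `reader.read()` reaches EOF instead of waiting on a keep-alive connection. As in your code, only the path is sent, so query strings are ignored.

```python
import asyncio
import urllib.parse


def build_request(method, url):
    """Build a minimal HTTP/1.1 request and return the bytes to send."""
    parts = urllib.parse.urlsplit(url)
    # Include the port in the Host header only when the URL names one.
    host = parts.hostname if parts.port is None else '{}:{}'.format(parts.hostname, parts.port)
    request = (
        '{method} {path} HTTP/1.1\r\n'
        'Host: {host}\r\n'
        'Connection: close\r\n'  # assumption: ask the server to close after responding
        '\r\n'
    ).format(method=method, path=parts.path or '/', host=host)
    return request.encode('latin-1')


async def get(url):
    """Fetch url with a GET request and return the raw response bytes."""
    parts = urllib.parse.urlsplit(url)
    reader, writer = await asyncio.open_connection(parts.hostname, parts.port or 80)
    writer.write(build_request('GET', url))
    await writer.drain()
    response = await reader.read()  # reads until the server closes the connection
    writer.close()
    await writer.wait_closed()
    return response
```

To try it, call `asyncio.run(get('http://www.example.com'))` and print the result; this needs network access and Python 3.7 or newer. The returned bytes should contain the headers followed by the HTML body.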
