vog vog - 1 month ago 16
HTTP Question

How to stream POST data into Python requests?

I'm using the Python requests library to send a POST request. The part of the program that produces the POST data can write into an arbitrary file-like object (output stream).

How can I make these two parts fit together?

I would have expected requests to provide a streaming interface for this use case, but it seems it doesn't. As its data argument, it only accepts a file-like object from which it reads. It doesn't provide a file-like object into which I can write.

Is this a fundamental issue with the Python HTTP libraries?

Ideas so far:



It seems that the simplest solution is to fork() and let the requests library communicate with the POST data producer through a pipe.

Is there a better way?

Alternatively, I could try to complicate the POST data producer. However, that one parses one XML stream (from stdin) and produces a new XML stream to be used as POST data. Then I have the same problem in reverse: the XML serializer libraries want to write into a file-like object, and I'm not aware of any XML serializer that provides a file-like object from which others can read.

I'm also aware that the cleanest, classic solution to this is coroutines, which are somewhat available in Python through generators (yield). The POST data could be streamed through yield instead of a file-like object, using a pull parser.

However, is it possible to make requests accept an iterator for POST data? And is there an XML serializer that can readily be used in combination with yield?

Or, are there any wrapper objects that turn writing into a file-like object into a generator, and/or provide a file-like object that wraps an iterator?

Answer

requests does take an iterator or generator as the data argument; the details are described in Chunk-Encoded Requests. The transfer encoding needs to be chunked in this case because the data size is not known beforehand.
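As a minimal sketch of the generator route (not taken from the requests documentation): the names xml_chunks and post_stream and the <items>/<item> format are made up for illustration, and the XML is assembled by hand with xml.sax.saxutils.escape rather than by a real serializer.

```python
from xml.sax.saxutils import escape


def xml_chunks(records):
    """Yield a small XML document piece by piece as bytes (hypothetical format)."""
    yield b"<items>"
    for record in records:
        # escape() replaces &, < and > so the text is safe inside an element
        yield ("<item>%s</item>" % escape(record)).encode("utf-8")
    yield b"</items>"


def post_stream(url, chunks):
    # requests is third-party; it is imported here so that the generator
    # above can be tried without it being installed.
    import requests

    # An iterable without a known length is sent as a chunk-encoded request.
    return requests.post(url, data=chunks)
```

Calling post_stream("http://httpbin.org/post", xml_chunks(["a", "b"])) would then send the document without building it in memory first.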

Here is a very simple example that subclasses queue.Queue so that it can be used as a file-like object for writing:

import requests
import queue
import threading

class WriteableQueue(queue.Queue):

    def write(self, data):
        # Skip empty writes: a zero-length chunk would be interpreted as EOF by the receiving server
        if data:
            self.put(data)

    def __iter__(self):
        # Yield queued items until close() enqueues the None sentinel
        return iter(self.get, None)

    def close(self):
        self.put(None)

# the queue size can be limited in case producing is faster than streaming
q = WriteableQueue(100)

def post_request(iterable):
    r = requests.post("http://httpbin.org/post", data=iterable)
    print(r.text)

threading.Thread(target=post_request, args=(q,)).start()

# pass the queue to the serializer that writes to it ...    
q.write(b'1...')
q.write(b'2...')

# closing ends the request
q.close()
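
To try the queue without a network connection, the consuming side can be replaced by a plain loop over the queue. This is only a sketch: produce is a hypothetical stand-in for the serializer, the class is repeated so the snippet runs on its own, and the bound of 2 is arbitrary.

```python
import queue
import threading


class WriteableQueue(queue.Queue):
    """Same class as above, repeated so this sketch is self-contained."""

    def write(self, data):
        if data:
            self.put(data)

    def __iter__(self):
        return iter(self.get, None)

    def close(self):
        self.put(None)


def produce(out):
    # Hypothetical producer: it only needs out.write() and out.close()
    for i in range(5):
        out.write(b"<item>%d</item>" % i)
    out.write(b"")  # empty writes are dropped by write()
    out.close()


q = WriteableQueue(2)  # small bound: put() blocks while the queue is full
producer = threading.Thread(target=produce, args=(q,))
producer.start()

# Iterating consumes chunks until the None sentinel from close() arrives
received = list(q)
producer.join()
```

Because the queue is bounded, the producer pauses whenever it gets two chunks ahead of the consumer, which is the same back-pressure the request thread would apply.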