CSJ CSJ - 2 months ago 8x
Python Question

Use Django to serve a big zip file download with some data appended

I have a view snippet like the one below, which gets a zip filename from a request, and I want to append a string after the end of the zip file:

def download(request):
    ... skip
    response = HttpResponse(readFile(abs_path, sign), content_type='application/zip')
    response['Content-Length'] = os.path.getsize(abs_path) + len(sign)
    response['Content-Disposition'] = 'attachment; filename=%s' % filename
    return response

and the readFile function is as below:

def readFile(fn, sign, buf_size=1024<<5):
    f = open(fn, "rb")
    logger.debug("started reading %s" % fn)
    while True:
        c = f.read(buf_size)
        if c:
            yield c
        else:
            # An empty read means end of file; without this the loop never ends.
            break
    logger.debug("finished reading %s" % fn)
    f.close()
    yield sign

It works fine when using runserver mode, but fails on big zip files when I use uwsgi + nginx or apache + mod_wsgi.

It seems to time out because reading a big file takes too long.

I don't understand why, even though I use yield, the browser starts to download only after the whole file has been read. (I can see that the browser waits until the log line "finished reading %s" appears.)

Shouldn't it start downloading right after the first chunk is read?

Is there a better way to serve a file download where I need to append a dynamic string after the file?


Django doesn't stream responses by default: a regular HttpResponse buffers the entire response before anything is sent. If it didn't, middleware couldn't function the way it does right now.
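
To see why the "finished reading" log line shows up before the browser receives anything, here is a toy illustration with no Django involved. The helpers `chunks`, `buffered` and `streamed` are made-up names; `buffered` only mimics what a buffering response does (joining the whole iterator first), it is not Django's actual code.

```python
def chunks(log):
    """Yield three small chunks, recording when each one is produced."""
    for i in range(3):
        log.append("read chunk %d" % i)
        yield b"x"
    log.append("finished reading")


def buffered(iterable, log):
    """Mimic a buffering response: join everything before sending."""
    body = b"".join(iterable)  # the generator is exhausted right here
    log.append("send %d bytes" % len(body))


def streamed(iterable, log):
    """Mimic a streaming response: send each chunk as it is produced."""
    for chunk in iterable:
        log.append("send %d bytes" % len(chunk))


buffered_log, streamed_log = [], []
buffered(chunks(buffered_log), buffered_log)
streamed(chunks(streamed_log), streamed_log)

print(buffered_log)
print(streamed_log)
```

In the buffered case every read happens before the single send, which matches what the question observed; in the streamed case reads and sends alternate.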

To get the behaviour you are looking for you need to use the StreamingHttpResponse instead.

Usage example from the docs:

import csv

from django.http import StreamingHttpResponse

class Echo(object):
    """An object that implements just the write method of the file-like
    interface.
    """
    def write(self, value):
        """Write the value by returning it, instead of storing in a buffer."""
        return value

def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Generate a sequence of rows. The range is based on the maximum number of
    # rows that can be handled by a single sheet in most spreadsheet
    # applications.
    rows = (["Row {}".format(idx), str(idx)] for idx in range(65536))
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response
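
Applied to the zip download from the question, the same idea might look like the sketch below. It is only an outline under some assumptions: `read_file_with_suffix` is a renamed variant of the question's `readFile`, the view signature (`abs_path`, `filename` and `sign` passed as arguments) is hypothetical because the original view elides how they are obtained, and `sign` is assumed to be a byte string.

```python
import os


def read_file_with_suffix(path, suffix, buf_size=1024 << 5):
    """Yield the file at `path` in chunks, then yield `suffix` (bytes)."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break  # empty read: end of file
            yield chunk
    yield suffix


def download(request, abs_path, filename, sign):
    """Hypothetical view: stream a zip file followed by `sign`."""
    # Imported here only so the generator above can run without Django installed.
    from django.http import StreamingHttpResponse

    response = StreamingHttpResponse(
        read_file_with_suffix(abs_path, sign),
        content_type="application/zip",
    )
    # The total size is known up front, so the browser can show progress.
    response["Content-Length"] = str(os.path.getsize(abs_path) + len(sign))
    response["Content-Disposition"] = 'attachment; filename="%s"' % filename
    return response
```

Because the generator is handed to StreamingHttpResponse rather than HttpResponse, each chunk can go out as soon as it is read, so the first bytes should reach the client well before the file has been read to the end.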