nastyn8 nastyn8 - 5 months ago 189
Python Question

Send file via JSON instead of uploading to server, Django

I have an app that currently allows a user to upload a file and it saves the file on the web server. My client has now decided to use a third party cloud hosting service for their file storage needs. The company has their own API for doing CRUD operations on their server, so I wrote a script to test their API and it sends a file as a base64 encoded JSON payload to the API. The script works fine but now I'm stuck on how exactly how I should implement this functionality into Django.

json_testing.py

import base64
import json
import requests
import magic

filename = 'test.txt'

# Open file and read file and encode it as a base64 string
with open(filename, "rb") as test_file:
encoded_string = base64.b64encode(test_file.read())

# Get MIME type using magic module
mime = magic.Magic(mime=True)
mime_type = mime.from_file(filename)

# Concatenate MIME type and encoded string with string data
# Use .decode() on byte data for mime_type and encoded string
file_string = 'data:%s;base64,%s' % (mime_type.decode(), encoded_string.decode())
payload = {
"client_id": 1,
"file": file_string
}
headers = {
"token": "AuthTokenGoesHere",
"content-type": "application/json",
}
request = requests.post('https://api.website.com/api/files/', json=payload, headers=headers)
print(request.json())


models.py

def upload_location(instance, filename):
return '%s/documents/%s' % (instance.user.username, filename)

class Document(models.Model):
user = models.ForeignKey(settings.AUTH_USER_MODEL)
category = models.ForeignKey(Category, on_delete=models.CASCADE)
file = models.FileField(upload_to=upload_location)

def __str__(self):
return self.filename()

def filename(self):
return os.path.basename(self.file.name)


So to reiterate, when a user uploads a file, instead of storing the file somewhere on the web server, I want to base64 encode the file so I can send the file as a JSON payload. Any ideas on what would be the best way to approach this?

Answer

The simplest way I can put this is that I want to avoid saving the file to the web server entirely. I just want to encode the file, send it as a payload, and discard it, if that's possible.

From the django docs:

Upload Handlers

When a user uploads a file, Django passes off the file data to an upload handler – a small class that handles file data as it gets uploaded. Upload handlers are initially defined in the FILE_UPLOAD_HANDLERS setting, which defaults to:

["django.core.files.uploadhandler.MemoryFileUploadHandler", "django.core.files.uploadhandler.TemporaryFileUploadHandler"]

Together MemoryFileUploadHandler and TemporaryFileUploadHandler provide Django’s default file upload behavior of reading small files into memory and large ones onto disk.

You can write custom handlers that customize how Django handles files. You could, for example, use custom handlers to enforce user-level quotas, compress data on the fly, render progress bars, and even send data to another storage location directly without storing it locally. See Writing custom upload handlers for details on how you can customize or completely replace upload behavior.

Contrary thoughts:

I think you should consider sticking with the default file upload handlers because they keep someone from uploading a file that will overwhelm the server's memory.

Where uploaded data is stored

Before you save uploaded files, the data needs to be stored somewhere.

By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold the entire contents of the upload in memory. This means that saving the file involves only a read from memory and a write to disk and thus is very fast.

However, if an uploaded file is too large, Django will write the uploaded file to a temporary file stored in your system’s temporary directory. On a Unix-like platform this means you can expect Django to generate a file called something like /tmp/tmpzfp6I6.upload. If an upload is large enough, you can watch this file grow in size as Django streams the data onto disk.

These specifics – 2.5 megabytes; /tmp; etc. – are simply “reasonable defaults” which can be customized as described in the next section.


request.FILES info:

#forms.py:

from django import forms

class UploadFileForm(forms.Form):
    title = forms.CharField(max_length=50)
    json_file = forms.FileField()

A view handling this form will receive the file data in request.FILES, which is a dictionary containing a key for each FileField (or ImageField, or other FileField subclass) in the form. So the data from the above form would be accessible as request.FILES[‘json_file’].

Note that request.FILES will only contain data if the request method was POST and the that posted the request has the attribute enctype="multipart/form-data". Otherwise, request.FILES will be empty.


HttpRequest.FILES

A dictionary-like object containing all uploaded files. Each key in FILES is the name from the <input type="file" name="" />. Each value in FILES is an UploadedFile.


Upload Handlers

When a user uploads a file, Django passes off the file data to an upload handler – a small class that handles file data as it gets uploaded. Upload handlers are initially defined in the FILE_UPLOAD_HANDLERS setting, which defaults to:

["django.core.files.uploadhandler.MemoryFileUploadHandler", "django.core.files.uploadhandler.TemporaryFileUploadHandler"]

In the source code for TemporaryFileUploadHandler is this:

lass TemporaryFileUploadHandler(FileUploadHandler):
    """
    Upload handler that streams data into a temporary file.
    """
      ...
      ...
      def new_file(self, *args, **kwargs):
        """
        Create the file object to append to as data is coming in.
        """
        ...
        self.file = TemporaryUploadedFile(....)

And in the source code for TemporaryUploadedFile is this:

class TemporaryUploadedFile(UploadedFile):
    """
    A file uploaded to a temporary location (i.e. stream-to-disk).
    """
    def __init__(self, name, content_type, size, charset, content_type_extra=None):
        ...
        file = tempfile.NamedTemporaryFile(...)

And the python tempfile docs say this:

tempfile.NamedTemporaryFile(...., delete=True)
...
If delete is true (the default), the file is deleted as soon as it is closed.

Similarly, the other of the two default file upload handlers, MemoryFileUploadHandler, creates a file of type BytesIO:

A stream implementation using an in-memory bytes buffer. It inherits BufferedIOBase. The buffer is discarded when the close() method is called.

Therefore, all you have to do is close request.FILES[“field_name”] to erase the file (whether the file contents are stored in memory or on disk in the /tmp file directory), e.g.:

 uploaded_file = request.FILES[“json_file”]
 file_contents = uploaded_file.read()

 #Send file_contents to other server here.

 uploaded_file.close()  #erases file

If for some reason you don't want django to write to the server's /tmp file directory at all, then you'll need to write a custom file upload handler to reject uploaded files that are too large.