Python - Download file over HTTP and detect filetype automatically

I want to download a file via HTTP, but all the examples online fetch the data and then write it to a local file. The problem with this is that you have to explicitly set the filetype (the extension) of the local file.

I want to download a file, but I won't know the filetype of what I'm downloading in advance.

This is what I currently have:


But if I download, say, an XML file, it will be saved as CSV. Is there any way to get Python to detect the type of the file that I get sent from a URL like:

Say the above URL gives me an XML file; I want Python to detect that.


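You can use python-magic to detect the file type from the file's content. It can be installed via "pip install python-magic". It is a wrapper around libmagic, so that library has to be available on your system as well.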

I assume you are using Python 2.7 since you are calling urlretrieve. The example is geared to 2.7, but it is easily adapted to Python 3, where the function lives in urllib.request.

This is a working example:

import mimetypes  # Maps a MIME type to a file extension
import os         # For renaming your file
import urllib     # Your library (Python 2.7; use urllib.request on Python 3)

import magic      # Detects the type from magic numbers in the content; mimetypes alone only looks at the file name

mime = magic.Magic(mime=True)
output = "output"  # Your file name without extension
urllib.urlretrieve("", output)  # This is just an example url
mime_type = mime.from_file(output)  # Get the MIME type of the downloaded file
extensions = mimetypes.guess_all_extensions(mime_type)  # All known extensions for that type
ext = extensions[0] if extensions else ""  # Keep the bare name if no extension is known
os.rename(output, output + ext)  # Rename file
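
If installing python-magic is not an option, a rougher alternative is to trust the Content-Type header the server sends with the response. The following is only a sketch for Python 3 using the standard library; it is a different technique from the one above, since it relies on what the server claims rather than on the file's content, and the header can be missing, generic, or wrong. The function names extension_for and download_with_extension are made up for this illustration.

```python
import mimetypes
import os
import urllib.request


def extension_for(content_type):
    """Return a file extension for a Content-Type value, or "" if unknown."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    return mimetypes.guess_extension(mime_type) or ""


def download_with_extension(url, basename):
    """Download url to basename, then rename it based on the Content-Type header."""
    # urlretrieve returns the local path and the response headers.
    path, headers = urllib.request.urlretrieve(url, basename)
    ext = extension_for(headers.get("Content-Type", ""))
    final_path = path + ext
    # os.replace overwrites an existing file with the same name.
    os.replace(path, final_path)
    return final_path
```

Calling download_with_extension(url, "output") with your own URL would leave a file named "output" plus whatever extension the header maps to, or just "output" if the type is not recognised. When the header is unreliable, inspecting the content with python-magic as shown above is the safer choice.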