tk. tk. - 2 months ago 7x
Python Question

How can I find out the uploaded file name in Python CGI?

I made a simple web server like the one below.

import BaseHTTPServer, os, cgi
import cgitb; cgitb.enable()

html = """
<html>
<body>
<form action="" method="POST" enctype="multipart/form-data">
File upload: <input type="file" name="upfile">
<input type="submit" value="upload">
</form>
</body>
</html>
"""

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        # serve the upload form
        self.send_response(200)
        self.send_header("content-type", "text/html;charset=utf-8")
        self.end_headers()
        self.wfile.write(html)

    def do_POST(self):
        ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
        if ctype == 'multipart/form-data':
            query = cgi.parse_multipart(self.rfile, pdict)
            upfilecontent = query.get('upfile')
            if upfilecontent:
                # i don't know how to get the file name.. so i named it 'tmp.dat'
                fout = open(os.path.join('tmp', 'tmp.dat'), 'wb')
                fout.write(upfilecontent[0])
                fout.close()
        self.do_GET()

if __name__ == '__main__':
    server = BaseHTTPServer.HTTPServer(("", 8080), Handler)
    print('web server on 8080..')
    server.serve_forever()

In the do_POST method of BaseHTTPRequestHandler, I get the uploaded file data successfully.

But I can't figure out how to get the original name of the uploaded file; the name of self.rfile is just a 'socket'.
How can I get the uploaded file name?


Pretty broken code you're using there as a starting point (e.g. look at that global rootnode where name rootnode is used nowhere -- clearly half-edited source, and badly at that).

Anyway, what form are you using "client-side" for the POST? How does it set that upfile field?

Why aren't you using the normal FieldStorage approach, as documented in Python's docs? That way, you could use the .file attribute of the appropriate field to get a file-like object to read, or its .value attribute to read it all in memory and get it as a string, plus the .filename attribute of the field to know the uploaded file's name. More detailed, though concise, docs on FieldStorage, are here.
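To make it clearer where a field's .filename comes from: in a multipart/form-data body, every part carries a Content-Disposition header, and for a file input that header includes a filename parameter. Below is a minimal sketch of reading it, assuming Python 3 and the standard email package (the cgi module was removed in Python 3.13, so it is not used here). The uploaded_files helper, the boundary string and the sample body are all made up for illustration, and this is not a hardened upload parser.

```python
import email


def uploaded_files(content_type, body):
    """Return {field name: (filename, payload bytes)} for the file parts.

    content_type is the request's Content-Type header value (it carries
    the boundary); body is the raw request body as bytes.
    """
    # The email parser expects the header in front of the multipart body.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = email.message_from_bytes(prefix + body)
    files = {}
    if message.is_multipart():
        for part in message.get_payload():
            filename = part.get_filename()  # read from Content-Disposition
            if filename is not None:
                field = part.get_param("name", header="content-disposition")
                files[field] = (filename, part.get_payload(decode=True))
    return files


# Hypothetical body, shaped like what a browser sends for the form above.
BODY = (
    b"--demo-boundary\r\n"
    b'Content-Disposition: form-data; name="upfile"; filename="notes.txt"\r\n'
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--demo-boundary--\r\n"
)
FILES = uploaded_files("multipart/form-data; boundary=demo-boundary", BODY)
```

Here FILES["upfile"][0] is the client-supplied name. It is whatever the browser chose to send, so treat it as untrusted input before using it in a path.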

Edit: now that the OP has edited the question to clarify, I see the problem: BaseHTTPServer does not set the environment according to the CGI spec, so the cgi module isn't very usable with it. Unfortunately, the only simple approach to setting the environment is to steal and hack a big piece of code from CGIHTTPServer.py (it wasn't intended for reuse, whence the need for, sigh, copy-and-paste coding), e.g.:

import os, urllib

def populenv(self):
        path = self.path
        dir, rest = '.', 'ciao'

        # find an explicit query string, if present.
        i = rest.rfind('?')
        if i >= 0:
            rest, query = rest[:i], rest[i+1:]
        else:
            query = ''

        # dissect the part after the directory name into a script name &
        # a possible additional path, to be stored in PATH_INFO.
        i = rest.find('/')
        if i >= 0:
            script, rest = rest[:i], rest[i:]
        else:
            script, rest = rest, ''

        # Reference: the CGI specification's list of environment variables.
        # XXX Much of the following could be prepared ahead of time!
        env = {}
        env['SERVER_SOFTWARE'] = self.version_string()
        env['SERVER_NAME'] = self.server.server_name
        env['GATEWAY_INTERFACE'] = 'CGI/1.1'
        env['SERVER_PROTOCOL'] = self.protocol_version
        env['SERVER_PORT'] = str(self.server.server_port)
        env['REQUEST_METHOD'] = self.command
        uqrest = urllib.unquote(rest)
        env['PATH_INFO'] = uqrest
        env['SCRIPT_NAME'] = 'ciao'
        if query:
            env['QUERY_STRING'] = query
        host = self.address_string()
        if host != self.client_address[0]:
            env['REMOTE_HOST'] = host
        env['REMOTE_ADDR'] = self.client_address[0]
        authorization = self.headers.getheader("authorization")
        if authorization:
            authorization = authorization.split()
            if len(authorization) == 2:
                import base64, binascii
                env['AUTH_TYPE'] = authorization[0]
                if authorization[0].lower() == "basic":
                    try:
                        authorization = base64.decodestring(authorization[1])
                    except binascii.Error:
                        pass
                    else:
                        authorization = authorization.split(':')
                        if len(authorization) == 2:
                            env['REMOTE_USER'] = authorization[0]
        if self.headers.typeheader is None:
            env['CONTENT_TYPE'] = self.headers.type
        else:
            env['CONTENT_TYPE'] = self.headers.typeheader
        length = self.headers.getheader('content-length')
        if length:
            env['CONTENT_LENGTH'] = length
        referer = self.headers.getheader('referer')
        if referer:
            env['HTTP_REFERER'] = referer
        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line[:1] in "\t\n\r ":
                # continuation line of a folded Accept header
                accept.append(line.strip())
            else:
                accept = accept + line[7:].split(',')
        env['HTTP_ACCEPT'] = ','.join(accept)
        ua = self.headers.getheader('user-agent')
        if ua:
            env['HTTP_USER_AGENT'] = ua
        co = filter(None, self.headers.getheaders('cookie'))
        if co:
            env['HTTP_COOKIE'] = ', '.join(co)
        # XXX Other HTTP_* headers
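        # Since we're setting the env in the parent, provide empty
        # values to override previously set values
        for k in ('QUERY_STRING', 'REMOTE_HOST', 'CONTENT_LENGTH',
                  'HTTP_USER_AGENT', 'HTTP_COOKIE', 'HTTP_REFERER'):
            env.setdefault(k, "")
        # export the variables so that cgi.FieldStorage can read them
        os.environ.update(env)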

This could be substantially simplified further, but not without spending some time and energy on that task:-(.
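As a rough idea of how far such a simplification might go: to parse a POST body, FieldStorage mainly looks at REQUEST_METHOD, CONTENT_TYPE and CONTENT_LENGTH, and it accepts an environ argument, so a small dict could in principle stand in for the full environment. The helper below is only a sketch under that assumption. Its name and signature are hypothetical, it ignores every other CGI variable, and it expects a header mapping whose keys are spelled exactly 'Content-Type' and 'Content-Length'.

```python
def minimal_cgi_environ(method, headers):
    """Build only the CGI variables needed to parse a request body.

    headers is any mapping of header name to value. This helper is
    hypothetical and not part of any library.
    """
    env = {'REQUEST_METHOD': method.upper()}
    for header, variable in (('Content-Type', 'CONTENT_TYPE'),
                             ('Content-Length', 'CONTENT_LENGTH')):
        value = headers.get(header)
        if value:
            # the CGI spec passes these values along as strings
            env[variable] = str(value)
    return env


# Made-up header values, just to show the shape of the result.
EXAMPLE = minimal_cgi_environ('post', {
    'Content-Type': 'multipart/form-data; boundary=demo-boundary',
    'Content-Length': 1234,
})
```

With a dict like EXAMPLE, the handler could pass environ=EXAMPLE to cgi.FieldStorage instead of updating os.environ, which also avoids leaking one request's values into the next.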

With this populenv function at hand, we can recode:

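def do_POST(self):
    populenv(self)  # set up the CGI variables that FieldStorage reads
    form = cgi.FieldStorage(fp=self.rfile)
    upfilecontent = form['upfile'].value
    if upfilecontent:
        # drop any directory part from the client-supplied name
        name = os.path.basename(form['upfile'].filename)
        with open(os.path.join('tmp', name), 'wb') as fout:
            fout.write(upfilecontent)
    self.do_GET()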

...and live happily ever after;-). (Of course, using any decent WSGI server, or even the demo one, would be much easier, but this exercise is instructive about CGI and its internals;-).
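For comparison, here is a minimal sketch of the WSGI route, assuming "the demo one" means wsgiref.simple_server from the standard library and assuming Python 3. The point it illustrates is that a WSGI server hands the application an already populated environ (CONTENT_TYPE, CONTENT_LENGTH, wsgi.input), so nothing like populenv is needed. The app only reports what it received and does not parse the multipart body; the names upload_info_app and serve are made up.

```python
import io
from wsgiref.simple_server import make_server


def upload_info_app(environ, start_response):
    """Report the size and content type of the request body."""
    length = int(environ.get('CONTENT_LENGTH') or 0)
    body = environ['wsgi.input'].read(length)
    ctype = environ.get('CONTENT_TYPE', '')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    message = 'received %d bytes of %s' % (len(body), ctype)
    return [message.encode('utf-8')]


def serve(port=8080):
    """Run the app on wsgiref's demo server (blocks until interrupted)."""
    make_server('', port, upload_info_app).serve_forever()


# Calling the app directly with a hand-built environ (values are made up).
STATUSES = []
ENVIRON = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_TYPE': 'text/plain',
    'CONTENT_LENGTH': '5',
    'wsgi.input': io.BytesIO(b'hello'),
}
RESULT = upload_info_app(ENVIRON, lambda status, headers: STATUSES.append(status))
```

Calling serve() starts the server on port 8080; the direct call at the bottom just shows that the app is an ordinary function that can be exercised without a socket.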