Carl Dirzuweit - 1 month ago
HTML Question

How do I post this HTML form using Python?

This is the form that I want to post in Python:

<FORM METHOD="POST"
ACTION="http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"
ENCTYPE="multipart/form-data">
<INPUT NAME="formtype" TYPE="HIDDEN" value="simple">

<p><b>Upload a sentence corpus file</b>:<br>
<INPUT NAME="corpus" TYPE="FILE" SIZE=60 VALUE="empty">
</p>

<INPUT TYPE="submit" VALUE="COMPILE KNOWLEDGE BASE">
</form>


I tried it with requests:

import requests

url = "http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"
#url = "http://httpbin.org/post"
files = {'file': open('testfile', 'rb')}
payload = {'NAME': 'fromtype', 'TYPE': 'HIDDEN', 'value': 'simple',
'NAME': 'corpus', 'TYPE': 'FILE', 'SIZE': '60', 'VALUE': 'empty'}
r = requests.post(url, data=payload)

print r.text


And this is the response:

Running: /home/darthtoker/programs/post.py (Fri Oct 7 15:19:21 2016)

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>LMTool Error</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<pre>Something went wrong. This is all I know: formtype
</pre>
</body>
</html>Content-Type: text/html; charset=ISO-8859-1

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>LMTool Error</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<pre>Something went wrong. This is all I know: corpus
</pre>
</body>
</html>Status: 302 Found
Location: http://www.speech.cs.cmu.edu/tools/product/1475867962_31194


I need to automate this form for a project I'm working on. Please help!

Answer

There might be other errors, but the error message you get is clear enough:

What the form expects:

<INPUT NAME="formtype" ...>

What you send:

'NAME': 'fromtype', ...

You see the difference between formtype and fromtype?
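The typo is not the only problem with that dictionary. As a rough illustration (using the standard library's urlencode as a stand-in for the form encoding that Requests applies to a dict passed as data=), the sketch below shows what the payload from the question actually contains. The keys 'NAME' and 'TYPE' each appear twice, and a Python dict keeps only the last value for a repeated key, so neither a formtype field nor a corpus field is ever sent. That matches the two error messages in the response.

```python
from urllib.parse import urlencode

# The dict literal from the question. 'NAME' and 'TYPE' are repeated,
# so the later values silently replace the earlier ones.
payload = {'NAME': 'fromtype', 'TYPE': 'HIDDEN', 'value': 'simple',
           'NAME': 'corpus', 'TYPE': 'FILE', 'SIZE': '60', 'VALUE': 'empty'}

print(payload)

# Approximately the form-encoded body that would be posted.
encoded = urlencode(payload)
print(encoded)

# The server looks for fields named 'formtype' and 'corpus'; neither is a key.
print('formtype' in payload, 'corpus' in payload)
```

The HTML attribute names (NAME, TYPE, VALUE) describe the form; they are not themselves the fields. The field name is the value of the NAME attribute, and the field value is what the form would submit for it.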


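According to the Requests documentation (sections "More complicated POST requests" and "POST a Multipart-Encoded File"), the dictionary keys must be the form's field names and the file must be passed through the files argument, so your code should be: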

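import requests

url = "http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"
#url = "http://httpbin.org/post"  # echoes the request back, handy for debugging

# Keys are the NAME attributes of the form's INPUT elements.
payload = {'formtype': 'simple'}

# Open the corpus in binary mode and close it once the request is done.
with open('testfile', 'rb') as corpus:
    files = {'corpus': corpus}
    r = requests.post(url, data=payload, files=files)

print(r.text)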
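To see what a multipart/form-data request for this form carries, here is a minimal sketch that builds the body by hand with only the standard library. It is not how Requests does it internally; encode_multipart is a hypothetical helper written for this illustration, and the file content is placeholder bytes rather than a real corpus.

```python
import io
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body (hypothetical helper).

    fields: dict mapping field name -> text value
    files:  dict mapping field name -> (filename, bytes content)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex
    delimiter = ("--%s\r\n" % boundary).encode("ascii")
    buf = io.BytesIO()

    # Plain fields, such as the hidden formtype input.
    for name, value in fields.items():
        buf.write(delimiter)
        header = 'Content-Disposition: form-data; name="%s"\r\n\r\n' % name
        buf.write(header.encode("utf-8"))
        buf.write(value.encode("utf-8") + b"\r\n")

    # File fields, such as the corpus upload.
    for name, (filename, content) in files.items():
        buf.write(delimiter)
        header = ('Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
                  % (name, filename))
        buf.write(header.encode("utf-8"))
        buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
        buf.write(content + b"\r\n")

    # Closing delimiter.
    buf.write(("--%s--\r\n" % boundary).encode("ascii"))
    return "multipart/form-data; boundary=%s" % boundary, buf.getvalue()


content_type, body = encode_multipart(
    {"formtype": "simple"},
    {"corpus": ("testfile", b"hello world\n")},  # placeholder content
)
print(content_type)
print(body.decode("utf-8"))
```

The printed body has one part named formtype and one part named corpus, which are the two names the server complained about. The pair could then be sent with urllib.request.Request(url, data=body, headers={"Content-Type": content_type}), although passing data= and files= to requests.post is the simpler route.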