Abirdcfly Abirdcfly - 1 month ago 23
HTTP Question

How to prevent Python requests from percent-encoding URLs that contain a semicolon?

I want to post some data to a URL like

http://www.google.com/;id=aaa


I use the following code:

import requests

# my_headers and my_data are defined elsewhere
url = 'http://www.google.com/;id=aaa'
r = requests.post(url, headers=my_headers, data=my_data, timeout=10)


Unfortunately, I find that requests just cuts my URI to http://www.google.com/ without any warning...

Is there some way to pass the parameters in their original form, without percent-encoding?

I tried config={'encode_uri': False}, but that option has been removed, and urllib.unquote didn't help either.

Thanks!

Answer

RFC 2616, section 3.2.2 specifies the syntax of an HTTP URL as:

http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

It also says:

If the abs_path is not present in the URL, it MUST be given as "/" when used as a Request-URI for a resource

In this URL:

http://www.google.com;id=aaa

there is no / after the host, so there is no abs_path, and there is no :, so there is no port. That means www.google.com;id=aaa is the hostname.

Semicolons are not allowed in hostnames (see this answer for what is allowed in a hostname), so this URL is invalid.

This would be a valid URL, if id=aaa should be part of the path:

http://www.google.com/;id=aaa

This would also be valid, if id=aaa should be part of the query:

http://www.google.com/?;id=aaa
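The three URLs above can be compared with a small sketch using Python 3's standard urllib.parse module. This is only an illustration: urllib.parse follows the generic URL syntax rather than RFC 2616 specifically, and it splits a URL without validating the hostname, so it will not reject the first URL. The variable names are made up for the example.

```python
from urllib.parse import urlsplit, urlparse

# No "/" after the host: the semicolon ends up inside the host component.
no_path = urlsplit("http://www.google.com;id=aaa")
print(no_path.hostname, repr(no_path.path))

# With "/", the semicolon belongs to the path.
with_path = urlsplit("http://www.google.com/;id=aaa")
print(with_path.hostname, with_path.path)

# urlparse additionally splits ";params" off the last path segment.
parsed = urlparse("http://www.google.com/;id=aaa")
print(parsed.path, parsed.params)

# With "?", the semicolon belongs to the query.
in_query = urlsplit("http://www.google.com/?;id=aaa")
print(in_query.path, in_query.query)
```

Comparing the hostname, path and query fields shows which component id=aaa lands in for each spelling of the URL.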

EDIT

The question has been modified to ask about http://www.google.com/;id=aaa instead.

That URL is valid, and as far as I was able to test it, requests handles it without any problems.
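One way to check this without sending anything over the network is to build a prepared request and look at the URL that requests would use. This is a minimal sketch, assuming a reasonably recent version of requests; the payload is a placeholder, and older releases may behave differently.

```python
import requests

url = 'http://www.google.com/;id=aaa'

# Build the request but do not send it; prepare() applies the same URL
# processing that requests.post() would apply before sending.
# The data dict is a placeholder payload for illustration only.
prepared = requests.Request('POST', url, data={'key': 'value'}).prepare()

# Inspect the final URL to see whether the ";id=aaa" part was kept.
print(prepared.url)
```

If the printed URL still contains ;id=aaa, the truncation is happening somewhere other than in the URL preparation step.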
