Embed_Programmer Embed_Programmer - 8 months ago 65
HTTP Question

HTTP Proxy Server keep-alive connection support

I am currently working on a multi threaded proxy server that supports keep-alive connections. I see some weird issues while handling requests from firefox browser. I connect to my local proxy using localhost:10001/http://url, and I can access all the links on this host. The process is as below.
1. Create a socket bind it to port 10001
2. Accept connections and if a client is connected fork()
3. Keep on processing the client request as persistent connection.

Now the problem is that when I open a new tab in firefox to access a second url with different host with using localhost:10001/http://url2, the strange thing is that that request goes to my client socket connection created during first connection. I initially thought that it might be due to my code, but then i tried to do the same using telnet and all the new connections would create a separate process. Are there any specific settings that is making firefox browser do this??

Answer Source

HTTP keep-alive is a way to reuse an underlying TCP connection for multiple requests so that one can skip the overhead of creating a new TCP connection all the time. Since the target of the connection is the same all the time in your case it makes sense for the browser to reuse the same TCP connection. The comparison with telnet is flawed since with telnet you do a new TCP connection all the time.

If HTTP keep-alive gets used is specified by the HTTP version the Connection header and on the behavior of both server and client. Both server and client can decide to close the idle connection any time after a request was done, i.e. they are not required to keep it open after the request is done. Additionally they can signal that they like to have the connection open by using the Connection: keep-alive HTTP header or that they like to close after the request with Connection: close. These headers have default values depending on the HTTP version, i.e. keep-alive is on with HTTP/1.1 while off with HTTP/1.0 unless explicitly specified.

Apart from that the "proxy" you are implementing with the use of URL's like http://proxy/real-url is not a real HTTP proxy. A real HTTP proxy would be configured as a proxy inside the browser and the URL's you use would stay the same which also means that no URL rewriting would need to be done by the proxy. Worse is that your idea of a proxy effectively merges all hosts inside the same origin (i.e. origin is the proxy) and thus effectively disables a major security concept of the browser: the same-origin policy. This means for example that some rogue advertisement server would share with your implementation the origin with ebay and thus could get access to the ebay cookies and hijack the session and misuse it for identity theft