wwarren wwarren - 5 months ago 80
Java Question

Websockets on Tomcat 8 + IIS 8 with ARR 3 are not working

I've scoured the internet trying to find anyone who might be experiencing this issue but come up empty handed. So here goes:

We have a java web application (based on Spring MVC 4). It sits behind Microsoft IIS acting as a load balancer / reverse proxy using Application Request Routing (ARR) v3.

This IIS is performing load balancing with ARR for 3 different environments (all running the same Java code):

dev.example.com
,
demo.example.com
and
qa.example.com
.

The application serves notifications to users' browsers using WebSockets via SockJS and stompjs and this has all been working well while the application servers were on Tomcat 7. After upgrading the
qa.example.com
environment to Tomcat 8, the WebSocket connections stopped working - it falls back to XHR POST requests.

I want to stress that no changes were made to IIS, just the
qa
application server.

Here is a sample request/response from the
dev
environment (working):


Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: no-cache
Connection: Upgrade
Cookie: <cookies snipped>
Host: dev.example.com
Origin: https://dev.example.com
Pragma: no-cache
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
Sec-WebSocket-Key: E7aIek0X6qcO9PAl1n6w4Q==
Sec-WebSocket-Version: 13
Upgrade: websocket
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36


Response

Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Connection: Upgrade
Date: Thu, 22 Oct 2015 02:19:35 GMT
Expires: 0
Pragma: no-cache
Sec-WebSocket-Accept: dKYK05s4eP87iA20aSo/3ntOrPU=
Server: Microsoft-IIS/8.0
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
Upgrade: Websocket
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Powered-By: ARR/3.0
X-XSS-Protection: 1; mode=block


Here is a sample request/response from the
qa
environment (broken):


Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: no-cache
Connection: Upgrade
Cookie: <cookies snipped>
Host: qa.example.com
Origin: https://qa.example.com
Pragma: no-cache
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
Sec-WebSocket-Key: jTOIAT0+o35+Qi0ZWh2gyQ==
Sec-WebSocket-Version: 13
Upgrade: websocket
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36


Response:

Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Connection: Upgrade
Date: Thu, 22 Oct 2015 02:18:30 GMT
Expires: 0
Pragma: no-cache
Sec-WebSocket-Accept: P+fEH8pvxcu3sEoO5fDizjSbwJc=
Sec-WebSocket-Extensions: permessage-deflate;client_max_window_bits=15
Server: Microsoft-IIS/8.0
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
Upgrade: Websocket
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Powered-By: ARR/3.0
X-XSS-Protection: 1; mode=block


The only obvious difference is that the
qa
response includes a
Sec-WebSocket-Extensions: permessage-deflate;client_max_window_bits=15
header while the
dev
response does not.

I turned on "Failed Request Tracing" on IIS to debug the
101
response and I can see that there are some headers that get overwritten by IIS - the
Sec-WebSocket-Accept
header namely.

IIS also shows that that request is creating a
502.5
error. I looked that up and found this: https://support.microsoft.com/en-us/kb/943891 which says that
502.5
is "WebSocket failure (ARR)" and that's all it says. Weirdly enough though, Chrome Dev Tools shows that it responds with a 101 just like it's supposed to...

I tried this all with a local application server (Tomcat 8 with no IIS) and the websockets worked just fine. Tomcat 7 + IIS + ARR + WebSockets works just fine. Tomcat 8 + IIS + ARR + WebSockets does not.

My exact version of Tomcat 8 is 8.0.28 - but I got the same results on Tomcat 8.0.26.

My next step it to keep downgrading Tomcat 8 through minor versions and see if anything changes. I will update here if I discover anything.

Update



Here's a response from my local server (no IIS):

Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Connection: upgrade
Date: Thu, 22 Oct 2015 13:59:23 GMT
Expires: 0
Pragma: no-cache
Sec-WebSocket-Accept: 718HnPxHN8crYYzNGFjQf7w8O+Y=
Sec-WebSocket-Extensions: permessage-deflate;client_max_window_bits=15
Server: Apache-Coyote/1.1
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
Upgrade: websocket
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block


It looks a lot like the broken
qa
request, but it works great. So I guess the
Sec-WebSocket-Extensions
thing was a red herring. Also
Upgrade: websocket
and
Connection: upgrade
is lower case on my local server, whereas it is
Websocket
and
Upgrade
when you put IIS in front.

Sec-WebSocket-Extensions
also has a trailing space in
qa
after the
permessage-deflate;
but the local does not.

Update 2



It all works fine on the
qa
environment in Microsoft Edge (Windows 10) I haven't tried Internet Explorer 11, but I have to assume it probably also works. Firefox and Chrome on OSX do not work.

Update 3



Request from Tomcat before it gets modified by IIS/ARR:

HTTP/1.1 101 Switching Protocols
Server: Apache-Coyote/1.1
Upgrade: websocket
Connection: upgrade
Sec-WebSocket-Accept: luP49lroNK9qTdaNNnSCLXnxAWc=
Sec-WebSocket-Extensions: permessage-deflate;client_max_window_bits=15
Date: Tue, 27 Oct 2015 21:10:48 GMT

Answer

I have discovered the solution, although it is not as satisfying as I wish it was.

In our project's pom.xml we had spring-core:4.2.5 but spring-websocket and spring-messaging were 4.1.6. The version mismatch was causing some issues clearly.

Setting -Dorg.apache.tomcat.websocket.DISABLE_BUILTIN_EXTENSIONS=true in the Tomcat startup options when the versions were mismatched had no effect. Setting that JVM option when the versions were the same worked as expected.

The 101 response now does not contain permessage-deflate and websockets are able to connect with no issues through IIS. Our application does not send a lot of data through the sockets so we were OK making this tradeoff.