ruby life questions - 7 months ago
Ruby Question

Ruby http, net/http, httpclient: can't parse www.victoriassecret.com

I am using the httpclient gem. It works fine on Windows, but I just moved to AWS EC2, tried it on https://victoriassecret.com, and got this response:

= Response

HTTP/1.1 920 Unknown
Content-Type: text/html
Date: Wed, 21 Oct 2015 21:42:51 GMT
Connection: Keep-Alive
Content-Length: 23

<h1>File not found</h1>
#<HTTP::Message:0x000000023f5168
@http_body=
#<HTTP::Message::Body:0x000000023f50a0
@body="<h1>File not found</h1>",
@chunk_size=nil,
@positions=nil,
@size=0>,
@http_header=
#<HTTP::Message::Headers:0x000000023f5140
@body_charset=nil,
@body_date=nil,
@body_encoding=#<Encoding:ASCII-8BIT>,
@body_size=0,
@body_type=nil,
@chunked=false,
@dumped=false,
@header_item=
[["Content-Type", "text/html"],
["Date", "Wed, 21 Oct 2015 21:42:51 GMT"],
["Connection", "Keep-Alive"],
["Content-Length", "23"]],
@http_version="1.1",
@is_request=false,
@reason_phrase="Unknown",
@request_absolute_uri=nil,
@request_method="GET",
@request_query=nil,
@request_uri=
#<URI::HTTPS:0x000000023f58c0 URL:https://www.victoriassecret.com/pink/new-and-now>,
@status_code=920>,
@peer_cert=
#<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name:0x000000024ebe00>, issuer=#<OpenSSL::X509::Name:0x000000024ebec8>, serial=#<OpenSSL::BN:0x000000024de110>, not_before=2015-05-27 00:00:00 UTC, not_after=2017-05-26 23:59:59 UTC>,
@previous=nil>


It fails only with this website; httpclient get https://google.com, for example, works fine. On Windows I get a normal response from httpclient get https://www.victoriassecret.com. But when using the standard Net::HTTP library, I get the same 920 response on Windows too.

Answer

This isn't EC2-related. It's most likely related to the User-Agent header sent by the various HTTP library implementations.

For example, they clearly don't like 'wget':

curl -A "Wget/1.13.4 (linux-gnu)"  -v https://www.victoriassecret.com
* Rebuilt URL to: https://www.victoriassecret.com/
*   Trying 98.158.54.100...
* Connected to www.victoriassecret.com (98.158.54.100) port 443 (#0)
* TLS 1.2 # truncated
> GET / HTTP/1.1
> Host: www.victoriassecret.com
> User-Agent: Wget/1.13.4 (linux-gnu)
> Accept: */*
>
< HTTP/1.1 910 Unknown
< Content-Type: text/html
< Date: Thu, 22 Oct 2015 01:16:31 GMT
< Connection: Keep-Alive
< Content-Length: 23
<
* Connection #0 to host www.victoriassecret.com left intact
<h1>File not found</h1>%
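
If the User-Agent really is the cause, sending a browser-like value from Ruby may get a normal response. Below is a minimal sketch using the standard Net::HTTP library. The BROWSER_UA string and the build_get and fetch helper names are assumptions made up for illustration, and nothing above confirms which User-Agent values the site actually accepts.

```ruby
require 'net/http'
require 'uri'

# Hypothetical browser-like User-Agent; an assumption, not a value known to be accepted.
BROWSER_UA = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ' \
             '(KHTML, like Gecko) Chrome/46.0 Safari/537.36'

# Build a GET request with an explicit User-Agent header.
# No network access happens here; Net::HTTP's own default header is overridden.
def build_get(url, user_agent = BROWSER_UA)
  request = Net::HTTP::Get.new(URI.parse(url))
  request['User-Agent'] = user_agent
  request
end

# Send the request (over TLS for https URLs) and return the Net::HTTPResponse.
def fetch(url, user_agent = BROWSER_UA)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_get(url, user_agent))
  end
end

# Example usage (performs a real request, so it is left commented out):
# response = fetch('https://www.victoriassecret.com/')
# puts response.code
```

Comparing the status code returned for different User-Agent strings should show whether the header is what triggers the 9xx response. With the httpclient gem, the agent_name option of HTTPClient.new should serve the same purpose, though that is worth checking against the gem's documentation.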