Mrk Fldig - 1 month ago
C# Question

FiddlerCore decoding an sdch response

I'm getting an odd response from a site that I want to parse with FiddlerCore. If I inspect the response in Chrome Developer Tools it looks completely normal; in Fiddler it doesn't. The code snippet is as follows (it used to work fine):

String html = oSession.GetResponseBodyAsString();


This returns the following, which isn't HTML. Note that this is a short sample rather than the full, very long string.

JRHwJNeR\0���\0\0\u0001��D\0�2�\b\0�\u0016�7]<!DOCTYPE html>\n win\">


It's also littered with "\n" sequences and escaped HTML like this:

\n\n\n\n\n \n <meta name=\"treeID\" content=\"dwedxE+pgRQAWIHiFSsAAA==\">\n


Response headers are as follows:

Cache-Control:no-cache, no-store
Connection:keep-alive
Content-Encoding:sdch, gzip
Content-Language:en-US
Content-Type:text/html;charset=UTF-8
Date:Fri, 28 Oct 2016 10:17:02 GMT
Expires:Thu, 01 Jan 1970 00:00:00 GMT
Pragma:no-cache
Server:Apache-Coyote/1.1
Set-Cookie:lidc="b=VB87:g=518:u=60:i=1477649823:t=1477731496:s=AQG-LTdly5mcIjAtiRHIOrKE1TiRWW-l"; Expires=Sat, 29 Oct 2016 08:58:16 GMT; domain=.thedomain.com; Path=/
Set-Cookie:_lipt=deleteMe; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
Strict-Transport-Security:max-age=0
Transfer-Encoding:chunked
Vary:Accept-Encoding, Avail-Dictionary
X-Content-Type-Options:nosniff
X-Frame-Options:sameorigin
X-FS-UUID:882b3366afaa811400a04937a92b0000
X-Li-Fabric:prod-lva1
X-Li-Pop:prod-tln1-scalable
X-LI-UUID:iCszZq+qgRQAoEk3qSsAAA==
X-XSS-Protection:1; mode=block


Fiddler startup code:

Fiddler.FiddlerApplication.AfterSessionComplete += FiddlerApplication_OnAfterSessionComplete;
// Remove chunking and any compression Fiddler understands before the body is read.
Fiddler.FiddlerApplication.BeforeResponse += delegate (Fiddler.Session oS)
{
    oS.utilDecodeResponse();
};

Fiddler.FiddlerApplication.Startup(0, FiddlerCoreStartupFlags.Default);


Initially I'd assumed it was chunked/gzipped, so I added utilDecodeResponse() to the BeforeResponse handler, which had no effect!

Just to cover all the bases, I also tried manually decoding responseBodyBytes as UTF-8, Unicode, big-endian Unicode, etc., on the off chance the response content type was incorrect. I also disabled JavaScript and reloaded the page to prove it wasn't some funky templating thing. Neither made any difference.

Any ideas?

UPDATE:

In line with the information provided by Developer & NineBerry below the solution is as follows:

To prevent the response from being SDCH encoded, you can add a handler like so:

Fiddler.FiddlerApplication.BeforeRequest += delegate (Fiddler.Session oS)
{
    // Advertise only encodings other than sdch, so the server has no reason to use it.
    oS.oRequest["Accept-Encoding"] = "gzip, deflate, br";
};


Note that this isn't suitable for every case: it overwrites the header outright rather than checking whether sdch is present and removing only that token. For my purposes this works fine, but if you use Fiddler as a general-purpose proxy you'd want more logic here.
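The "more logic" mentioned above could look something like the following. This is only a minimal sketch: the class name AcceptEncodingFilter and the method StripSdch are hypothetical helpers, not part of FiddlerCore, and the wiring shown in the trailing comment is untested. The helper removes just the sdch token from whatever the browser sent and leaves the other encodings alone.

```csharp
using System;
using System.Linq;

public static class AcceptEncodingFilter
{
    // Hypothetical helper: returns the Accept-Encoding value with any "sdch"
    // token removed, leaving the other codings (and their q-values) untouched.
    public static string StripSdch(string acceptEncoding)
    {
        if (string.IsNullOrWhiteSpace(acceptEncoding))
            return acceptEncoding;

        var kept = acceptEncoding
            .Split(',')
            .Select(token => token.Trim())
            .Where(token => token.Length > 0)
            // Compare only the coding name, ignoring parameters such as ";q=0.5".
            .Where(token => !token.Split(';')[0].Trim()
                .Equals("sdch", StringComparison.OrdinalIgnoreCase));

        return string.Join(", ", kept);
    }
}

// Possible wiring in FiddlerCore (untested sketch):
// Fiddler.FiddlerApplication.BeforeRequest += delegate (Fiddler.Session oS)
// {
//     if (oS.oRequest.headers.Exists("Accept-Encoding"))
//         oS.oRequest["Accept-Encoding"] =
//             AcceptEncodingFilter.StripSdch(oS.oRequest["Accept-Encoding"]);
//     // Assumption: the client advertises cached dictionaries via Avail-Dictionary.
//     oS.oRequest.headers.Remove("Avail-Dictionary");
// };
```

With this approach a request that never mentioned sdch passes through unchanged, which is friendlier to other traffic going through the proxy than a hard-coded header value.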

Answer

The Content-Encoding header lists sdch (Shared Dictionary Compression over HTTP), so manually decoding responseBodyBytes as UTF-8, Unicode, big-endian Unicode, etc. will not work in this case: the body is still compressed, not merely text in the wrong character set.
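To confirm that a response is affected, a small check along these lines may help. It is a sketch under assumptions: SdchProbe and its methods are hypothetical names, and TryReadDictionaryId assumes that an SDCH body starts with an 8-character dictionary identifier followed by a NUL byte, which matches the "JRHwJNeR\0" prefix in the question's sample but is not verified here against the proposal.

```csharp
using System;
using System.Linq;
using System.Text;

public static class SdchProbe
{
    // True when the Content-Encoding value (e.g. "sdch, gzip") lists sdch.
    public static bool ListsSdch(string contentEncoding)
    {
        if (string.IsNullOrEmpty(contentEncoding))
            return false;

        return contentEncoding
            .Split(',')
            .Any(token => token.Trim().Equals("sdch", StringComparison.OrdinalIgnoreCase));
    }

    // Assumption: an SDCH body begins with an 8-character dictionary
    // identifier and then a NUL byte, as in the sample "JRHwJNeR\0...".
    public static bool TryReadDictionaryId(byte[] body, out string id)
    {
        id = null;
        if (body == null || body.Length < 9 || body[8] != 0)
            return false;

        id = Encoding.ASCII.GetString(body, 0, 8);
        return true;
    }
}
```

In a FiddlerCore handler, the header value would come from oS.oResponse["Content-Encoding"] and the bytes from oS.responseBodyBytes.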

You can find more details about SDCH here: SDCH Ref 1 & SDCH Ref 2

Excerpts from the above site:

Shared Dictionary Compression is a content-encoding method which was proposed back in 2008 by Google, and is implemented in Chrome and supported by a number of Google servers. The full proposal can be obtained here: https://lists.w3.org/Archives/Public/ietf-http-wg/2008JulSep/att-0441/Shared_Dictionary_Compression_over_HTTP.pdf. Rather than replicate the contents of the document in this blog post, I’ll try and summarise as concisely as possible:
The whole idea of the protocol is to reduce redundancy across HTTP connections. The amount of ‘common data’ across HTTP responses is obviously significant - for example you will often see a website use common header/footers across a number of HTML pages. If the client were to store this common data locally in a ‘dictionary’, the server would only need to instruct the client how to reconstruct the page using that dictionary.
