Extracting the chunked transfer encoding sizes from https traffic

  • Question

  • I am working on a research project to check the integrity of HTTPS traffic. As part of this project, I need to know the original size of the HTML content when it was sent. In HTTP/1.0, this is easily available via the Content-Length header. However, HTTP/1.1 allows chunked transfer encoding, and when the server uses it, the HTTP headers do not contain a Content-Length. The chunk sizes are specified in the chunks themselves, and unfortunately this information is not available via a passthrough APP or a BHO. Can someone please advise on the best way to approach this problem? Basically, I want to be able to extract the chunk sizes somehow.
    Tuesday, November 13, 2012 2:57 AM

Answers

  • Cary Ng wrote:

    I am working on a research project to check the integrity of HTTPS traffic. As part of this project, I need to know the original size
    of the HTML content when it was sent. In HTTP/1.0, this is easily available via the Content-Length header.

    HTTP/1.0 also allows the server to omit Content-Length and simply close the connection to indicate the end of the response. Before the invention of chunked encoding, that's what servers that generated responses on the fly would do.
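
    To make the three framing cases concrete (Content-Length, chunked, and close-delimited), here is a rough Python sketch that classifies a response from its raw header block. The function name `body_framing` is made up for illustration, and the logic is a simplification: it ignores bodiless responses such as replies to HEAD requests or 204/304 status codes.

```python
def body_framing(header_block: bytes) -> str:
    """Classify how the body length of an HTTP response is signalled.

    header_block is the raw status line plus headers, up to and
    including the blank line. Simplified: bodiless responses
    (HEAD, 204, 304) are not handled.
    """
    headers = {}
    for line in header_block.split(b"\r\n")[1:]:  # skip the status line
        if not line:
            break  # blank line ends the header block
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()

    transfer_encoding = headers.get(b"transfer-encoding", b"")
    if transfer_encoding:
        # Chunked framing applies only if "chunked" is the final coding.
        last = transfer_encoding.split(b",")[-1].strip()
        return "chunked" if last == b"chunked" else "close-delimited"
    if b"content-length" in headers:
        return "content-length"
    # Neither header present: the body ends when the server closes.
    return "close-delimited"


print(body_framing(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"))
# prints close-delimited
```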

    However, HTTP/1.1 allows chunked transfer encoding, and when the server uses it, the HTTP headers do not
    contain a Content-Length. The chunk sizes are specified in the chunks themselves, and unfortunately this information is not
    available via a passthrough APP or a BHO. Can someone please advise on the best way to approach this problem?

    You could write your own HTTP client. I'm not aware of any existing HTTP tool or library that would give you the raw response without decoding the chunked encoding first. You'll have to go down to the socket level.

    Alternatively, you could set up something like Burp Proxy (http://www.portswigger.net/burp/). It can capture HTTPS traffic by acting as a man-in-the-middle.
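
    As a rough illustration of the socket-level route, here is a minimal Python sketch; it is not taken from the answer above. The helper names `read_chunk_sizes` and `fetch_chunk_sizes` are made up for this example. It assumes a plain GET over TLS, asks for an uncompressed body, and does not handle 1xx interim responses, redirects, or trailers. Because the client issues its own request, and the server chooses chunk boundaries per response, the sizes it sees are not guaranteed to match what the browser received for the same URL.

```python
import io
import socket
import ssl


def read_chunk_sizes(stream):
    """Read a chunked body from a binary file-like object and
    return the list of chunk sizes (excluding the final 0 chunk)."""
    sizes = []
    while True:
        line = stream.readline()
        if not line:
            raise ValueError("stream ended before the zero-size chunk")
        # Chunk header: "<hex size>[;extensions]\r\n"
        size = int(line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # last chunk; trailers (if any) are ignored here
        sizes.append(size)
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        stream.readline()  # consume the CRLF that follows chunk data
    return sizes


def fetch_chunk_sizes(host, path="/", port=443):
    """Send one GET over TLS and return the chunk sizes,
    or None if the response is not chunked."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = (
                f"GET {path} HTTP/1.1\r\n"
                f"Host: {host}\r\n"
                "Accept-Encoding: identity\r\n"
                "Connection: close\r\n\r\n"
            )
            tls_sock.sendall(request.encode("ascii"))
            with tls_sock.makefile("rb") as stream:
                stream.readline()  # status line (not inspected here)
                headers = {}
                while True:
                    line = stream.readline()
                    if line in (b"\r\n", b"\n", b""):
                        break  # end of headers
                    name, _, value = line.partition(b":")
                    headers[name.strip().lower()] = value.strip().lower()
                if b"chunked" not in headers.get(b"transfer-encoding", b""):
                    return None
                return read_chunk_sizes(stream)


# Offline check of the parser on a hand-written chunked body.
sample = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
print(read_chunk_sizes(sample))  # prints [4, 5]
```

    Calling `fetch_chunk_sizes("example.com")` would exercise the network path; the host name there is only a placeholder.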


    Igor Tandetnik

    • Marked as answer by Cary Ng Wednesday, November 14, 2012 2:04 PM
    Tuesday, November 13, 2012 6:38 AM