Who’s not getting gzip?

November 11, 2009 10:46 pm | 15 Comments

The article Use compression to make the web faster from the Google Code Blog contains some interesting information on why modern browsers that support compression don’t get compressed responses in daily usage. The culprit?

anti-virus software, browser bugs, web proxies, and misconfigured web servers.  The first three modify the web request so that the web server does not know that the browser can uncompress content. Specifically, they remove or mangle the Accept-Encoding header that is normally sent with every request.

This is hard to believe, but it’s true. Tony Gentilcore covers the full story in the chapter he wrote called “Going Beyond Gzipping” in my most recent book, including some strategies for correcting and working around the problem. (Check out Tony’s slides from Velocity 2009.) According to Tony:

a large web site in the United States should expect roughly 15% of visitors don’t indicate gzip compression support.
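
The mechanics are simple: servers decide whether to compress by inspecting the Accept-Encoding request header, so when something in the path removes or mangles that header, the server has no signal and falls back to uncompressed responses. Here is a minimal sketch of that decision point (plain Python in a WSGI-style handler, purely illustrative – not any particular server’s implementation):

    import gzip

    def maybe_gzip(environ, body):
        # WSGI exposes the request header as HTTP_ACCEPT_ENCODING. If a proxy
        # or anti-virus product stripped or mangled it, this string is empty
        # (or unrecognizable) and the server sends the body uncompressed.
        accept_encoding = environ.get("HTTP_ACCEPT_ENCODING", "")
        if "gzip" in accept_encoding.lower():
            return gzip.compress(body), [("Content-Encoding", "gzip")]
        return body, []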

This blog post from Arvind Jain and Jason Glasgow contains additional information, including:

  • Users suffering from this problem experience a Google Search page that is 25% slower – 1600ms for compressed content versus 2000ms for uncompressed.
  • Google Search was able to force the content to be compressed (even though the browser didn’t request it), and improved page load times by 300ms. (A rough sketch of this idea follows the list.)
  • Internet Explorer 6 downgrades to HTTP/1.0 and drops the Accept-Encoding request header when behind a proxy. For Google Search, 36% of the search results sent without compression were for IE6.
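
One way to act on the second bullet above is to confirm with a small client-side test that the browser really can inflate gzip (for example by loading a tiny gzipped script), remember the result in a cookie, and then compress responses even when Accept-Encoding is missing. A rough server-side sketch of that idea (the gzip_ok cookie name is hypothetical, and this is not Google’s or the book’s exact implementation):

    import gzip

    def choose_encoding(headers, cookies, body):
        # Normal case: the request advertises gzip support.
        advertises_gzip = "gzip" in headers.get("Accept-Encoding", "").lower()
        # Fallback: a previous client-side test proved this browser can inflate
        # gzip even though something stripped the header en route.
        # ("gzip_ok" is a hypothetical cookie name used only in this sketch.)
        proven_gzip = cookies.get("gzip_ok") == "1"
        if advertises_gzip or proven_gzip:
            return gzip.compress(body), "gzip"
        return body, None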

Is there something about your browser, proxy, or anti-virus software that’s preventing you from getting compressed content and slowing you down 25%? Test it out by visiting the browser compression test page.

15 Responses to Who’s not getting gzip?

  1. Yeah, ~10% of BBC visitors don’t support gzip compression. It was higher during the day (15-20%) but lower in the evenings and weekends (<10%). Pretty much puts the blame in the direction of corporate proxies.

  2. Is there any reason not to enable mod_deflate or similar modules on servers? People I work with say they won’t enable those modules because “they had problems with several clients” or with “mobile or proxy users”. Is that still a problem nowadays? Nonworking, garbled sites due to enabling those modules are obviously not acceptable (even when it’s only 0.2% of the visitors).

    I’m thinking about cases where:
    – browsers accept gzipped content, but don’t work properly when they receive it
    – proxy servers (or similar) fiddling with headers so that browsers get compressed content when they only understand uncompressed content
    – mobile browsers send certain headers, but won’t work at all with gzipped content
    – and so on

    Are there ANY hints as to why enabling mod_deflate and the like would be counter-productive and would not work for certain clients? (I’m talking about non-working sites – not sites that are uncompressed but still working.)

    Thanks in advance.

  3. @sparetimeadmin: Years ago this was an issue, but not so much now. You could use a whitelist approach. See Chapter 4 of High Performance Web Sites (page 34).
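
    If it helps, the logic boils down to something like the sketch below (plain Python for illustration, not the actual server config, and the whitelist entries are just hypothetical examples you’d tune against your own traffic): only compress when the request advertises gzip and the User-Agent is one you’ve verified handles it.

        import re

        # Hypothetical whitelist entries, for illustration only.
        GZIP_WHITELIST = [
            re.compile(r"Firefox/"),
            re.compile(r"Chrome/"),
            re.compile(r"Safari/"),
        ]

        def should_gzip(accept_encoding, user_agent):
            # Compress only when the request advertises gzip AND the browser
            # is one we have verified handles gzipped responses correctly.
            if "gzip" not in (accept_encoding or "").lower():
                return False
            return any(p.search(user_agent or "") for p in GZIP_WHITELIST)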

  4. Thanks for the fast reply. Your books have been on my list for a long time already. It’s about time, I guess. :)

  5. I just made a post on how to gzip your site, thought I would share. http://kennedysgarage.com/compress-components-with-gzip/

  6. Hi Steve,

    In your book you suggest creating a gzip whitelist of browsers, and setting Cache-Control: private to prevent proxies from caching and then delivering the wrong version of zipped/unzipped pages to browsers.

    However, I have come across an article that suggests that when using SSL, Firefox will not use persistent caching unless Cache-Control: public is set. See #3 on http://blog.httpwatch.com/2009/01/15/https-performance-tuning/

    This would require a different setup for SSL and non-SSL pages.
    It would be great to hear your opinion on this.

    Thanks
    Melvin

  7. Oops, I’ve now read to the bottom of that httpwatch post and see that you’ve commented on #3 about Cache-Control: public!

  8. Just a heads up: that browser compression test page you linked to? It’s lying. It’s showing no Accept-Encoding header for me in every browser I tested. I figured our corporate firewall was misconfigured, so I wrote my own little detection script and sure enough, Chrome is sending “gzip,deflate,sdch” and Firefox 4 beta “gzip, deflate”.

  9. @Dean: The point of that test is to tell users (like yourself) if their browser is suffering because compression is disabled. Compression is disabled in browsers when users are behind proxies or have anti-virus software that masks the Accept-Encoding header. It sounds like that’s exactly what you’ve discovered – your corporate firewall has disabled compression, so servers are sending you uncompressed responses, resulting in a slower experience.

  10. @Steve: We tried to gzip the files on our website (WordPress powered), but as we are using shared hosting, the hosting provider doesn’t support gzipping files. This leaves our website with more than 500KB of extra download. Would you be kind enough to explain what guidelines one should follow for gzip on shared hosting?

  11. @Ezhil: Your best bet is to search the support docs for your webhosting provider – or switch providers!

  12. Good article Steve.

    I wanted to let you know that the browser connection test link has a bad URL – the query at the end is no longer valid; it should now be http://www.browserscope.org/network/test

  13. Fixed, thanks Elijah.

  14. Hi,

    I have a question regarding gzip.
    Suppose the server adds the “Content-Encoding: gzip” header, but the actual content is not gzipped (say the compression failed and the server sent the data uncompressed). What will be the impact of this?
    How will HTTP clients (especially browsers) handle this?

    Thanks,
    Narendra

  15. @Narendra: It’s an interesting question but I don’t think we should spend too much time answering it. The scenario you describe is a violation of the spec so there is no “correct” behavior and the results will therefore differ across browsers as well as over time. Regardless, if you’d like an answer I recommend you create your own Browserscope user test and gather results. If so, send me the URL and I’ll tweet it.
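
    If you want to poke at it locally before setting up a Browserscope test, a minimal sketch like the one below (Python’s built-in http.server; the port and markup are arbitrary) serves an uncompressed body while claiming Content-Encoding: gzip, so you can point different browsers at it and watch what each one does.

        # Deliberately send a body that is NOT gzipped while claiming it is.
        from http.server import BaseHTTPRequestHandler, HTTPServer

        class MismatchHandler(BaseHTTPRequestHandler):
            def do_GET(self):
                body = b"<html><body>not actually gzipped</body></html>"
                self.send_response(200)
                self.send_header("Content-Type", "text/html")
                self.send_header("Content-Encoding", "gzip")  # deliberately wrong
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)

        HTTPServer(("", 8000), MismatchHandler).serve_forever()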