The article “Use compression to make the web faster” from the Google Code Blog contains some interesting information on why modern browsers that support compression don’t get compressed responses in daily usage. The culprits?
anti-virus software, browser bugs, web proxies, and misconfigured web servers. The first three modify the web request so that the web server does not know that the browser can uncompress content. Specifically, they remove or mangle the Accept-Encoding header that is normally sent with every request.
This is hard to believe, but it’s true. Tony Gentilcore covers the full story in the chapter he wrote called “Going Beyond Gzipping” in my most recent book, including some strategies for correcting and working around the problem. (Check out Tony’s slides from Velocity 2009.) According to Tony:
a large web site in the United States should expect roughly 15% of visitors don’t indicate gzip compression support.
This blog post from Arvind Jain and Jason Glasgow contains additional information, including:
- Users suffering from this problem experience a Google Search page that is 25% slower – 1600ms for compressed content versus 2000ms for uncompressed.
- Google Search was able to force the content to be compressed (even though the browser didn’t request it), and improved page load times by 300ms.
- Internet Explorer 6 downgrades to HTTP/1.0 and drops the Accept-Encoding request header when behind a proxy. For Google Search, 36% of the search results sent without compression were for IE6.
Is there something about your browser, proxy, or anti-virus software that’s preventing you from getting compressed content and slowing you down 25%? Test it out by visiting the browser compression test page.
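The server-side decision described above depends entirely on the Accept-Encoding request header. Here is a minimal Python sketch of that check, assuming the request headers arrive as a plain dictionary. The function name `accepts_gzip` is hypothetical, and the sketch deliberately ignores the `*` wildcard coding to stay short.

```python
def accepts_gzip(headers):
    """Return True if the request headers advertise gzip support.

    `headers` is assumed to be a dict of header name -> value.
    """
    value = None
    for name, v in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "accept-encoding":
            value = v
            break
    if value is None:
        # Header was removed (or its name mangled) somewhere along the way.
        return False

    for item in value.split(","):
        parts = item.strip().split(";")
        coding = parts[0].strip().lower()
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            key, _, val = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # treat an unparsable q value as "not acceptable"
        if coding in ("gzip", "x-gzip") and q > 0:
            return True
    return False
```

For example, `accepts_gzip({"Accept-Encoding": "gzip, deflate"})` returns `True`, while a request whose header name has been altered (say, to an illustrative `Accept-EncodXng`) or dropped entirely returns `False`, so the server falls back to an uncompressed response.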
Jake Archibald | 12-Nov-09 at 12:49 am | Permalink
Yeah, ~10% of BBC visitors don’t support gzip compression. It was higher during the day (15-20%) but lower in the evenings and weekends (<10%). Pretty much puts the blame in the direction of corporate proxies.
sparetimeadmin | 17-Nov-09 at 10:44 am | Permalink
Is there anything that speaks against enabling mod_deflate or similar in servers? People I work with say they won’t enable those modules because “they had problems with several clients” or with “mobile or proxy users”. Is that still a problem nowadays? Nonworking, garbled sites due to enabling those modules are obviously not acceptable (even when it’s like 0.2% of the visitors).
I’m thinking about cases where:
- browsers accept gzipped content, but don’t work properly when they receive it
- proxy servers (or similar) fiddling with headers so that browsers get compressed content when they only understand uncompressed content
- mobile browsers that send certain headers but won’t work at all with gzipped content
- etc.
Are there ANY hints as to why enabling mod_deflate and the like would be counter-productive and would not work for certain clients? (I’m talking about non-working sites, not sites that are uncompressed but still working.)
Thanks in advance.
Steve Souders | 18-Nov-09 at 9:47 am | Permalink
@sparetimeadmin: Years ago this was an issue, but not so much now. You could use a whitelist approach. See Chapter 4 of High Performance Web Sites (page 34).
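One way to picture the whitelist approach mentioned above is the Python sketch below. It is only an illustration: the patterns in `GZIP_WHITELIST` are placeholder assumptions, not the list from the book, and `should_gzip` is a hypothetical helper name.

```python
import re

# Placeholder patterns (assumptions for illustration only): user agents
# that the site operator has decided to trust with gzipped responses.
GZIP_WHITELIST = [
    re.compile(r"MSIE [7-9]"),
    re.compile(r"Firefox/[3-9]"),
]


def should_gzip(user_agent, accept_encoding):
    """Compress only if the client asks for gzip AND is on the whitelist."""
    if "gzip" not in (accept_encoding or "").lower():
        return False
    return any(p.search(user_agent or "") for p in GZIP_WHITELIST)
```

With these placeholder patterns, a request from `MSIE 7.0` that sends `Accept-Encoding: gzip, deflate` would be compressed, while one from `MSIE 6.0` with the same header would not. Clients that are not on the list still get a working, uncompressed page.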
sparetimeadmin | 18-Nov-09 at 2:53 pm | Permalink
Thanks for the fast reply. Your books have been on my list for a long time already. It’s about time, I guess. :)
Kennedy | 21-Nov-09 at 10:36 am | Permalink
I just made a post on how to gzip your site, thought I would share. http://kennedysgarage.com/compress-components-with-gzip/
Melvin | 11-Feb-10 at 7:25 am | Permalink
Hi Steve,
In your book you suggest creating a gzip whitelist of browsers, and setting Cache-Control: private to prevent proxies from caching and then delivering the wrong version of zipped/unzipped pages to browsers.
However, I have come across an article that suggests that when using SSL, Firefox will not use persistent caching unless Cache-Control: public is set. See #3 on http://blog.httpwatch.com/2009/01/15/https-performance-tuning/
This would require a different setup for SSL and non-SSL pages.
It would be great to hear your opinion on this.
Thanks
Melvin
Melvin | 11-Feb-10 at 7:28 am | Permalink
Oops, I’ve now read to the bottom of that httpwatch post and see that you’ve commented on #3 about Cache-Control: public!