Cache compressed? or uncompressed?

March 27, 2012 4:05 pm | 10 Comments

My previous blog post, Cache them if you can, suggests that current cache sizes are too small – especially on mobile.

Given this concern about cache size, a relevant question is:

If a response is compressed, does the browser save it compressed or uncompressed?

Compression typically reduces the size of text responses by about 70%. This means that a browser can cache roughly 3x as many compressed responses if they’re saved in their compressed format.
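As a quick check on that arithmetic, here is a minimal Python sketch. The function name `capacity_multiplier` is made up for illustration, and it assumes every response shrinks by the same fraction, which real traffic won't do exactly.

```python
def capacity_multiplier(savings: float) -> float:
    """Return how many times more responses fit in a fixed-size cache
    when each response shrinks by `savings` (a fraction in [0, 1))."""
    if not 0 <= savings < 1:
        raise ValueError("savings must be in [0, 1)")
    return 1 / (1 - savings)


# A 70% reduction leaves each response at 30% of its original size.
print(f"{capacity_multiplier(0.70):.1f}x")  # 3.3x
```

So "3x" is a slightly conservative rounding of about 3.3x.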

Note that not all responses are compressed. Images make up the largest number of resources but shouldn’t be gzipped, because image formats such as JPEG and PNG are already compressed. On the other hand, HTML documents, scripts, and stylesheets should be compressed, and they account for 30% of all requests. Being able to save 3x as many of these responses to cache could have a significant impact on cache hit rates.

It’s difficult and time-consuming to determine whether compressed responses are saved in compressed format. I created this Caching Gzip Test page to help determine browser behavior. It has two 200 KB scripts: one is compressed down to ~148 KB and the other is served uncompressed. (Note that the file contains random strings, so the compression savings are only about 25%, compared to the typical 70%.) After clearing the cache and loading the test page, if the total cache disk size increases by ~348 KB, the browser saves compressed responses as compressed. If it increases by ~400 KB, compressed responses are saved uncompressed.
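The decision rule above, and the reason random strings compress so poorly, can be sketched in Python. This is an illustration, not the code behind the test page. The alphabet (letters and digits), the 20 KB tolerance, and the function names are all assumptions.

```python
import gzip
import random
import string


def random_script(size_bytes: int, seed: int = 0) -> bytes:
    """Build a payload of random letters and digits (assumed alphabet)."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choices(alphabet, k=size_bytes)).encode("ascii")


def classify_cache_growth(growth_kb: float, compressed_kb: float,
                          uncompressed_kb: float,
                          tolerance_kb: float = 20) -> str:
    """Interpret how much the cache grew after loading both scripts."""
    if_kept_compressed = compressed_kb + uncompressed_kb  # e.g. 148 + 200
    if_decompressed = 2 * uncompressed_kb                 # e.g. 200 + 200
    if abs(growth_kb - if_kept_compressed) <= tolerance_kb:
        return "compressed"
    if abs(growth_kb - if_decompressed) <= tolerance_kb:
        return "uncompressed"
    return "inconclusive"


body = random_script(200 * 1024)
ratio = len(gzip.compress(body)) / len(body)
print(f"gzip keeps {ratio:.0%} of the random payload")
print(classify_cache_growth(348, 148, 200))  # compressed
print(classify_cache_growth(400, 148, 200))  # uncompressed
```

Random text has almost no repeated substrings, so gzip can do little more than pack each character into fewer bits. That is why the savings land near 25% rather than 70%.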

The challenging part of this experiment is finding where the cache is stored and measuring the response sizes. Firefox, Chrome, and Opera save responses as files and were easy to measure. For IE on Windows I wasn’t able to access the individual cache files (admin permissions?) but was able to measure the sizes based on the properties of the Temporary Internet Files folder. Safari saves all responses in Cache.db. I was able to see the incremental increase by modifying the experiment to be two pages: the compressed response and the uncompressed response. You can see the cache file locations and full details in the Caching Gzip Test Results page.
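For browsers that store responses as individual files, the before-and-after measurement can be scripted. The sketch below is a hypothetical helper, not the method used for these results. It takes the cache directory as a parameter because the location varies by browser and OS, and it sums file lengths rather than allocated disk blocks, so the totals can differ slightly from what a file manager reports.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under `path`, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The browser may delete or lock files while we walk.
                pass
    return total


def growth_kb(before_bytes: int, after_bytes: int) -> float:
    """Cache growth between two measurements, in KB."""
    return (after_bytes - before_bytes) / 1024
```

Usage would be: clear the cache, call `dir_size_bytes` on the cache folder, load the test page, call it again, and pass both numbers to `growth_kb`. This does not help with Safari’s single Cache.db file beyond measuring the file as a whole.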

Here are the results for top desktop browsers:

Browser         Compressed responses cached compressed?   Max cache size
Chrome 17       yes                                       320 MB*
Firefox 11      yes                                       850 MB*
IE 8            no                                        50 MB
IE 9            no                                        250 MB
Safari 5.1.2    no                                        unknown
Opera 11        yes                                       20 MB

* Chrome and Firefox cache size is a percentage of available disk space. Chrome is capped at 320 MB. I don’t know what Firefox’s cap is; on my laptop with 50 GB free the cache size is 830 MB.

We see that Chrome 17, Firefox 11, and Opera 11 store compressed responses in compressed format, while IE 8 and 9 and Safari 5 save them uncompressed. IE 8 and 9 also have smaller caches than Chrome and Firefox, so the fact that they uncompress responses before caching further reduces the number of responses that can be cached.

What’s the best choice? It’s possible that reading cached responses is faster if they’re already uncompressed. That would be a good next step to explore. I wouldn’t prejudge IE’s choice when it comes to performance on Windows. But it’s clear that saving compressed responses in compressed format increases the number of responses that can be cached, and this increases cache hit rates. What’s even clearer is that browsers don’t agree on the best answer. Should they?
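One way to start exploring the read-speed question is a small timing harness like the Python sketch below. It is only a rough stand-in for a browser cache: the function names are made up, and because the files were just written they will almost certainly come from the OS page cache, so it measures warm reads. A fair comparison would need cold reads from disk and many repetitions.

```python
import gzip
import os
import tempfile
import time


def time_read(path: str, decompress: bool):
    """Read a file (optionally gunzipping it) and return (data, seconds)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    if decompress:
        data = gzip.decompress(data)
    return data, time.perf_counter() - start


def compare_cached_reads(body: bytes):
    """Store `body` raw and gzipped, then time reading each back."""
    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(tmp, "script.js")
        gz_path = os.path.join(tmp, "script.js.gz")
        with open(raw_path, "wb") as f:
            f.write(body)
        with open(gz_path, "wb") as f:
            f.write(gzip.compress(body))
        raw, raw_seconds = time_read(raw_path, decompress=False)
        unzipped, gz_seconds = time_read(gz_path, decompress=True)
    if raw != unzipped:
        raise RuntimeError("round trip through gzip changed the data")
    return raw_seconds, gz_seconds


raw_s, gz_s = compare_cached_reads(b"var x = 1;\n" * 20000)
print(f"uncompressed read: {raw_s * 1000:.3f} ms")
print(f"compressed read + gunzip: {gz_s * 1000:.3f} ms")
```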

 

10 Responses to Cache compressed? or uncompressed?

  1. “It’s possible that reading cached responses is faster if they’re already uncompressed.”

    It shouldn’t be, though. I think the same logic as compressed file systems applies here: the most expensive operation (reading from disk) completes faster, and uncompressing is so cheap that it should still be faster than reading the uncompressed files.

    Would be cool to test though!

  2. Reading a compressed file from disk means fewer bytes and so less of an I/O hit; however, there is a CPU hit from uncompressing it. I guess IE/Safari might have taken this route to lessen the CPU hit, at the expense of files possibly having a shorter life span in the cache.

    I was able to confirm your finding in IE8 with http://www.amazon.com on both JavaScript and CSS. You might not be able to see it on Vista or Win7 if Protected Mode is enabled. You can try disabling it on the Security tab (or try it on WinXP).

  3. IE had a number of problems with caching and compression in the IE6 days. Specifically, it would cache content compressed, but then read it expecting it to be uncompressed. I imagine this legacy is a reason for the IE behavior you are seeing.

    As to which is faster, that is kind of tricky to figure out. My first thought is that disk I/O is slower than CPU, so reading 400K will be slower than reading 200K and decompressing, especially given how underutilized our multi-core CPUs are. Of course, most of the slowness of disk I/O is seeking to the data, so reading 400K isn’t really that much slower than reading 200K if the data is contiguous.

    On the other hand, your point about small cache sizes means the browser’s cache is likely memory mapped or somehow cached in RAM already, so it is fast.

  4. For more information on the IE6 compression/caching issues, check out the blog post we wrote “Lose the Wait: HTTP Compression”, specifically the “IE6 and Netscape 4 are screwing you. Again” section.

    http://zoompf.com/blog/2012/02/lose-the-wait-http-compression

  5. If compressed in cache is better than uncompressed in cache, it might even make sense to compress uncompressed responses when putting them in cache.

  6. I don’t know about the format of the cached copy on a phone but I can comment on the decompression costs.

    For phone apps which must decompress content in code, the perf differences running on minimum spec and good spec phones can be quite staggering.

    To the extent that the decision whether to request a resource gzip-ed (which is a no brainer on a good spec phone – and not even a question on a browser) can actually become quite an interesting one.

    (Request to usable response can be substantively *slower* on a good connection i.e. one where ttfb to ttlb is at the small end.)

    And, of course, because the phone is thrashing zlib, this has a knock on effect on other processing.

  7. Opera’s max cache size is 400 MB, not 20: http://i40.tinypic.com/2co05ft.gif

  8. @Billy: Thanks for the info about IE’s past issues with compressed resources.

    @zcorpan: That’s a great idea. It seems like there must be a best answer of which is faster, and that should be applied to all resources.

    @Slavash: I did a clean install of Opera on my Macbook Air (with 50GB available space) and it came up with “20 MB” as the default size. I searched but couldn’t find any documentation about the default size. I think we need a reference. If anyone has done a clean install please comment on what your default size is.

  9. Just a minor note: compressing the text responses to 30% doesn’t necessarily allow you to cache 3x as many of them. Much of the extra space in the cache is likely to be taken by more images being cached. This may or may not improve performance. But the point is that browsers, to my knowledge, do not maintain a separate caching quota for text responses versus images.

  10. To me it looks like Firefox has the best policy if we assume that decompressing cached resources is faster than retrieving them again due to network and server overhead.

    Firefox’s dynamic cache size makes a lot of sense, assuming that cache disk access does not slow down markedly when the cache grows beyond a certain size. That would be another test worth doing!