SERIOUS CONFUSION with Resource Timing
or “Duration includes Blocking”
Resource Timing is a great way to measure how quickly resources download. Unfortunately, almost everyone I’ve spoken with does this using the “duration” attribute without being aware that “duration” includes blocking time. As a result, “duration” values are (much) greater than the actual download time, giving developers unexpected results. The issue is especially bad for cross-origin resources, where “duration” is the only metric available. In this post I describe the problem and a proposed solution.
Resource Timing review
The Resource Timing specification defines APIs for gathering timing metrics for each resource in a web page. It’s currently available in Chrome, Chrome for Android, IE 10-11, and Opera. You can gather a list of PerformanceEntry objects using getEntries(), getEntriesByType(), and getEntriesByName(). A PerformanceEntry has these properties:
- name – the URL
- entryType – typically “resource”
- startTime – time that the resource started getting processed (in milliseconds relative to page navigation)
- duration – total time to process the resource (in milliseconds)
The properties above are available for all resources – both same-origin and cross-origin. However, same-origin resources have additional properties available as defined by the PerformanceResourceTiming interface. They’re self-explanatory and occur pretty much in this chronological order:
- redirectStart
- redirectEnd
- fetchStart
- domainLookupStart
- domainLookupEnd
- connectStart
- connectEnd
- secureConnectionStart
- requestStart
- responseStart
- responseEnd
Here’s the canonical processing model graphic that shows the different phases. Note that “duration” is equal to (responseEnd – startTime).
Take a look at my post Resource Timing Practical Tips for more information on how to use Resource Timing.
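As a quick illustration, here’s a minimal sketch that reads the entries with getEntriesByType() and logs each resource’s startTime and duration (the console output is just for illustration):

var resources = performance.getEntriesByType("resource");
for (var i = 0; i < resources.length; i++) {
    var r = resources[i];
    // name is the resource URL; startTime and duration are in milliseconds
    console.log(r.name + ": startTime=" + Math.round(r.startTime) +
        "ms, duration=" + Math.round(r.duration) + "ms");
}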
Unexpected blocking bloat in “duration”
The detailed PerformanceResourceTiming properties are restricted to same-origin resources for privacy reasons. (Note that any resource can be made “same-origin” by using the Timing-Allow-Origin response header.) About half of the resources on today’s websites are cross-origin, so “duration” is the only way to measure their load time. And even for same-origin resources, “duration” is the only delta provided, presumably because it’s the most important phase to measure. As a result, all of the Resource Timing implementations I’ve seen use “duration” as the primary performance metric.
Unfortunately, “duration” is more than download time. It also includes “blocking time” – the delay between when the browser realizes it needs to download a resource to the time that it actually starts downloading the resource. Blocking can occur in several situations. The most typical is when there are more resources than TCP connections. Most browsers only open 6 TCP connections per hostname, the exceptions being IE10 (8 connections), and IE11 (12 connections).
This Resource Timing blocking test page has 16 images, so some images incur blocking time no matter which browser is used. Each of the images is programmed on the server to have a 1 second delay. The “startTime” and “duration” are displayed for each of the 16 images. Here are WebPagetest results for this test page being loaded in Chrome, IE10, and IE11. You can look at the screenshots to read the timing results. Note how “startTime” is approximately the same for all images. That’s because this is the time that the browser parsed the IMG tag and realized it needed to download the resource. But the “duration” values increase in steps of ~1 second for the images that occur later in the page. This is because they are blocked from downloading by the earlier images.
In Chrome, for example, the images are downloaded in three sets – because Chrome only downloads 6 resources at a time. The first six images have a “duration” of ~1.3 seconds (the 1 second backend delay plus some time for establishing the TCP connections and downloading the response body). The next six images have a “duration” of ~2.5 seconds. The last four images have a “duration” of ~3.7 seconds. The second set is blocked for ~1 second waiting for the first set to finish. The third set is blocked for ~2 seconds waiting for sets 1 & 2 to finish.
Even though the “duration” values increase from 1 to 2 to 3 seconds, the actual download time for all images is ~1 second as shown by the WebPagetest waterfall chart.
The results are similar for IE10 and IE11. IE10 starts with six parallel TCP connections but then ramps up to eight connections. IE11 also starts with six parallel TCP connections but then ramps up to twelve. Both IE10 and IE11 exhibit the same problem – even though the load time for every image is ~1 second, “duration” shows values ranging from 1-3 seconds.
proposal: “networkDuration”
Clearly, “duration” is not an accurate way to measure resource load times because it may include blocking time. (It also includes redirect time, but that occurs much less frequently.) Unfortunately, “duration” is the only metric available for cross-origin resources. Therefore, I’ve submitted a proposal to the W3C Web Performance mailing list to add “networkDuration” to Resource Timing. This would be available for both same-origin and cross-origin resources. (I’m flexible about the name; other candidates include “networkTime”, “loadTime”, etc.)
The calculation for “networkDuration” is as follows. (Assume “r” is a PerformanceResourceTiming object.)
dns = r.domainLookupEnd - r.domainLookupStart;
tcp = r.connectEnd - r.connectStart; // includes ssl negotiation
waiting = r.responseStart - r.requestStart;
content = r.responseEnd - r.responseStart;
networkDuration = dns + tcp + waiting + content;
Developers working with same-origin resources can do the same calculation shown above to derive “networkDuration”. However, providing the result as a new attribute simplifies the process. It also avoids possible errors: companies and teams are likely to compare these values, so it’s important to ensure an apples-to-apples comparison. But the primary need for “networkDuration” is cross-origin resources, where “duration” is currently the only metric available. I’ve found several teams that were tracking “duration” assuming it meant download time. They were surprised when I explained that it also includes blocking time, and agreed it was not the metric they wanted; instead they wanted the equivalent of “networkDuration”.
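Until something like “networkDuration” is standardized, a measurement script has to compute this itself. Here’s a minimal sketch (the helper name is mine): it uses the detailed timestamps when they’re exposed, and falls back to “duration” (blocking time included) for cross-origin resources that don’t send Timing-Allow-Origin:

function loadTime(r) {
    // Cross-origin resources without Timing-Allow-Origin report 0 for the
    // detailed timestamps, so fall back to duration (which includes blocking).
    if (r.requestStart === 0) {
        return r.duration;
    }
    var dns = r.domainLookupEnd - r.domainLookupStart;
    var tcp = r.connectEnd - r.connectStart; // includes ssl negotiation
    var waiting = r.responseStart - r.requestStart;
    var content = r.responseEnd - r.responseStart;
    return dns + tcp + waiting + content; // the proposed "networkDuration"
}

var times = performance.getEntriesByType("resource").map(loadTime);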
I mentioned previously that the detailed time values (domainLookupStart, connectStart, etc.) are restricted to same-origin resources for privacy reasons. The proposal to add “networkDuration” is likely to raise privacy concerns; specifically that by removing blocking time, “networkDuration” would enable malicious third party JavaScript to determine whether a resource was read from cache. However, it’s possible to remove blocking time today using “duration” by loading a resource when there’s no blocking contention (e.g., after window.onload). Even when blocking time is removed, it’s ambiguous whether a resource was read from cache or loaded over the network. Even a cache read will have non-zero load times.
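For example, a script can already do something like the following today – load a resource after onload, when there’s no contention, and read its “duration” (the URL below is hypothetical):

window.addEventListener("load", function() {
    var img = new Image();
    img.onload = function() {
        var entries = performance.getEntriesByName(img.src);
        if (entries.length) {
            // with no blocking contention, duration approximates load time
            console.log("duration: " + entries[0].duration + "ms");
        }
    };
    img.src = "http://thirdparty.example.com/probe.gif"; // hypothetical URL
});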
The problem that “networkDuration” solves is finding the load time for more typical resources that are loaded during page creation and might therefore incur blocking time.
Takeaway
It’s not possible today to use Resource Timing to measure the actual load time of cross-origin resources. Companies that want to measure the combination of load time and blocking time can use “duration”, but all the companies I’ve spoken with want to measure the actual load time (without blocking time). To provide better performance metrics, I encourage the addition of “networkDuration” to the Resource Timing specification. If you agree, please voice your support in a reply to my “networkDuration” proposal on the W3C Web Performance mailing list.
Reloading post-onload resources
Two performance best practices are to add a far future expiration date and to delay loading resources (esp. scripts) until after the onload event. But it turns out that the combination of these best practices leads to a situation where it’s hard for users to refresh resources. More specifically, hitting Reload (or even shift+Reload) doesn’t refresh these cacheable, lazy-loaded resources in Firefox, Chrome, Safari, Android, and iPhone.
What we expect from Reload
The browser has a cache (or 10) where it saves copies of responses. If the user feels those cached responses are stale, she can hit the Reload button to ignore the cache and refetch everything, thus ensuring she’s seeing the latest copy of the website’s content. I couldn’t find anything in the HTTP Spec dictating the behavior of the Reload button, but all browsers have this behavior AFAIK:
- If you click Reload (or control+R or command+R) then all the resources are refetched using a Conditional GET request (with the If-Modified-Since and If-None-Match validators). If the server’s version of the response has not changed, it returns a short “304 Not Modified” status with no response body. If the response has changed then “200 OK” and the entire response body is returned.
- If you click shift+Reload (or control+Reload or control+shift+R or command+shift+R) then all the resources are refetched withOUT the validation headers. This is less efficient since every response body is returned, but guarantees that any cached responses that are stale are overwritten.
Bottom line: regardless of expiration dates, we expect that hitting Reload gets the latest version of the website’s resources, and that shift+Reload does so even more aggressively.
Welcome to Reload 2.0
In the days of Web 1.0, resources were requested using HTML markup – IMG, SCRIPT, LINK, etc. With Web 2.0 resources are often requested dynamically. Two common examples are loading scripts asynchronously (e.g., Google Analytics) and dynamically fetching images (e.g., for photo carousels or images below-the-fold). Sometimes these resources are requested after window onload so that the main page can render quickly for a better user experience, better metrics, etc. If these resources have a far future expiration date, the browser needs extra intelligence to do the right thing.
- If the user navigates to the page normally (clicking on a link, typing a URL, using a bookmark, etc.) and the dynamic resource is in the cache, the browser should use the cached copy (assuming the expiration date is still in the future).
- If the user reloads the page, the browser should refetch all the resources including resources loaded dynamically in the page.
- If the user reloads the page, I would think resources loaded in the onload handler should also be refetched. These are likely part of the basic construction of the page and they should be refetched if the user wants to refresh the page’s contents.
- But what should the browser do if the user reloads the page and there are resources loaded after the onload event? Some web apps are long lived with sessions that last hours or even days. If the user does a reload, should every dynamically-loaded resource for the life of the web app be refetched ignoring the cache?
An Example
Let’s look at an example: Postonload Reload.
This page loads an image and a script using five different techniques:
- markup – The basic HTML approach: an <img src=...> and a <script src=...> in the page’s HTML.
- dynamic in body – In the body of the page is a script block that creates an image and a script element dynamically and sets the SRC, causing the resource to be fetched. This code executes before onload. (A sketch of the dynamic techniques appears after this list.)
- onload – An image and a script are dynamically created in the onload handler.
- 1 ms post-onload – An image and a script are dynamically created via a 1 millisecond setTimeout callback in the onload handler.
- 5 second post-onload – An image and a script are dynamically created via a 5 second setTimeout callback in the onload handler.
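Here’s roughly what the dynamic techniques (3-5) look like in code – a sketch with placeholder URLs, not the exact test page source:

function loadDynamic(imgUrl, scriptUrl) {
    var img = new Image();
    img.src = imgUrl;
    var script = document.createElement("script");
    script.src = scriptUrl;
    document.getElementsByTagName("head")[0].appendChild(script);
}

window.onload = function() {
    loadDynamic("image3.gif", "script3.js"); // technique 3: onload
    setTimeout(function() {
        loadDynamic("image4.gif", "script4.js"); // technique 4: 1 ms post-onload
    }, 1);
    setTimeout(function() {
        loadDynamic("image5.gif", "script5.js"); // technique 5: 5 seconds post-onload
    }, 5000);
};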
All of the images and scripts have an expiration date one month in the future. If the user hits Reload, which of the techniques should result in a refetch? Certainly we’d expect techniques 1 & 2 to cause a refetch. I would hope 3 would be refetched. I think 4 should be refetched but doubt many browsers do that, and 5 probably shouldn’t be refetched. Settle on your expected results and then take a look at the table below.
The Results
Before jumping into the Reload results, let’s first look at what happens if the user just navigates to the page. This is achieved by clicking on the “try again” link in the example. In this case none of the resources are refetched. All of the resources have been saved to the cache with an expiration date one month in the future, so every browser I tested just reads them from cache. This is good and what we would expect.
But the behavior diverges when we look at the Reload results captured in the following table.
Table 1. Resources that are refetched on Reload

technique | resource | Chrome 25 | Safari 6 | Android Safari/534 | iPhone Safari/7534 | Firefox 19 | IE 8,10 | Opera 12
---|---|---|---|---|---|---|---|---
markup | image 1 | Y | Y | Y | Y | Y | Y | Y
markup | script 1 | Y | Y | Y | Y | Y | Y | Y
dynamic | image 2 | Y | Y | Y | Y | Y | Y | Y
dynamic | script 2 | Y | Y | Y | Y | Y | Y | Y
onload | image 3 | – | – | – | – | Y | Y | Y
onload | script 3 | – | – | – | – | – | Y | Y
1ms postonload | image 4 | – | – | – | – | – | – | Y
1ms postonload | script 4 | – | – | – | – | – | – | Y
5sec postonload | image 5 | – | – | – | – | – | – | –
5sec postonload | script 5 | – | – | – | – | – | – | –
The results for Chrome, Safari, Android mobile Safari, and iPhone mobile Safari are the same. When you click Reload in these browsers the resources in the page get refetched (resources 1&2), but not so for the resources loaded in the onload handler and later (resources 3-5).
Firefox is interesting. It loads the four resources in the page plus the onload handler’s image (image 3), but not the onload handler’s script (script 3). Curious.
IE 8 and 10 are the same: they load the four resources in the page as well as the image & script from the onload handler (resources 1-3). I didn’t test IE 9 but I assume it’s the same.
Opera has the best results in my opinion. It refetches all of the resources in the main page, the onload handler, and 1 millisecond after onload (resources 1-4), but it does not refetch the resources 5 seconds after onload (image 5 & script 5). I poked at this a bit. If I raise the delay from 1 millisecond to 50 milliseconds, then image 4 & script 4 are not refetched. I think this is a race condition where if Opera is still downloading resources from the onload handler when these first delayed resources are created, then they are also refetched. To further verify this I raised the delay to 500 milliseconds and confirmed the resources were not refetched, but then increased the response time of all the resources to 1 second (instead of instantaneous) and this caused image 4 & script 4 to be refetched, even though the delay was 500 milliseconds after onload.
Note that pressing shift+Reload (and other combinations) didn’t alter the results.
Takeaways
A bit esoteric? Perhaps. This is a deep dive on a niche issue, I’ll grant you that. But I have a few “buts”:
If you’re a web developer using far future expiration dates and lazy loading, you might get unexpected results when you change a resource and hit Reload, and even shift+Reload. If you’re not getting the latest version of your dev resources you might have to clear your cache.
This isn’t just an issue for web devs; it affects users as well. Numerous sites lazy-load resources with far future expiration dates, including top-10 sites such as Google, YouTube, Yahoo, Microsoft Live, Tencent QQ, Amazon, and Twitter. If you Reload any of these sites with a packet sniffer open in the first four browsers listed, you’ll see a curious pattern: cacheable resources loaded before onload return a 304 status, while those loaded after onload are read from cache and don’t get refetched. The only way to ensure you get a fresh version is to clear your cache, defeating the expected benefit of the Reload button.
Here’s a waterfall showing the requests when Amazon is reloaded in Chrome. The red vertical line marks the onload event. Notice how the resources before onload have 304 status codes. Right after the onload are some image beacons that aren’t cacheable, so they get refetched and return 200 status codes. The cacheable images loaded after onload are all read from cache, so any updates to those resources are missed.
Finally, whenever behavior varies across browsers it’s usually worthwhile to investigate why. Often one behavior is preferred over another, and we should get the specs and vendors aligned in that direction. In this case, we should make Reload more consistent and have it refetch resources, even those loaded dynamically in the onload handler.
Preferred Caching
In Clearing Browser Data I mentioned Internet Explorer’s “Preserve Favorites website data” option. It first appeared in Internet Explorer 8’s Delete Browsing History dialog:
This is a great idea. I advocated this to all browser makers at Velocity 2009 under the name “preferred caching”. My anecdotal experience is that even though I visit a website regularly, its static resources aren’t always in my cache. This could be explained by the resources not having the proper caching headers, but I’ve seen this happen for websites that I know use far future expiration dates. The browser disk cache is shared across all websites, therefore one website with lots of big resources can push another website’s resources from cache. Even though I have preferred websites, there’s no way for me to guarantee their resources are given a higher priority when it comes to caching. And yet:
A key factor for getting web apps to perform like desktop and native apps is ensuring that their static content is immediately available on the client.
A web app (especially a mobile web app) can’t startup quickly if it has to download its images, scripts, and stylesheets over the Internet. It can’t even afford to do lightweight Conditional GET requests to verify resource freshness. A fast web app has to have all its static content stored locally, ready to be served as soon as the user needs it.
Application cache and localStorage are ways to solve this problem, but users only see the benefits if/when their preferred websites adopt these new(ish) technologies. And there’s no guarantee a favorite website will ever make the necessary changes. Application cache is hard to work with and has lots of gotchas. Some browser makers warn that localStorage is bad for performance.
The beauty of “Preserve Favorites website data” is that the user can opt in without having to wait for the website owner to get on board. This doesn’t solve all the problems: if a website starts using new resources then the user will have to undergo the delay of waiting for them to download. But at least the user knows that the unchanged resources for their preferred websites will always be loaded without having to incur network delays.
Let’s play with it
I found very little documentation on “Preserve Favorites website data”. Perhaps it’s self-explanatory, but I find that nothing works as expected when it comes to the Web. So I did a little testing. I used the Clear Browser test to explore what actually was and wasn’t cleared. The first step was to save https://stevesouders.com/ as a Favorite. I then ran the test withOUT selecting “Preserve Favorites website data” – nearly everything was deleted as expected:
- IE 8 & 9: cookies & disk cache ARE cleared (appcache isn’t supported; localStorage isn’t cleared likely due to a UI bug)
- IE 10: cookies, disk cache, & appcache ARE cleared (localStorage isn’t cleared likely due to a UI bug)
Then I ran the test again, but this time I DID select “Preserve Favorites website data” (which tells IE to NOT clear anything for stevesouders.com) and saw just one surprise for IE 10:
- IE 8 & 9: cookies & disk cache ARE NOT cleared (correct)
- IE 10: cookies & disk cache ARE NOT cleared (correct), but appcache IS cleared (unexpected)
Why would appcache get cleared for a Favorite site when “Preserve Favorites website data” is selected? That seems like a bug. Regardless, for the more common kinds of persistent data (cookies & disk cache) this setting preserves them, allowing for a faster startup the next time the user visits their preferred sites.
It’s likely to be a long time before we see other browsers adopt this feature. To figure out whether they should, it’d be helpful to get your answers to these two questions:
- How many of you using Internet Explorer select “Preserve Favorites website data” when you clear your cache?
- For non-IE users, would you use this feature if it was available in your browser?
Clearing Browser Data
In Keys to a Fast Web App I listed caching as a big performance win. I’m going to focus on caching for the next few months. The first step is a study I launched a few days ago called the Clear Browser Experiment. Before trying to measure the frequency and benefits of caching, I wanted to start by gauging what happens when users clear their cache. In addition to the browser disk cache, I opened this up to include other kinds of persistent data: cookies, localStorage, and application cache. (I didn’t include indexedDB because it’s less prevalent.)
Experiment Setup
The test starts by going to the Save Stuff page. That page does four things:
- Sets a persistent cookie called “cachetest”. The cookie is created using JavaScript.
- Tests window.localStorage to see if localStorage is supported. If so, saves a value to localStorage with the key “cachetest”.
- Tests window.applicationCache to see if appcache is supported. If so, loads an iframe that uses appcache. The iframe’s manifest file has a fallback section containing “iframe404.js iframe.js”. Since iframe404.js doesn’t exist, appcache should instead load iframe.js, which defines the global variable iframejsNow.
- Loads an image that takes 5 seconds to return. The image is cacheable for 30 days. (A sketch of this save logic appears after this list.)
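For reference, a stripped-down sketch of that save logic might look like this (the cookie and localStorage key match the description above; the URLs are hypothetical):

// set a persistent cookie named "cachetest" that expires in 30 days
var expires = new Date(new Date().getTime() + 30 * 24 * 60 * 60 * 1000).toUTCString();
document.cookie = "cachetest=1; expires=" + expires + "; path=/";

// save a value to localStorage if it's supported
if (window.localStorage) {
    localStorage.setItem("cachetest", "1");
}

// load an iframe that uses appcache if it's supported
if (window.applicationCache) {
    var iframe = document.createElement("iframe");
    iframe.src = "iframe.php"; // hypothetical URL of the appcache iframe
    document.body.appendChild(iframe);
}

// load the image that takes 5 seconds to return (cacheable for 30 days)
var img = new Image();
img.src = "slow-image.php"; // hypothetical URL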
After this data is saved to the browser, the user is prompted to clear their browser and proceed to the Check Cache page. This page checks to see if the previous items still exist:
- Looks for the cookie called “cachetest”.
- Checks localStorage for the key “cachetest”.
- Loads the iframe again. The iframe’s onload handler checks if iframejsNow is defined, in which case appcache was not cleared.
- Loads the same 5-second image again. The image’s onload handler checks if it takes more than 5 seconds to return, in which case disk cache was cleared. (A sketch of these checks appears after this list.)
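Here’s a sketch of the corresponding checks (again simplified; the image URL is the same hypothetical one used on the save page):

// check the cookie and localStorage
var cookieCleared = document.cookie.indexOf("cachetest=") === -1;
var storageCleared = !window.localStorage || localStorage.getItem("cachetest") === null;

// re-request the 5-second image and time it; if it takes ~5 seconds
// it came from the server, meaning the disk cache was cleared
var start = new Date().getTime();
var img = new Image();
img.onload = function() {
    var elapsed = new Date().getTime() - start;
    var diskCacheCleared = elapsed > 5000;
    console.log("cookie cleared: " + cookieCleared +
        ", localStorage cleared: " + storageCleared +
        ", disk cache cleared: " + diskCacheCleared);
};
img.src = "slow-image.php"; // hypothetical URL from the save page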
My Results
I created a Browserscope user test to store the results. (If you haven’t used this feature you should definitely check it out. Jason Grigsby is glad he did.) This test is different from my other tests because it requires the user to do an action. Because of this I ran the test on various browsers to create a “ground truth” table of results. Green and “1” indicate the data was cleared successfully while red and “0” indicate it wasn’t. Blank means the feature wasn’t supported.
The results vary but are actually more consistent than I expected. Some observations:
- Chrome 21 doesn’t clear localStorage. This is likely an artifact of the test’s structure: Chrome 21 does clear localStorage, but it does not clear it from memory in the current tab, so the value only disappears after you switch tabs or restart Chrome. Nevertheless, it would be better to clear it from memory as well. The Chrome team has already fixed this bug, as evidenced by the crowdsourced results for Chrome 23.0.1259 and later.
- Firefox 3.6 doesn’t clear disk cache. The disk cache issue is similar to Chrome 21’s story: the image is cleared from disk cache, but not from memory cache. Ideally both would be cleared and the Firefox team fixed this bug back in 2010.
- IE 6-7 don’t support appcache nor localStorage.
- IE 8-9 don’t support appcache.
- Firefox 3.6, IE 8-9, and Safari 5.0.5 don’t clear localStorage. My hypothesis for this result is that there is no UI attached to localStorage. See the following section on Browser UIs for screenshots from these browsers.
Browser UIs
Before looking at the crowdsourced results, it’s important to see what kind of challenges the browser UI presents to clearing data. This is also a handy, although lengthy, guide to clearing data for various browsers. (These screenshots and menu selections were gathered from both Mac and Windows so you might see something different.)
Chrome
Clicking on the wrench icon -> History -> Clear all browsing data… in Chrome 21 displays this dialog. Checking “Empty the cache” clears the disk cache, and “Delete cookies and other site and plug-in data” clears cookies, localStorage, and appcache.
Firefox
To clear Firefox 3.6 click on Tools -> Clear Recent History… and check Cookies and Cache. To be extra aggressive I also checked Site Preferences, but none of these checkboxes cause localStorage to be cleared.
Firefox 12 fixed this issue by adding a checkbox for Offline Website Data. Firefox 15 has the same choices. As a result localStorage is cleared successfully.
Internet Explorer
It’s a bit more work to clear IE 6. Clicking on Tools -> Internet Options… -> General presents two buttons: Delete Cookies… and Delete Files… Both need to be selected and confirmed to clear the browser. There’s an option to “Delete all offline content” but since appcache and localStorage aren’t supported this isn’t relevant to this experiment.
Clearing IE 7 is done by going to Tools -> Delete Browsing History… There are still separate buttons for deleting files and deleting cookies, but there’s also a “Delete all…” button to accomplish everything in one action.
The clearing UI for IE 8 is reached via Tools -> Internet Options -> General -> Delete… It has one additional checkbox for clearing data compared to IE 7: “InPrivate Filtering data”. (It also has “Preserve Favorites website data” which I’ll discuss in a future post.) LocalStorage is supported in IE 8, but according to the results it’s not cleared and now we see a likely explanation: there’s no checkbox for explicitly clearing offline web data (as there is in Firefox 12+).
IE 9’s clearing UI changes once again. “Download History” is added, and “InPrivate Filtering data” is replaced with “ActiveX Filtering and Tracking Protection data”. Similar to IE 8, localStorage is supported in IE 9, but according to the results it’s not cleared and the likely explanation is that there is not a checkbox for explicitly clearing offline web data (as there is in Firefox 12+).
iPhone
The iPhone (not surprisingly) has the simplest UI for clearing browser data. Going through Settings -> Safari we find a single button: “Clear Cookies and Data”. Test results show that this clears cookies, localStorage, appcache, and disk cache. It’s hard to run my test on the iPhone because you have to leave the browser to get to Settings, so when you return to the browser the test page has been cleared. I solve this by typing in the URL for the next page in the test: https://stevesouders.com/tests/clearbrowser/check.php.
Opera
Opera has the most granularity in what to delete. In Opera 12 going to Tools -> Delete Private Data… displays a dialog box. The key checkboxes for this test are “Delete all cookies”, “Delete entire cache”, and “Delete persistent storage”. There’s a lot to choose from but it’s all in one dialog.
Safari
In Safari 5.0.5 going to gear icon -> Reset Safari… (on Windows) displays a dialog box with many choices. None of them address “offline data” explicitly which is likely why the results show that localStorage is not cleared. (“Clear history” has no effect – I just tried it to see if it would clear localStorage.)
Safari 5.1.7 was confusing for me. At first I chose Safari -> Empty cache… (on Mac) but realized this only affected disk cache. I also saw Safari -> Reset Safari… but this only had “Remove all website data” which seemed too vague and broad. I went searching for more clearing options and found Safari -> Preferences… -> Privacy with a button to “Remove All Website Data…” but this also had no granularity of what to clear. This one button did successfully clear cookies, localStorage, appcache, and disk cache.
Crowdsourced Results
The beauty of Browserscope user tests is that you can crowdsource results. This allows you to gather results for browsers and devices that you don’t have access to. The crowdsourced results for this experiment include ~100 different browsers including webOS, Blackberry, and RockMelt. I extracted results for the same major browsers shown in the previous table.
In writing up the experiment I intentionally used the generic wording “clear browser” in an attempt to see what users would select without prompting. The crowdsourced results match my personal results fairly closely. Since this test required users to take an action without a way to confirm that they actually did try to clear their cache, I extended the table to show the percentage of tests that reflect the majority result.
One difference between my results and the crowdsourced results is iPhone – I think this is due to the fact that clearing the cache requires leaving the browser thus disrupting the experiment. The other main difference is Safari 5. The sample size is small so we shouldn’t draw strong conclusions. It’s possible that having multiple paths for clearing data caused confusion.
The complexity and lack of consistency in UIs for clearing data could be a cause for these less-than-unanimous results. Chrome 21 and Firefox 15 both have a fair number of tests (154 and 46), and yet some of the winning results are only seen 50% or 68% of the time. Firefox might be a special case because it prompts before saving information to localStorage. And this test might be affected by private browsing such as Chrome’s incognito mode. Finally, it’s possible test takers didn’t want to clear their cache but still saved their results to Browserscope.
There are many possible explanations for the varied crowdsourced results, but reviewing the Browser UIs section shows that knowing how to clear browser data varies significantly across browsers, and often changes from one version of a browser to the next. There’s a need for more consistent UI in this regard. What would a consistent UI look like? The simple iPhone UI (a single button) makes it easier to clear everything, but is that what users want and need? I often want to clear my disk cache, but less frequently choose to clear all my cookies. At a minimum, users need a way to clear their browser data in all browsers, which currently isn’t possible.
Cache compressed? or uncompressed?
My previous blog post, Cache them if you can, suggests that current cache sizes are too small – especially on mobile.
Given this concern about cache size a relevant question is:
If a response is compressed, does the browser save it compressed or uncompressed?
Compression typically reduces responses by 70%. This means that a browser can cache 3x as many compressed responses if they’re saved in their compressed format.
Note that not all responses are compressed. Images make up the largest number of resources but shouldn’t be compressed because their formats are already compressed. On the other hand, HTML documents, scripts, and stylesheets should be compressed and account for 30% of all requests. Being able to save 3x as many of these responses to cache could have a significant impact on cache hit rates.
It’s difficult and time-consuming to determine whether compressed responses are saved in compressed format. I created this Caching Gzip Test page to help determine browser behavior. It has two 200 KB scripts – one is compressed down to ~148 KB and the other is uncompressed. (Note that these files are random strings, so the compression savings is only ~25% compared to the typical 70%.) After clearing the cache and loading the test page, if the total cache disk size increases ~348 KB it means the browser saves compressed responses in compressed format. If the total cache disk size increases ~400 KB it means compressed responses are saved uncompressed.
The challenging part of this experiment is finding where the cache is stored and measuring the response sizes. Firefox, Chrome, and Opera save responses as files and were easy to measure. For IE on Windows I wasn’t able to access the individual cache files (admin permissions?) but was able to measure the sizes based on the properties of the Temporary Internet Files folder. Safari saves all responses in Cache.db. I was able to see the incremental increase by modifying the experiment to be two pages: the compressed response and the uncompressed response. You can see the cache file locations and full details in the Caching Gzip Test Results page.
Here are the results for top desktop browsers:
Browser | Compressed responses cached compressed? | Max cache size
---|---|---
Chrome 17 | yes | 320 MB*
Firefox 11 | yes | 850 MB*
IE 8 | no | 50 MB
IE 9 | no | 250 MB
Safari 5.1.2 | no | unknown
Opera 11 | yes | 20 MB
* Chrome and Firefox cache size is a percentage of available disk space. Chrome is capped at 320 MB. I don’t know what Firefox’s cap is; on my laptop with 50 GB free the cache size is 830 MB.
We see that Chrome 17, Firefox 11, and Opera 11 store compressed responses in compressed format, while IE 8&9 and Safari 5 save them uncompressed. IE 8&9 have smaller cache sizes, so the fact that they uncompress responses before caching further reduces the number of responses that can be cached.
What’s the best choice? It’s possible that reading cached responses is faster if they’re already uncompressed. That would be a good next step to explore. I wouldn’t prejudge IE’s choice when it comes to performance on Windows. But it’s clear that saving compressed responses in compressed format increases the number of responses that can be cached, and this increases cache hit rates. What’s even clearer is that browsers don’t agree on the best answer. Should they?
Cache them if you can
“The fastest HTTP request is the one not made.”
I always smile when I hear a web performance speaker say this. I forget who said it first, but I’ve heard it numerous times at conferences and meetups over the past few years. It’s true! Caching is critical for making web pages faster. I’ve written extensively about caching:
- Call to improve browser caching
- (lack of) Caching for iPhone Home Screen Apps
- Redirect caching deep dive
- Mobile cache file sizes
- Improving app cache
- Storager case study: Bing, Google
- App cache & localStorage survey
- HTTP Archive: max-age
Things are getting better – but not quickly enough. The chart below from the HTTP Archive shows that the percentage of resources that are cacheable has increased 10% during the past year (from 42% to 46%). Over that same time the number of requests per page has increased 12% and total transfer size has increased 24% (chart).
Perhaps it’s hard to make progress on caching because the problem doesn’t belong to a single group – responsibility spans website owners, third party content providers, and browser developers. One thing is certain – we have to do a better job when it comes to caching.
I’ve gathered some compelling statistics over the past few weeks that illuminate problems with caching and point to some next steps. Here are the highlights:
- 55% of resources don’t specify a max-age value
- 46% of the resources without any max-age remained unchanged over a 2 week period
- some of the most popular resources on the Web are only cacheable for an hour or two
- 40-60% of daily users to your site don’t have your resources in their cache
- 30% of users have a full cache
- for users with a full cache, the median time to fill their cache is 4 hours of active browsing
Read on to understand the full story.
My kingdom for a max-age header
Many of the caching articles I’ve written address issues such as size & space limitations, bugs with less common HTTP headers, and outdated purging logic. These are critical areas to focus on. But the basic function of caching hinges on websites specifying caching headers for their resources. This is typically done using max-age in the Cache-Control response header. This example specifies that a response can be read from cache for 1 year:
Cache-Control: max-age=31536000
Since you’re reading this blog post you probably already use max-age, but the following chart from the HTTP Archive shows that 55% of resources don’t specify a max-age value. This translates to 45 of the average website’s 81 resources needing an HTTP request even for repeat visits.
Missing max-age != dynamic
Why do 55% of resources have no caching information? Having looked at caching headers across thousands of websites my first guess is lack of awareness – many website owners simply don’t know about the benefits of caching. An alternative explanation might be that many resources are dynamic (JSON, ads, beacons, etc.) and shouldn’t be cached. Which is the bigger cause – lack of awareness or dynamic resources? Luckily we can quantify the dynamicness of these uncacheable resources using data from the HTTP Archive.
The HTTP Archive analyzes the world’s top ~50K web pages on the 1st and 15th of the month and records the HTTP headers for every resource. Using this history it’s possible to go back in time and quantify how many of today’s resources without any max-age value were identical in previous crawls. The data for the chart above (showing 55% of resources with no max-age) was gathered on Feb 15 2012. The chart below shows the percentage of those uncacheable resources that were identical in the previous crawl on Feb 1 2012. We can go back even further and see how many were identical in both the Feb 1 2012 and the Jan 15 2012 crawls. (The HTTP Archive doesn’t save response bodies so the determination of “identical” is based on the resource having the exact same URL, Last-Modified, ETag, and Content-Length.)
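In code terms, the “identical” check is a simple comparison of response metadata across crawls – something like this sketch (the field names are illustrative, not the HTTP Archive’s actual schema):

// two crawl records are "identical" if the resource has the exact same
// URL, Last-Modified, ETag, and Content-Length
function isIdentical(a, b) {
    return a.url === b.url &&
           a.lastModified === b.lastModified &&
           a.etag === b.etag &&
           a.contentLength === b.contentLength;
}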
46% of the resources without any max-age remained unchanged over a 2 week period. This works out to 21 resources per page that could have been read from cache without any HTTP request but weren’t. Over a 1 month period 38% are unchanged – 17 resources per page.
This is a significant missed opportunity. Here are some popular websites and the number of resources that were unchanged for 1 month but did not specify max-age:
- http://www.toyota.jp/ – 172 resources without max-age & unchanged for 1 month
- http://www.sfgate.com/ – 133
- http://www.hasbro.com/ – 122
- http://www.rakuten.co.jp/ – 113
- http://www.ieee.org/ – 97
- http://www.elmundo.es/ – 80
- http://www.nih.gov/ – 76
- http://www.frys.com/ – 68
- http://www.foodnetwork.com/ – 66
- http://www.irs.gov/ – 58
- http://www.ca.gov/ – 53
- http://www.oracle.com/ – 52
- http://www.blackberry.com/ – 50
Recalling that “the fastest HTTP request is the one not made”, this is a lot of unnecessary HTTP traffic. I can’t prove it, but I strongly believe this is not intentional – it’s just a lack of awareness. The chart below reinforces this belief – it shows the percentage of resources (both cacheable and uncacheable) that remain unchanged starting from Feb 15 2012 and going back for one year.
The percentage of resources that are unchanged is nearly the same when looking at all resources as it is for only uncacheable resources: 44% vs. 46% going back 2 weeks and 35% vs. 38% going back 1 month. Given this similarity in “dynamicness” it’s likely that the absence of max-age has nothing to do with the resources themselves and is instead caused by website owners overlooking this best practice.
3rd party content
If a website owner doesn’t make their resources cacheable, they’re just hurting themselves (and their users). But if a 3rd party content provider doesn’t have good caching behavior, it impacts all the websites that embed that content. This is both bad and good. It’s bad in that one uncacheable 3rd party resource can impact multiple sites. The good part is that shifting 3rd party content to adopt good caching practices also has a magnified effect.
So how are we doing when it comes to caching 3rd party content? Below is a list of the top 30 most-used resources according to the HTTP Archive. These are the resources that were used the most across the world’s top 50K web pages. The max-age value (in hours) is also shown.
- http://www.google-analytics.com/ga.js (2 hours)
- http://ssl.gstatic.com/s2/oz/images/stars/po/Publisher/sprite2.png (8760 hours)
- http://pagead2.googlesyndication.com/pagead/js/r20120208/r20110914/show_ads_impl.js (336 hours)
- http://pagead2.googlesyndication.com/pagead/render_ads.js (336 hours)
- http://pagead2.googlesyndication.com/pagead/show_ads.js (1 hour)
- https://apis.google.com/_/apps-static/_/js/gapi/gcm_ppb,googleapis_client,plusone/[…] (720 hours)
- http://pagead2.googlesyndication.com/pagead/osd.js (24 hours)
- http://pagead2.googlesyndication.com/pagead/expansion_embed.js (24 hours)
- https://apis.google.com/js/plusone.js (1 hour)
- http://googleads.g.doubleclick.net/pagead/drt/s?safe=on (1 hour)
- http://static.ak.fbcdn.net/rsrc.php/v1/y7/r/ql9vukDCc4R.png (3825 hours)
- http://connect.facebook.net/rsrc.php/v1/yQ/r/f3KaqM7xIBg.swf (164 hours)
- https://ssl.gstatic.com/s2/oz/images/stars/po/Publisher/sprite2.png (8760 hours)
- https://apis.google.com/_/apps-static/_/js/gapi/googleapis_client,iframes_styles[…] (720 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yv/r/ZSM9MGjuEiO.js (8742 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yx/r/qP7Pvs6bhpP.js (8699 hours)
- https://plusone.google.com/_/apps-static/_/ss/plusone/[…] (720 hours)
- http://b.scorecardresearch.com/beacon.js (336 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yx/r/lP_Rtwh3P-S.css (8710 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yA/r/TSn6F7aukNQ.js (8760 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yk/r/Wm4bpxemaRU.js (8702 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yZ/r/TtnIy6IhDUq.js (8699 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yy/r/0wf7ewMoKC2.css (8699 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yO/r/H0ip1JFN_jB.js (8760 hours)
- http://platform.twitter.com/widgets/hub.1329256447.html (87659 hours)
- http://static.ak.fbcdn.net/rsrc.php/v1/yv/r/T9SYP2crSuG.png (8699 hours)
- http://platform.twitter.com/widgets.js (1 hour)
- https://plusone.google.com/_/apps-static/_/js/plusone/[…] (720 hours)
- http://pagead2.googlesyndication.com/pagead/js/graphics.js (24 hours)
- http://s0.2mdn.net/879366/flashwrite_1_2.js (720 hours)
There are some interesting patterns.
- simple URLs have short cache times – Some resources have very short cache times, e.g., ga.js (1), show_ads.js (5), and twitter.com/widgets.js (27). Most of the URLs for these resources are very simple (no querystring or URL “fingerprints”) because these resource URLs are part of the snippet that website owners paste into their page. These “bootstrap” resources are given short cache times because there’s no way for the resource URL to be changed if there’s an emergency fix – instead the cached resource has to expire in order for the emergency update to be retrieved.
- long URLs have long cache times – Many 3rd party “bootstrap” scripts dynamically load other resources. These code-generated URLs are typically long and complicated because they contain some unique fingerprinting, e.g., http://pagead2.googlesyndication.com/pagead/js/r20120208/r20110914/show_ads_impl.js (3) and http://platform.twitter.com/widgets/hub.1329256447.html (25). If there’s an emergency change to one of these resources, the fingerprint in the bootstrap script can be modified so that a new URL is requested. Therefore, these fingerprinted resources can have long cache times because there’s no need to rev them in the case of an emergency fix.
- where’s Facebook’s like button? – Facebook’s like.php and likebox.php are also hugely popular but aren’t in this list because the URL contains a querystring that differs across every website. Those resources have an even more aggressive expiration policy compared to other bootstrap resources – they use “no-cache, no-store, must-revalidate”. Once the like[box] bootstrap resource is loaded, it loads the other required resources: lP_Rtwh3P-S.css (19), TSn6F7aukNQ.js (20), etc. Those resources have long URLs and long cache times because they’re generated by code, as explained in the previous bullet.
- short caching resources are often async – The fact that bootstrap scripts have short cache times is good for getting emergency updates, but is bad for performance because they generate many Conditional GET requests on subsequent requests. We all know that scripts block pages from loading, so these Conditional GET requests can have a significant impact on the user experience. Luckily, some 3rd party content providers are aware of this and offer async snippets for loading these bootstrap scripts, mitigating the impact of their short cache times. This is true for ga.js (1), plusone.js (9), twitter.com/widgets.js (27), and Facebook’s like[box].php.
These extremely popular 3rd party snippets are in pretty good shape, but as we get out of the top widgets we quickly find that these good caching patterns degrade. In addition, more 3rd party providers need to support async snippets.
Cache sizes are too small
In January 2007 Tenni Theurer and I ran an experiment at Yahoo! to estimate how many users had a primed cache. The methodology was to embed a transparent 1×1 image in the page with an expiration date in the past. If users had the expired image in their cache the browser would issue a Conditional GET request and receive a 304 response (primed cache). Otherwise they’d get a 200 response (empty cache). I was surprised to see that 40-60% of daily users to the site didn’t have the site’s resources in their cache and 20% of page views were done without the site’s resources in the cache.
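The server-side logic for that kind of cache-detection beacon is simple. Here’s a minimal sketch as a Node.js server – my illustration, not the code Yahoo! actually used:

var http = require("http");

// 1x1 transparent GIF (standard base64 encoding)
var GIF = Buffer.from(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7", "base64");

http.createServer(function(req, res) {
    if (req.headers["if-modified-since"] || req.headers["if-none-match"]) {
        // conditional GET: the expired image was in the user's cache (primed cache)
        res.writeHead(304);
        res.end();
    } else {
        // unconditional GET: the image was not in the cache (empty cache)
        res.writeHead(200, {
            "Content-Type": "image/gif",
            "Last-Modified": "Mon, 01 Jan 2007 00:00:00 GMT",
            "Expires": "Mon, 01 Jan 2007 00:00:00 GMT" // already in the past
        });
        res.end(GIF);
    }
}).listen(8080);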
Numerous factors contribute to this high rate of unique users missing the site’s resources in their cache, but I believe the primary reason is small cache sizes. Browsers have increased the size of their caches since this experiment was run, but not enough. It’s hard to test browser cache size. Blaze.io’s article Understanding Mobile Cache Sizes shows results from their testing. Here are the max cache sizes I found for browsers on my MacBook Air. (Some browsers set the cache size based on available disk space, so let me mention that my drive is 250 GB and has 54 GB available.) I did some testing and searching to find max cache sizes for my mobile devices and IE.
- Chrome: 320 MB
- Internet Explorer 9: 250 MB
- Firefox 11: 830 MB (shown in about:cache)
- Opera 11: 20 MB (shown in Preferences | Advanced | History)
- iPhone 4, iOS 5.1: 30-35 MB (based on testing)
- Galaxy Nexus: 18 MB (based on testing)
I’m surprised that Firefox 11 has such a large cache size – that’s close to what I want. All the others are (way) too small. 18-35 MB on my mobile devices?! I have seven movies on my iPhone – I’d gladly trade Iron Man 2 (1.82 GB) for more cache space.
Caching in the real world
In order to justify increasing browser cache sizes we need some statistics on how many real users overflow their cache. This topic came up at last month’s Velocity Summit where we had representatives from Chrome, Internet Explorer, Firefox, Opera, and Silk. (Safari was invited but didn’t show up.) Will Chan from the Chrome team (working on SPDY) followed-up with this post on Chromium cache metrics from Windows Chrome. These are the most informative real user cache statistics I’ve ever seen. I strongly encourage you to read his article.
Some of the takeaways include:
- ~30% of users have a full cache (capped at 320 MB)
- for users with a full cache, the median time to fill their cache is 4 hours of active browsing (20 hours of clock time)
- 7% of users clear their cache at least once per week
- 19% of users experience “fatal cache corruption” at least once per week thus clearing their cache
The last stat about cache corruption is interesting – I appreciate the honesty. The IE 9 team experienced something similar. In IE 7&8 the cache was capped at 50 MB based on tests showing that increasing the cache size didn’t improve the cache hit rate. They revisited this surprising result in IE 9 and found that larger cache sizes actually did improve the cache hit rate:
In IE9, we took a much closer look at our cache behaviors to better understand our surprising finding that larger caches were rarely improving our hit rate. We found a number of functional problems related to what IE treats as cacheable and how the cache cleanup algorithm works. After fixing these issues, we found larger cache sizes were again resulting in better hit rates, and as a result, we’ve changed our default cache size algorithm to provide a larger default cache.
Will mentions that Chrome’s 320 MB cap should be revisited. 30% seems like a low percentage for full caches, but it could be accounted for by users that aren’t very active and by active users that only visit a small number of websites (for example, just Gmail and Facebook). If possible I’d like to see these full cache statistics correlated with activity. It’s likely that the users who account for the biggest percentage of web visits are more likely to have a full cache, and thus experience slower page load times.
Next steps
First, much of the data for this post came from the HTTP Archive, so I’d like to thank our sponsors: Google, Mozilla, New Relic, O’Reilly Media, Etsy, Strangeloop, dynaTrace Software, and Torbit.
The data presented here suggest a few areas to focus on:
Website owners need to increase their use of Cache-Control max-age, and the max-age times need to be longer. 38% of resources were unchanged over a 1 month period, and yet only 11% of resources have a max-age value that high. Most resources, even if they change, can be refreshed by including a fingerprint in the URL specified in the HTML document. Only bootstrap scripts from 3rd parties should have short cache times (hours). Truly dynamic responses (JSON, etc.) should specify must-revalidate. A year from now, rather than seeing 55% of resources without any max-age value, we should see 55% cacheable for a month or more.
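As an illustration of the fingerprinting approach – a sketch with a hypothetical file name, not a prescription for any particular build system:

var crypto = require("crypto");
var fs = require("fs");

// Derive a fingerprint from the file contents. When main.js changes, the URL
// changes, so the resource can be served with Cache-Control: max-age=31536000
// and never needs an emergency expiration.
var contents = fs.readFileSync("main.js"); // hypothetical file
var hash = crypto.createHash("md5").update(contents).digest("hex");
var url = "/js/main." + hash.slice(0, 8) + ".js"; // referenced from the HTML document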
3rd party content providers need wider adoption of the caching and async behavior shown by the top Google, Twitter, and Facebook snippets.
Browser developers stand to bring the biggest improvements to caching. Increasing cache sizes is a likely win, especially for mobile devices. Data correlating cache sizes and user activity is needed. More intelligence around purging algorithms, such as IE 9’s prioritization based on mime type, will help when the cache fills up. More focus on personalization (what are the sites I visit most often?) would also create a faster user experience when users go to their favorite websites.
It’s great that the number of resources with caching headers grew 10% over the last year, but that just isn’t enough progress. We should really expect to double the number of resources that can be read from cache over the coming year. Just think about all those HTTP requests that can be avoided!
Frontend SPOF
My evangelism of high performance web sites started off in the context of quality code and development best practices. It’s easy for a style of coding to permeate throughout a company. Developers switch teams. Code is copied and pasted (especially in the world of web development). If everyone is developing in a high performance way, that’s the style that will characterize how the company codes.
This argument of promoting development best practices gained traction in the engineering quarters of the companies I talked to, but performance improvements continued to get backburnered in favor of new features and content that appealed to the business side of the organization. Improving performance wasn’t considered as important as other changes. Everyone assumed users wanted new features and that’s what got the most attention.
It became clear to me that we needed to show a business case for web performance. That’s why the theme for Velocity 2009 was “the impact of performance on the bottom line”. Since then there have been numerous studies released that have shown that improving performance does improve the bottom line. As a result, I’m seeing the business side of many web companies becoming strong advocates for Web Performance Optimization.
But there are still occasions when I have a hard time convincing a team that focusing on web performance, specifically frontend performance, is important. Shaving off hundreds (or even thousands) of milliseconds just doesn’t seem worthwhile to them. That’s when I pull out the big guns and explain that loading scripts and stylesheets in the typical way creates a frontend single point of failure that can bring down the entire site.
Examples of Frontend SPOF
The thought that simply adding a script or stylesheet to your web page could make the entire site unavailable surprises many people. Rather than focusing on CSS mistakes and JavaScript errors, the key is to think about what happens when a resource request times out. With this clue, it’s easy to create a test case:
<html>
<head>
<script src="http://www.snippet.com/main.js" type="text/javascript">
</script>
</head>
<body>
Here's my page!
</body>
</html>
This HTML page looks pretty normal, but if snippet.com is overloaded the entire page is blank waiting for main.js to return. This is true in all browsers.
Here are some examples of frontend single points of failure and the browsers they impact. You can click on the Frontend SPOF test links to see the actual test page.
Frontend SPOF test | Chrome | Firefox | IE | Opera | Safari |
---|---|---|---|---|---|
External Script | blank below | blank below | blank below | blank below | blank below |
Stylesheet | flash | flash | blank below | flash | blank below |
inlined @font-face | delayed | flash | flash | flash | delayed |
Stylesheet with @font-face | delayed | flash | totally blank* | flash | delayed |
Script then @font-face | delayed | flash | totally blank* | flash | delayed |
* Internet Explorer 9 does not display a totally blank page, but it does “flash” the affected elements.
The failure cases are highlighted in red. Here are the four possible outcomes sorted from worst to best:
- totally blank – Nothing in the page is rendered – the entire page is blank.
- blank below – All the DOM elements below the resource in question are not rendered.
- delayed – Text that uses the @font-face style is invisible until the font file arrives.
- flash – DOM elements are rendered immediately, and then redrawn if necessary after the stylesheet or font has finished downloading.
Web Performance avoids SPOF
It turns out that there are web performance best practices that, in addition to making your pages faster, also avoid most of these frontend single points of failure. Let’s look at the tests one by one.
- External ScriptÂ
- All browsers block rendering of elements below an external script until the script arrives and is parsed and executed. Since many sites put scripts in the HEAD, this means the entire page is typically blank. That’s why I believe the most important web performance coding pattern for today’s web sites is to load JavaScript asynchronously. Not only does this improve performance, but it avoids making external scripts a possible SPOF.Â
- StylesheetÂ
- Browsers are split on how they handle stylesheets. Firefox and Opera charge ahead and render the page, and then flash the user if elements have to be redrawn because their styling changed. Chrome, Internet Explorer, and Safari delay rendering the page until the stylesheets have arrived. (Generally they only delay rendering elements below the stylesheet, but in some cases IE will delay rendering everything in the page.) If rendering is blocked and the stylesheet takes a long time to download, or times out, the user is left staring at a blank page. There’s not a lot of advice on loading stylesheets without blocking page rendering, primarily because it would introduce the flash of unstyled content.
- inlined @font-faceÂ
- I’ve blogged before about the performance implications of using @font-face. When the @font-face style is declared in a STYLE block in the HTML document, the SPOF issues are dramatically reduced. Firefox, Internet Explorer, and Opera avoid making these custom font files a SPOF by rendering the affected text and then redrawing it after the font file arrives. Chrome and Safari don’t render the customized text at all until the font file arrives. I’ve drawn these cells in yellow since it could cause the page to be unusable for users using these browsers, but most sites only use custom fonts on a subset of the page.
- Stylesheet with @font-face
- Inlining your @font-face style is the key to avoiding having font files be a single point of failure. If you inline your @font-face styles and the font file takes forever to return or times out, the worst case is the affected text is invisible in Chrome and Safari. But at least the rest of the page is visible, and everything is visible in Firefox, IE, and Opera. Moving the @font-face style to a stylesheet not only slows down your site (by requiring two sequential downloads to render text), but it also creates a special case in Internet Explorer 7 & 8 where the entire page is blocked from rendering. IE 6 is only slightly better – the elements below the stylesheet are blocked from rendering (but if your stylesheet is in the HEAD this is the same outcome).
- Script then @font-face
- Inlining your @font-face style isn’t enough to avoid the entire page SPOF that occurs in IE. You also have to make sure the inline STYLE block isn’t preceded by a SCRIPT tag. Otherwise, your entire page is blank in IE waiting for the font file to arrive. If that file is slow to return, your users are left staring at a blank page.
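Here’s a minimal sketch of the asynchronous loading pattern referenced above. It reuses the main.js URL from the earlier test page; where you place the snippet in your page is up to you. Because the script element is created from JavaScript, a slow or hung response from snippet.com can no longer blank the page:

```html
<script>
// Minimal async loading sketch (reuses the snippet.com URL from the example above).
// The dynamically inserted script downloads in the background instead of
// blocking the parser, so a hung response can't block rendering.
(function() {
  var s = document.createElement("script");
  s.src = "http://www.snippet.com/main.js";
  s.async = true; // explicit, though dynamically inserted scripts are already async
  var first = document.getElementsByTagName("script")[0];
  first.parentNode.insertBefore(s, first);
}());
</script>
```

The same worst-case thinking applies to fonts: declaring @font-face in an inline STYLE block that isn’t preceded by a SCRIPT tag keeps font files from becoming a SPOF in IE, as described above.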
SPOF is bad
Five years ago most of the attention on web performance was focused on the backend. Since then we’ve learned that 80% of the time users wait for a web page to load is the responsibility of the frontend. I feel this same bias when it comes to identifying and guarding against single points of failure that can bring down a web site – the focus is on the backend and there’s not enough focus on the frontend. For larger web sites, the days of a single server, single router, single data center, and other backend SPOFs are way behind us. And yet, most major web sites include scripts and stylesheets in the typical way that creates a frontend SPOF. Even more worrisome – many of these scripts are from third parties for social widgets, web analytics, and ads.
Look at the scripts, stylesheets, and font files in your web page from a worst case scenario perspective. Ask yourself:
- Is your web site’s availability dependent on these resources?
- Is it possible that if one of these resources timed out, users would be blocked from seeing your site?
- Are any of these single point of failure resources from a third party?
- Would you rather embed resources in a way that avoids making them a frontend SPOF?
Make sure you’re aware of your frontend SPOFs, track their availability and latency closely, and embed them in your page in a non-blocking way whenever possible.
Call to improve browser caching
Over Christmas break I wrote Santa my browser wishlist. There was one item I neglected to ask for: improvements to the browser disk cache.
In 2007 Tenni Theurer and I ran an experiment to measure browser cache stats from the server side. Tenni’s write up, Browser Cache Usage – Exposed, is the stuff of legend. There she reveals that while 80% of page views were done with a primed cache, 40-60% of unique users hit the site with an empty cache at least once per day. 40-60% seems high, but I’ve heard similar numbers from respected web devs at other major sites.
Why do so many users have an empty cache at least once per day?
I’ve been racking my brain for years trying to answer this question. Here are some answers I’ve come up with:
- first time users – Yea, but not 40-60%.
- cleared cache – It’s true: more and more people are likely using anti-virus software that clears the cache between browser sessions. And since we ran that experiment back in 2007 many browsers have added options for clearing the cache frequently (for example, Firefox’s privacy.clearOnShutdown.cache option). But again, this doesn’t account for the 40-60% number.
- flawed experiment – It turns out there was a flaw in the experiment (browsers ignore caching headers when an image is in memory), but this would only affect the 80% number, not the 40-60% number. And I expect the impact on the 80% number is small, given the fact that other folks have gotten similar numbers. (In a future blog post I’ll share a new experiment design I’ve been working on.)
- resources got evicted – hmmmmm
OK, let’s talk about eviction for a minute. The two biggest influencers for a resource getting evicted are the size of the cache and the eviction algorithm. It turns out, the amount of disk space used for caching hasn’t kept pace with the size of people’s drives and their use of the Web. Here are the default disk cache sizes for the major browsers:
- Internet Explorer: 8-50 MB
- Firefox: 50 MB
- Safari: everything I found said there isn’t a max size setting (???)
- Chrome: < 80 MB (varies depending on available disk space)
- Opera: 20 MB
Those defaults are too small. My disk drive is 150 GB of which 120 GB is free. I’d gladly give up 5 GB or more to raise the odds of web pages loading faster.
Even with more disk space, the cache is eventually going to fill up. When that happens, cached resources need to be evicted to make room for the new ones. Here’s where eviction algorithms come into play. Most eviction algorithms are LRU-based – the resource that was least recently used is evicted. However, our knowledge of performance pain points has grown dramatically in the last few years. Translating this knowledge into eviction algorithm improvements makes sense. For example, we’re all aware how much costlier it is to download a script than an image. (Scripts block other downloads and rendering.) Scripts, therefore, should be given a higher priority when it comes to caching.
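To make the idea concrete, here’s a purely illustrative sketch – not how any shipping browser cache actually works – of an LRU score weighted by resource type, so that scripts survive eviction longer than images:

```javascript
// Illustrative only: a weighted LRU eviction score. When the cache is full,
// the entry with the lowest score is evicted first. Scripts get a higher
// weight because a cache miss on a script blocks downloads and rendering.
function evictionScore(entry, now) {
  var typeWeight = { script: 3, stylesheet: 2, image: 1 }[entry.type] || 1;
  var ageMs = now - entry.lastUsedTime;   // the plain LRU ingredient
  return typeWeight / (1 + ageMs);        // lower score = evicted sooner
}
```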
It’s hard to get access to gather browser disk cache stats, so I’m asking people to discover their own settings and share them via the Browser Disk Cache Survey form. I included this in my talks at JSConf and jQueryConf. ~150 folks at those conferences filled out the form. The data shows that 55% of people surveyed have a cache that’s over 90% full. (Caveats: this is a small sample size and the data is self-reported.) It would be great if you would take time to fill out the form. I’ve also started writing instructions for finding your cache settings.
I’m optimistic about the potential speedup that could result from improving browser caching, and fortunately browser vendors seem receptive (for example, the recent Mozilla Caching Summit). I expect we’ll see better default cache sizes and eviction logic in the next major release of each browser. Until then, jack up your defaults as described in the instructions. And please add comments for any browsers I left out or got wrong. Thanks.
Browser Performance Wishlist
What are the most important changes browsers could make to improve performance?
This document is my answer to that question. This is mainly for browser developers, although web developers will want to track the adoption of these improvements.
- download scripts without blocking
- SCRIPT attributes
- resource packages
- border-radius
- cache redirects
- link prefetch
- Web Timing spec
- remote JS debugging
- Web Sockets
- History
- anchor ping
- progressive XHR
- stylesheet & inline JS
- SCRIPT DEFER for inline scripts
- @import improvements
- @font-face improvements
- stylesheets & iframes
- paint events
- missing schema, double downloads
Before digging into the list I wanted to mention two items that would actually be at the top if it weren’t for how new they are: SPDY and the FRAG tag. Both require industry adoption and possibly changes to specifications, so it’s too soon to put them on an implementation wishlist. I hope these ideas gain consensus soon; to help that along, I describe them here.
- SPDY
- SPDY is a proposal from Google for making three major improvements to HTTP: compressed headers, multiplexed requests, and prioritized responses. Initial studies showed 25 top sites loading 55% faster. Server and client implementations are available, and other organizations and individuals have completed their own. The protocol draft has been published for review.
- FRAG tag
- The idea behind this “document fragment” tag is that it would be used to wrap 3rd party content – ads, widgets, and analytics. 3rd party content can have a severe impact on the containing page’s performance due to additional HTTP requests, scripts that block rendering and downloads, and added DOM nodes. Many of these factors can be mitigated by putting the 3rd party content inside an iframe embedded in the top level HTML document. But iframes have constraints and drawbacks – they typically introduce another HTTP request for the iframe’s HTML document, not all 3rd party code snippets will work inside an iframe without changes (e.g., references to “document” in JavaScript might need to reference the parent document), and some snippets (expando ads, suggest) can’t float over the main page’s elements. Another path to mitigate these issues is to load the JavaScript asynchronously, but many of these widgets use document.write and so must be evaluated synchronously. A compromise is to place 3rd party content in the top level HTML document wrapped in a FRAG block. This approach degrades nicely – older browsers would ignore the FRAG tag and handle these snippets the same way they do today. Newer browsers would parse the HTML in a separate document fragment. The FRAG content would not block the rendering of the top level document. Snippets containing document.write would work without blocking the top level document. This idea just started getting discussed in January 2010. Much more use case analysis and discussion is needed, culminating in a proposed specification. (Credit to Alex Russell for the idea and name.)
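Since no FRAG specification exists yet, the markup below is hypothetical – the tag name, semantics, and the ad server URL are all placeholders – but it illustrates how a snippet might be wrapped:

```html
<!-- Hypothetical syntax: nothing is specified yet, so treat this as a sketch. -->
<frag>
  <!-- The 3rd party snippet (document.write and all) would be parsed into a
       separate document fragment and would not block the main page. -->
  <script src="http://ads.example.com/widget.js"></script>
</frag>
```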
The List
The performance wishlist items are sorted highest priority first. The browser icons indicate which browsers need to implement that particular improvement.
- download scripts without blocking
- In older browsers, once a script started downloading, all subsequent downloads were blocked until the script returned. It’s critical that scripts be evaluated in the order specified, but they can be downloaded in parallel. This significantly improves page load times, especially for pages with multiple scripts. Newer browsers (IE8, Firefox 3.5+, Safari 4, Chrome 2+) incorporated this parallel script loading feature, but it doesn’t work as proactively as it could. Specifically:
- IE8 – downloading scripts blocks image and iframe downloads
- Firefox 3.6 – downloading scripts blocks iframe downloads
- Safari 4 – downloading scripts blocks iframe downloads
- Chrome 4 – downloading scripts blocks iframe downloads
- Opera 10.10 – downloading scripts blocks all downloads
(test case, see the four “|| Script [Script|Stylesheet|Image|Iframe]” tests)
- SCRIPT attributes
- The HTML5 specification describes the ASYNC and DEFER attributes for the SCRIPT tag, but the implementation behavior is not specified. Here’s how the SCRIPT attributes should work (a markup sketch follows the list below).
- DEFER – The HTTP request for a SCRIPT with the DEFER attribute is not made until all other resources in the page on the same domain have already been sent. This is so that it doesn’t occupy one of the limited number of connections that are opened for a single server. Deferred scripts are downloaded in parallel, but are executed in the order they occur in the HTML document, regardless of what order the responses arrive in. The window’s onload event fires after all deferred scripts are downloaded and executed.
- ASYNC – The HTTP request for a SCRIPT with the ASYNC attribute is made immediately. Async scripts are executed as soon as the response is received, regardless of the order they occur in the HTML document. The window’s onload event fires after all async scripts are downloaded and executed.
- POSTONLOAD – This is a new attribute I’m proposing. Postonload scripts don’t start downloading until after the window’s onload event has fired. By default, postonload scripts are evaluated in the order they occur in the HTML document. POSTONLOAD and ASYNC can be used in combination to cause postonload scripts to be evaluated as soon as the response is received, regardless of the order they occur in the HTML document.
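For illustration, here’s how the three attributes would look in markup. DEFER and ASYNC are in HTML5; POSTONLOAD is only my proposal, so the last snippet shows the script-based workaround needed today (file names are placeholders):

```html
<!-- DEFER: downloaded in parallel, executed in document order -->
<script defer src="deferred.js"></script>

<!-- ASYNC: downloaded immediately, executed as soon as it arrives -->
<script async src="async.js"></script>

<!-- POSTONLOAD doesn't exist yet; today you'd wait for onload in script
     (this sketch overwrites window.onload, so adapt it to your page) -->
<script>
window.onload = function() {
  var s = document.createElement("script");
  s.src = "postonload.js";
  document.getElementsByTagName("head")[0].appendChild(s);
};
</script>
```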
- resource packages
- Each HTTP request has some overhead cost. Workarounds include concatenating scripts, concatenating stylesheets, and creating image sprites. But this still results in multiple HTTP requests. And sprites are especially difficult to create and maintain. Alexander Limi (Mozilla) has proposed using zip files to create resource packages. It’s a good idea because of its simplicity and graceful degradation.
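As I recall the proposal, the markup looks roughly like the following – treat the exact attribute values as an assumption on my part. Browsers that don’t recognize the LINK simply ignore it and fetch the files individually, which is the graceful degradation:

```html
<!-- Roughly the proposed syntax (an assumption on my part); the zip contains
     the scripts, stylesheets, and images the page would otherwise request
     one at a time. -->
<link rel="resource-package" type="application/zip" href="site-resources.zip" />
```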
- border-radius
- Creating rounded corners leads to code bloat and excessive HTTP requests. Border-radius reduces this to a simple CSS style. The only major browser that doesn’t support border-radius is IE. It has already been announced that IE9 will support border-radius, but I wanted to include it nevertheless.
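For comparison, the CSS is a single declaration (the class name is made up):

```css
/* One declaration replaces the corner images, the wrapper markup, and the
   extra HTTP requests they require. */
.panel {
  border-radius: 8px;
}
```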
- cache redirects
- Redirects are costly from a performance perspective, especially for users with high latency. Although the HTTP spec says 301 and 302 responses (with the proper HTTP headers) are cacheable, most browsers don’t support this (example response headers follow the list below).
- IE8 – doesn’t cache redirects for the main page and for resources
- Safari 4 – doesn’t cache redirects for the main page
- Opera 10.10 – doesn’t cache redirects for the main page
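For reference, a redirect that the spec allows to be cached might carry headers like these (the URL is a placeholder):

```
HTTP/1.1 301 Moved Permanently
Location: http://www.example.com/new-home
Cache-Control: public, max-age=86400
```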
- link prefetch
- To improve page load times, developers prefetch resources that are likely or certain to be used later in the user’s session. This typically involves writing JavaScript code that executes after the onload event. When prefetching scripts and stylesheets, an iframe must be used to avoid conflict with the JavaScript and CSS in the main page. Using an iframe makes this prefetching code more complex. A final burden is the processing required to parse prefetched scripts and stylesheets. The browser UI can freeze while prefetched scripts and stylesheets are parsed, even though this is unnecessary as they’re not going to be used in the current page. A simple alternative solution is to use LINK PREFETCH. Firefox is the only major browser that supports this feature (since 1.0). Wider support of LINK PREFETCH would give developers an easy way to accelerate their web pages. (test case)
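The markup is a one-liner per resource (URLs are placeholders):

```html
<!-- A hint to fetch likely-next resources during idle time; today only
     Firefox acts on it. -->
<link rel="prefetch" href="http://www.example.com/next-page.js">
<link rel="prefetch" href="http://www.example.com/results-sprite.png">
```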
- Web Timing spec
- In order for web developers to improve the performance of their web sites, they need to be able to measure their performance – specifically their page load times. There’s debate on the endpoint for measuring page load times (window onload event, first paint event, onDomReady), but most people agree that the starting point is when the web page is requested by the user. And yet, there is no reliable way for the owner of the web page to measure from this starting point. Google has submitted the Web Timing proposal draft for browser builtin support for measuring page load times to address these issues.
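Here’s a sketch of what page owners could do with built-in timing support. The property names follow the draft window.performance.timing interface, so treat them as an assumption:

```javascript
// Assumes the draft window.performance.timing interface is available.
window.addEventListener("load", function() {
  var t = window.performance && window.performance.timing;
  if (!t) return;                                   // no built-in timing support
  var pageLoad = t.loadEventStart - t.navigationStart;
  console.log("page load time: " + pageLoad + " ms");
}, false);
```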
- remote JS debugging
- Developers strive to make their web apps fast across all major browsers, but this requires installing and learning a different toolset for each browser. In order to get cross-browser web development tools, browsers need to support remote JavaScript debugging. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. Agreement on the preferred protocol and support in the major browsers would go a long way to getting faster web apps for all users, and reducing the work for developers to maintain cross-browser web app performance.
- Web Sockets
- HTML5 Web Sockets provide built-in support for two-way communications between the client and server. The communication channel is accessible via JavaScript. Web Sockets are superior to comet and Ajax, especially in their compatibility with proxies and firewalls, and provide a path for building web apps with a high degree of communication between the browser and server.
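A minimal sketch of the API (the endpoint and messages are placeholders):

```javascript
// Two-way channel: the client can push to the server and the server can
// push to the client over the same connection.
var ws = new WebSocket("ws://www.example.com/updates");
ws.onopen = function() {
  ws.send("subscribe:prices");
};
ws.onmessage = function(event) {
  console.log("server pushed: " + event.data);
};
```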
- History
- HTML5 specifies implementation for History.pushState and History.replaceState. With these, web developers can dynamically change the URL to reflect the web application state without having to perform a page transition. This is important for Web 2.0 applications that modify the state of the web page using Ajax. Being able to avoid fetching a new HTML document to reflect these application changes results in a faster user experience.
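A sketch of the pattern (the paths, state object, and renderView function are placeholders):

```javascript
// After an Ajax update, record the new application state in the URL
// without a page transition.
history.pushState({ view: "inbox" }, "", "/mail/inbox");

// Back/Forward restores the state instead of refetching an HTML document.
window.onpopstate = function(event) {
  if (event.state) {
    renderView(event.state.view);   // hypothetical application function
  }
};
```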
- anchor ping
- The ping attribute for anchors provides a more performant way to track links. This is a controversial feature because of the association with “tracking” users. However, links are tracked today, it’s just done in a way that hurts the user experience. For example, redirects, synchronous XHR, and tight loops in unload handlers are some of the techniques used to ensure clicks are properly recorded. All of these create a slower user experience.
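The markup is simple (URLs are placeholders):

```html
<!-- The click is reported asynchronously to the ping URL, so no redirect,
     synchronous XHR, or unload-handler trick is needed. -->
<a href="http://www.example.com/article" ping="http://www.example.com/track?link=article">
  Read the article
</a>
```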
- progressive XHR
- The draft spec for XMLHttpRequest details how XHRs are to support progressive response handling. This is important for web apps that use data with varied response times as well as comet-style applications. (more information)
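Here’s a sketch of progressive handling with the current API (the URL and handleChunk are placeholders); the idea is to process each chunk at readyState 3 rather than waiting for the complete response:

```javascript
var xhr = new XMLHttpRequest();
var seen = 0;                                   // how much responseText we've processed
xhr.open("GET", "/stream", true);
xhr.onreadystatechange = function() {
  if (xhr.readyState >= 3 && xhr.responseText) {
    var chunk = xhr.responseText.substring(seen);
    seen = xhr.responseText.length;
    if (chunk) handleChunk(chunk);              // hypothetical handler
  }
};
xhr.send(null);
```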
- stylesheet & inline JS
- When a stylesheet is followed by an inline script, resources that follow are blocked until the stylesheet is downloaded and the inline script is evaluated. Browsers should instead look ahead in their parsing and start downloading subsequent resources in parallel with the stylesheet. These resources of course would not be rendered, parsed, or evaluated until after the stylesheet was parsed and the inline script was evaluated. (test case, see “|| CSS + Inline Script”; looks like this just landed in Firefox 3.6!)
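For reference, this is the pattern that triggers the blocking (file names are placeholders):

```html
<link rel="stylesheet" href="styles.css">
<script>var initialized = true;</script>  <!-- inline script after a stylesheet -->
<img src="photo.jpg">                     <!-- today: blocked until styles.css arrives
                                               and the inline script runs -->
<script src="widget.js"></script>
```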
- SCRIPT DEFER for inline scripts
- The benefit of the SCRIPT DEFER attribute for external scripts is discussed above. But DEFER is also useful for inline scripts that can be executed after the page has been parsed. Currently, IE8 supports this behavior. (test case)
- @import improvements
- @import is a popular alternative to the LINK tag for loading stylesheets, but it has several performance problems in IE:
- LINK @import – If the first stylesheet is loaded using LINK and the second one uses @import, they are loaded sequentially instead of in parallel. (test case)
- LINK blocks @import – If the first stylesheet is loaded using LINK, and the second stylesheet is loaded using LINK that contains @import, that @import stylesheet is blocked from downloading until the first stylesheet response is received. It would be better to start downloading the @import stylesheet immediately. (test case)
- many @imports – Using @import can change the download sequence of resources. In this test case, multiple stylesheets loaded with @import are followed by a script. Even though the script is listed last in the HTML document, it gets downloaded first. If the script takes a long time to download, it can cause the stylesheet downloads to be delayed, which can cause rendering to be delayed. It would be better to follow the order specified in the HTML document. (test case)
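The first case above, for example, looks like this (file names are placeholders):

```html
<!-- In IE these two stylesheets load sequentially instead of in parallel. -->
<link rel="stylesheet" href="first.css">
<style>
  @import url("second.css");
</style>
```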
- @font-face improvements
- In IE8, if a script occurs before a style that uses @font-face, the page is blocked from rendering until the font file is done downloading. It would be better to render the rest of the page without waiting for the font file. (test case, blog post)
- stylesheets & iframes
- When an iframe is preceded by an external stylesheet, it blocks iframe downloads. In IE, the iframe is blocked from downloading until the stylesheet response is received. In Firefox, the iframe’s resources are blocked from downloading until the stylesheet response is received. There’s no dependency between the parent’s stylesheet and the iframe’s HTML document, so this blocking behavior should be removed. (test case)
- paint events
- As the number of DOM elements and the amount of CSS grow, it’s becoming more important to be able to measure the performance of painting the page. Firefox 3.5 added the MozAfterPaint event, which opened the door for add-ons like Firebug Paint Events (although early Firefox documentation noted that the “event might fire before the actual repainting happens”). Support for accurate paint events will allow developers to capture these metrics.
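A rough sketch of listening for the Firefox-only event; the rect properties on the event object are an assumption on my part, so the code guards for them:

```javascript
// Firefox 3.5+ only. Logs each repaint; useful for correlating paint
// activity with DOM and CSS changes.
window.addEventListener("MozAfterPaint", function(event) {
  var rects = event.clientRects ? event.clientRects.length : "n/a";
  console.log("repaint, rects: " + rects);
}, false);
```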
- missing schema, double downloads
- In IE7&8, if the “http:” schema is missing from a stylesheet’s URL, the stylesheet is downloaded twice. This makes the page render more slowly. Not including “http://” in URLs is not pervasive, but it’s getting more widely adopted because it reduces download size and resolves to “http://” or “https://” as appropriate. (test case)
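The pattern in question is a schema-less (“protocol-relative”) URL (the host is a placeholder):

```html
<!-- Resolves to http:// or https:// to match the page, but IE7 & 8
     download the stylesheet twice. -->
<link rel="stylesheet" href="//www.example.com/styles.css">
```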