Prebrowsing

November 7, 2013 2:41 pm | 20 Comments

A favorite character from the MASH TV series is Corporal Walter Eugene O’Reilly, fondly referred to as “Radar” for his knack of anticipating events before they happen. Radar was a rare example of efficiency because he was able to carry out Lt. Col. Blake’s wishes before Blake had even issued the orders.

What if the browser could do the same thing? What if it anticipated the requests the user was going to need, and could complete those requests ahead of time? If this was possible, the performance impact would be significant. Even if just the few critical resources needed were already downloaded, pages would render much faster.

Browser cache isn’t enough

You might ask, “isn’t this what the cache is for?” Yes! In many cases when you visit a website the browser avoids making costly HTTP requests and just reads the necessary resources from disk cache. But there are many situations when the cache offers no help:

  • first visit – The cache only comes into play on subsequent visits to a site. The first time you visit a site it hasn’t had time to cache any resources.
  • cleared – The cache gets cleared more than you think. In addition to occasional clearing by the user, the cache can also be cleared by anti-virus software and browser bugs. (19% of Chrome users have their cache cleared at least once a week due to a bug.)
  • purged – Since the cache is shared by every website the user visits, it’s possible for one website’s resources to get purged from the cache to make room for another’s.
  • expired – 69% of resources don’t have any caching headers or are cacheable for less than one day. If the user revisits these pages and the browser determines the resource is expired, an HTTP request is needed to check for updates. Even if the response indicates the cached resource is still valid, these network delays still make pages load more slowly, especially on mobile.
  • revved – Even if the website’s resources are in the cache from a previous visit, the website might have changed and uses different resources.

Something more is needed.

Prebrowsing techniques

In their quest to make websites faster, today’s browsers offer a number of features for doing work ahead of time. These “prebrowsing” (short for “predictive browsing” – a word I made up and a domain I own) techniques include:

  • <link rel="dns-prefetch" ...>
  • <link rel="prefetch" ...>
  • <link rel="prerender" ...>
  • DNS pre-resolution
  • TCP pre-connect
  • prefreshing
  • the preloader

These features come into play at different times while navigating web pages. I break them into these three phases:

  1. previous page – If a web developer has high confidence about which page you’ll go to next, they can use LINK REL dns-prefetch, prefetch or prerender on the previous page to finish some work needed for the next page.
  2. transition – Once you navigate away from the previous page there’s a transition period after the previous page is unloaded but before the first byte of the next page arrives. During this time the web developer doesn’t have any control, but the browser can work in anticipation of the next page by doing DNS pre-resolution and TCP pre-connects, and perhaps even prefreshing resources.
  3. current page – As the current page is loading, browsers have a preloader that scans the HTML for downloads that can be started before they’re needed.

Let’s look at each of the prebrowsing techniques in the context of each phase.

Phase 1 – Previous page

As with any of this anticipatory work, there’s a risk that the prediction is wrong. If the anticipatory work is expensive (e.g., steals CPU from other processes, consumes battery, or wastes bandwidth) then caution is warranted. It would seem difficult to anticipate which page users will go to next, but high confidence scenarios do exist:

  • If the user has done a search with an obvious result, that result page is likely to be loaded next.
  • If the user navigated to a login page, the logged-in page is probably coming next.
  • If the user is reading a multi-page article or paginated set of results, the page after the current page is likely to be next.

Let’s take the example of searching for Adventure Time to illustrate how different prebrowsing techniques can be used.

DNS-PREFETCH

If the user searched for Adventure Time then it’s likely the user will click on the result for Cartoon Network, in which case we can prefetch the DNS like this:

<link rel="dns-prefetch" href="//cartoonnetwork.com">

DNS lookups are very low cost – they only send a few hundred bytes over the network – so there’s not a lot of risk. But the upside can be significant. This study from 2008 showed a median DNS lookup time of ~87 ms and a 90th percentile of ~539 ms. DNS resolutions might be faster now. You can see your own DNS lookup times by going to chrome://histograms/DNS (in Chrome) and searching for the DNS.PrefetchResolution histogram. Across 1325 samples my median is 50 ms with an average of 236 ms – ouch!

In addition to resolving the DNS lookup, some browsers may go one step further and establish a TCP connection. In summary, using dns-prefetch can save a lot of time, especially for redirects and on mobile.
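For example, on the search results page we could hint at the result’s domain plus the subdomains its resources are served from. The subdomains below are the ones that show up later in this post’s chrome://dns example – treat this as an illustrative sketch, not a recommendation of specific hostnames:

<link rel="dns-prefetch" href="//cartoonnetwork.com">
<link rel="dns-prefetch" href="//ads.cartoonnetwork.com">
<link rel="dns-prefetch" href="//gdyn.cartoonnetwork.com">
<link rel="dns-prefetch" href="//i.cdn.turner.com">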

PREFETCH

If we’re more confident that the user will navigate to the Adventure Time page and we know some of its critical resources, we can download those resources early using prefetch:

<link rel="prefetch" href="http://cartoonnetwork.com/utils.js">

This is great, but the spec is vague, so it’s not surprising that browser implementations behave differently. For example,

  • Firefox downloads just one prefetch item at a time, while Chrome prefetches up to ten resources in parallel.
  • Android browser, Firefox, and Firefox mobile start prefetch requests after window.onload, but Chrome and Opera start them immediately, possibly stealing TCP connections from more important resources needed for the current page.
  • An unexpected behavior is that all the browsers that support prefetch cancel the request when the user transitions to the next page. This is strange because the purpose of prefetch is to get resources for the next page, but there often isn’t enough time to download the entire response before the user navigates away. Canceling the request means the browser has to start over when the user navigates to the expected page. A possible workaround is to add the “Accept-Ranges: bytes” header so that browsers can resume the request from where it left off.

It’s best to prefetch the most important resources in the page: scripts, stylesheets, and fonts. Only prefetch resources that are cacheable – which means that you probably should avoid prefetching HTML responses.
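For instance, if we knew the Adventure Time page’s critical script, stylesheet, and font, we could prefetch all three. Other than utils.js (mentioned above), the URLs here are made up for illustration:

<link rel="prefetch" href="http://cartoonnetwork.com/utils.js">
<link rel="prefetch" href="http://cartoonnetwork.com/main.css">
<link rel="prefetch" href="http://cartoonnetwork.com/fonts/toon.woff">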

PRERENDER

If we’re really confident the user is going to the Adventure Time page next, we can prerender the page like this:

<link rel="prerender" href="http://cartoonnetwork.com/">

This is like opening the URL in a hidden tab – all the resources are downloaded, the DOM is created, the page is laid out, the CSS is applied, the JavaScript is executed, etc. If the user navigates to the specified href, then the hidden page is swapped into view making it appear to load instantly. Google Search has had this feature for years under the name Instant Pages. Microsoft recently announced they’re going to similarly use prerender in Bing on IE11.

Many pages use JavaScript for ads, analytics, and DHTML behavior (start a slideshow, play a video) that don’t make sense when the page is hidden. Website owners can work around this issue by using the Page Visibility API to only execute that JavaScript once the page is visible.
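A minimal sketch of that workaround is shown below, using a hypothetical startSlideshow() function. (In 2013 some browsers still exposed the Page Visibility API with vendor prefixes, e.g., document.webkitHidden; those are omitted here for brevity.)

<script>
function startSlideshow() {
  // start the slideshow, video, ads, analytics, etc.
}

if (document.hidden) {
  // The page is prerendered or otherwise hidden - wait until it's visible.
  document.addEventListener("visibilitychange", function onVisible() {
    if (!document.hidden) {
      document.removeEventListener("visibilitychange", onVisible);
      startSlideshow();
    }
  });
} else {
  startSlideshow();
}
</script>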

Support for dns-prefetch, prefetch, and prerender is currently pretty spotty. The following table shows the results crowdsourced from my prebrowsing tests. You can see the full results here. Just as the IE team announced upcoming support for prerender, I hope other browsers will see the value of these features and add support as well.

                 dns-prefetch   prefetch   prerender
Android          4              4          –
Chrome           22+            31+ [1]    22+
Chrome Mobile    –              –          29+
Firefox          22+ [2]        23+ [2]    –
Firefox Mobile   24+            24+        –
IE               11 [3]         11 [3]     11 [3]
Opera            15+            –          –
  • [1] Need to use the --prerender=enabled commandline option.
  • [2] My friend at Mozilla said these features have been present since version 12.
  • [3] This is based on a Bing blog post. It has not been tested.

Ilya Grigorik‘s High Performance Networking in Google Chrome is a fantastic source of information on these techniques, including many examples of how to see them in action in Chrome.

Phase 2 – Transition

When the user clicks a link the browser requests the next page’s HTML document. At this point the browser has to wait for the first byte to arrive before it can start processing the next page. The time-to-first-byte (TTFB) is fairly long – data from the HTTP Archive in BigQuery indicate a median TTFB of 561 ms and a 90th percentile of 1615 ms.

During this “transition” phase the browser is presumably idle – twiddling its thumbs waiting for the first byte of the next page. But that’s not so! Browser developers realized that this transition time is a HUGE window of opportunity for performance prebrowsing optimizations. Once the browser starts requesting a page, it doesn’t have to wait for that page to arrive to start working. Just like Radar, the browser can anticipate what will need to be done next and can start that work ahead of time.

DNS pre-resolution & TCP pre-connect

The browser doesn’t have a lot of context to go on – all it knows is the URL being requested, but that’s enough to do DNS pre-resolution and TCP pre-connect. Browsers can reference prior browsing history to find clues about the DNS and TCP work that’ll likely be needed. For example, suppose the user is navigating to http://cartoonnetwork.com/. From previous history the browser can remember what other domains were used by resources in that page. You can see this information in Chrome at chrome://dns. My history shows the following domains were seen previously:

  • ads.cartoonnetwork.com
  • gdyn.cartoonnetwork.com
  • i.cdn.turner.com

During this transition (while it’s waiting for the first byte of Cartoon Network’s HTML document to arrive) the browser can resolve these DNS lookups. This is a low cost exercise that has significant payoffs as we saw in the earlier dns-prefetch discussion.

If the confidence is high enough, the browser can go a step further and establish a TCP connection (or two) for each domain. This will save time when the HTML document finally arrives and requires page resources. The Subresource PreConnects column in chrome://dns indicates when this occurs. For more information about DNS pre-resolution and TCP pre-connect see DNS Prefetching.

Prefresh

Similar to the progression from LINK REL dns-prefetch to prefetch, the browser can progress from DNS lookups to actual fetching of resources that are likely to be needed by the page. The determination of which resources to fetch is based on prior browsing history, similar to what is done in DNS pre-resolution. This is implemented as an experimental feature in Chrome called “prefresh” that can be turned on using the --speculative-resource-prefetching="enabled" flag. You can see the resources that are predicted to be needed for a given URL by going to chrome://predictors and clicking on the Resource Prefetch Predictor tab.

The resource history records which resources were downloaded in previous visits to the same URL, how often the resource was hit as well as missed, and a score for the likelihood that the resource will be needed again. Based on these scores the browser can start downloading critical resources while it’s waiting for the first byte of the HTML document to arrive. Prefreshed resources are thus immediately available when the HTML needs them without the delays to fetch, read, and preprocess them. The implementation of prefresh is still evolving and being tested, but it holds potential to be another prebrowsing timesaver that can be utilized during the transition phase.

Phase 3 – Current Page

Once the current page starts loading there’s not much opportunity to do prebrowsing – the user has already arrived at their destination. However, given that the average page takes 6+ seconds to load, there is a benefit in finding all the necessary resources as early as possible and downloading them in a prioritized order. This is the role of the preloader.

Most of today’s browsers utilize a preloader – also called a lookahead parser or speculative parser. The preloader is, in my opinion, the most important browser performance optimization ever made. One study found that the preloader alone improved page load times by ~20%. The invention of preloaders was in response to the old browser behavior where scripts were downloaded one-at-a-time in daisy chain fashion.

Starting with IE 8, parsing the HTML document was modified such that it forked when an external SCRIPT SRC tag was hit: the main parser is blocked waiting for the script to download and execute, but the lookahead parser continues parsing the HTML only looking for tags that might generate HTTP requests (IMG, SCRIPT, LINK, IFRAME, etc.). The lookahead parser queues these requests resulting in a high degree of parallelized downloads. Given that the average web page today has 17 external scripts, you can imagine what page load times would be like if they were downloaded sequentially. Being able to download scripts and other requests in parallel results in much faster pages.

The preloader has changed the logic of how and when resources are requested. These changes can be summarized by the goal of loading critical resources (scripts and stylesheets) early while loading less critical resources (images) later. This simple goal can produce some surprising results that web developers should keep in mind. For example:

  • JS responsive images get queued last – I’ve seen pages that had critical (bigger) images that were loaded using a JavaScript responsive images technique, while less critical (smaller) images were loaded using a normal IMG tag. Most of the time these images are downloaded from the same domain. The preloader looks ahead for IMG tags, sees all the less critical images, and adds those to the download queue for that domain. Later (after DOMContentLoaded) the JavaScript responsive images technique kicks in and adds the more critical images to the download queue – behind the less critical images! This is often neither the expected nor the desired behavior. (A sketch of this pattern follows this list.)
  • scripts “at the bottom” get loaded “at the top” – A rule I promoted starting in 2007 is to move scripts to the bottom of the page. In the days before preloaders this would ensure that all the requests higher in the page, including images, got downloaded first – a good thing when the scripts weren’t needed to render the page. But most preloaders give scripts a higher priority than images. This can result in a script at the bottom stealing a TCP connection from an image higher in the page causing above-the-fold rendering to take longer.
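Here’s a simplified sketch of the responsive images pattern described in the first bullet – the attribute and file names are made up for illustration. The preloader discovers the three thumbnail IMG tags during its scan and queues them early, while the bigger hero image isn’t requested until the script runs after DOMContentLoaded:

<img src="thumb1.jpg">
<img src="thumb2.jpg">
<img src="thumb3.jpg">

<img id="hero" data-src-small="hero-small.jpg" data-src-large="hero-large.jpg">
<script>
document.addEventListener("DOMContentLoaded", function() {
  // Choose the hero image based on viewport width - queued behind the thumbnails.
  var hero = document.getElementById("hero");
  var attr = window.innerWidth > 600 ? "data-src-large" : "data-src-small";
  hero.src = hero.getAttribute(attr);
});
</script>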

The bottom line is that the preloader is a fantastic performance optimization for browsers, but its logic is new and still evolving, so web developers should be aware of how the preloader works and watch their pages for any unexpected download behavior.

As the low hanging fruit of web performance optimization is harvested, we have to look harder to find the next big wins. Prebrowsing is an area that holds a lot of potential to deliver pages instantly. Web developers and browser developers have the tools at their disposal and some are taking advantage of them to create these instant experiences. I hope we’ll see even wider browser support for these prebrowsing features, as well as wider adoption by web developers.

[Here are the slides and video of my Prebrowsing talk from Velocity New York 2013.]

 


Reloading post-onload resources

February 26, 2013 5:35 pm | 16 Comments

Two performance best practices are to add a far future expiration date and to delay loading resources (esp. scripts) until after the onload event. But it turns out that the combination of these best practices leads to a situation where it’s hard for users to refresh resources. More specifically, hitting Reload (or even shift+Reload) doesn’t refresh these cacheable, lazy-loaded resources in Firefox, Chrome, Safari, Android, and iPhone.

What we expect from Reload

The browser has a cache (or 10) where it saves copies of responses. If the user feels those cached responses are stale, she can hit the Reload button to ignore the cache and refetch everything, thus ensuring she’s seeing the latest copy of the website’s content. I couldn’t find anything in the HTTP Spec dictating the behavior of the Reload button, but all browsers have this behavior AFAIK:

  • If you click Reload (or control+R or command+R) then all the resources are refetched using a Conditional GET request (with the If-Modified-Since and If-None-Match validators). If the server’s version of the response has not changed, it returns a short “304 Not Modified” status with no response body. If the response has changed then “200 OK” and the entire response body is returned.
  • If you click shift+Reload (or control+Reload or control+shift+R or command+shift+R) then all the resources are refetched withOUT the validation headers. This is less efficient since every response body is returned, but guarantees that any cached responses that are stale are overwritten.

Bottom line: regardless of expiration dates we expect that hitting Reload gets the latest version of the website’s resources, and shift+Reload will do so even more aggressively.

Welcome to Reload 2.0

In the days of Web 1.0, resources were requested using HTML markup – IMG, SCRIPT, LINK, etc. With Web 2.0 resources are often requested dynamically. Two common examples are loading scripts asynchronously (e.g., Google Analytics) and dynamically fetching images (e.g., for photo carousels or images below-the-fold). Sometimes these resources are requested after window onload so that the main page can render quickly for a better user experience, better metrics, etc. If these resources have a far future expiration date, the browser needs extra intelligence to do the right thing.

  • If the user navigates to the page normally (clicking on a link, typing a URL, using a bookmark, etc.) and the dynamic resource is in the cache, the browser should use the cached copy (assuming the expiration date is still in the future).
  • If the user reloads the page, the browser should refetch all the resources including resources loaded dynamically in the page.
  • If the user reloads the page, I would think resources loaded in the onload handler should also be refetched. These are likely part of the basic construction of the page and they should be refetched if the user wants to refresh the page’s contents.
  • But what should the browser do if the user reloads the page and there are resources loaded after the onload event? Some web apps are long lived with sessions that last hours or even days. If the user does a reload, should every dynamically-loaded resource for the life of the web app be refetched ignoring the cache?

An Example

Let’s look at an example: Postonload Reload.

This page loads an image and a script using five different techniques:

  1. markup – The basic HTML approach: <img src= and <script src=.
  2. dynamic in body – In the body of the page is a script block that creates an image and a script element dynamically and sets the SRC causing the resource to be fetched. This code executes before onload.
  3. onload – An image and a script are dynamically created in the onload handler.
  4. 1 ms post-onload – An image and a script are dynamically created via a 1 millisecond setTimeout callback in the onload handler.
  5. 5 second post-onload – An image and a script are dynamically created via a 5 second setTimeout callback in the onload handler.
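For concreteness, the onload and post-onload techniques (3–5) might be implemented along these lines – a simplified sketch, not the actual test page code:

<script>
function loadResources(n) {
  var img = new Image();
  img.src = "image" + n + ".gif";            // e.g., image3.gif
  var js = document.createElement("script");
  js.src = "script" + n + ".js";             // e.g., script3.js
  document.getElementsByTagName("head")[0].appendChild(js);
}

window.onload = function() {
  loadResources(3);                                    // technique 3: in the onload handler
  setTimeout(function() { loadResources(4); }, 1);     // technique 4: 1 ms post-onload
  setTimeout(function() { loadResources(5); }, 5000);  // technique 5: 5 seconds post-onload
};
</script>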

All of the images and scripts have an expiration date one month in the future. If the user hits Reload, which of the techniques should result in a refetch? Certainly we’d expect techniques 1 & 2 to cause a refetch. I would hope 3 would be refetched. I think 4 should be refetched but doubt many browsers do that, and 5 probably shouldn’t be refetched. Settle on your expected results and then take a look at the table below.

The Results

Before jumping into the Reload results, let’s first look at what happens if the user just navigates to the page. This is achieved by clicking on the “try again” link in the example. In this case none of the resources are refetched. All of the resources have been saved to the cache with an expiration date one month in the future, so every browser I tested just reads them from cache. This is good and what we would expect.

But the behavior diverges when we look at the Reload results captured in the following table.

Table 1. Resources that are refetched on Reload
technique         resource   Chrome 25   Safari 6   Android Safari/534   iPhone Safari/7534   Firefox 19   IE 8,10   Opera 12
markup            image 1    Y           Y          Y                    Y                    Y            Y         Y
markup            script 1   Y           Y          Y                    Y                    Y            Y         Y
dynamic           image 2    Y           Y          Y                    Y                    Y            Y         Y
dynamic           script 2   Y           Y          Y                    Y                    Y            Y         Y
onload            image 3    –           –          –                    –                    Y            Y         Y
onload            script 3   –           –          –                    –                    –            Y         Y
1ms postonload    image 4    –           –          –                    –                    –            –         Y
1ms postonload    script 4   –           –          –                    –                    –            –         Y
5sec postonload   image 5    –           –          –                    –                    –            –         –
5sec postonload   script 5   –           –          –                    –                    –            –         –

The results for Chrome, Safari, Android mobile Safari, and iPhone mobile Safari are the same. When you click Reload in these browsers the resources in the page get refetched (resources 1&2), but not so for the resources loaded in the onload handler and later (resources 3-5).

Firefox is interesting. It loads the four resources in the page plus the onload handler’s image (image 3), but not the onload handler’s script (script 3). Curious.

IE 8 and 10 are the same: they load the four resources in the page as well as the image & script from the onload handler (resources 1-3). I didn’t test IE 9 but I assume it’s the same.

Opera has the best results in my opinion. It refetches all of the resources in the main page, the onload handler, and 1 millisecond after onload (resources 1-4), but it does not refetch the resources 5 seconds after onload (image 5 & script 5). I poked at this a bit. If I raise the delay from 1 millisecond to 50 milliseconds, then image 4 & script 4 are not refetched. I think this is a race condition where if Opera is still downloading resources from the onload handler when these first delayed resources are created, then they are also refetched. To further verify this I raised the delay to 500 milliseconds and confirmed the resources were not refetched, but then increased the response time of all the resources to 1 second (instead of instantaneous) and this caused image 4 & script 4 to be refetched, even though the delay was 500 milliseconds after onload.

Note that pressing shift+Reload (and other combinations) didn’t alter the results.

Takeaways

A bit esoteric? Perhaps. This is a deep dive on a niche issue, I’ll grant you that. But I have a few buts:

If you’re a web developer using far future expiration dates and lazy loading, you might get unexpected results when you change a resource and hit Reload, and even shift+Reload. If you’re not getting the latest version of your dev resources you might have to clear your cache.

This isn’t just an issue for web devs. It affects users as well. Numerous sites lazy-load resources with far future expiration dates including 8 of the top 10 sites: Google, YouTube, Yahoo, Microsoft Live, Tencent QQ, Amazon, and Twitter. If you Reload any of these sites with a packet sniffer open in the first four browsers listed, you’ll see a curious pattern: cacheable resources loaded before onload have a 304 response status, while those after onload are read from cache and don’t get refetched. The only way to ensure you get a fresh version is to clear your cache, defeating the expected benefit of the Reload button.

Here’s a waterfall showing the requests when Amazon is reloaded in Chrome. The red vertical line marks the onload event. Notice how the resources before onload have 304 status codes. Right after the onload are some image beacons that aren’t cacheable, so they get refetched and return 200 status codes. The cacheable images loaded after onload are all read from cache, so any updates to those resources are missed.

Finally, whenever behavior varies across browsers it’s usually worthwhile to investigate why. Often one behavior is preferred over another, and we should get the specs and vendors aligned in that direction. In this case, we should make Reload more consistent and have it refetch resources, even those loaded dynamically in the onload handler.


Clearing Browser Data

September 10, 2012 7:50 pm | 13 Comments

In Keys to a Fast Web App I listed caching as a big performance win. I’m going to focus on caching for the next few months. The first step is a study I launched a few days ago called the Clear Browser Experiment. Before trying to measure the frequency and benefits of caching, I wanted to start by gauging what happens when users clear their cache. In addition to the browser disk cache, I opened this up to include other kinds of persistent data: cookies, localStorage, and application cache. (I didn’t include indexedDB because it’s less prevalent.)

Experiment Setup

The test starts by going to the Save Stuff page. That page does four things:

  • Sets a persistent cookie called “cachetest”. The cookie is created using JavaScript.
  • Tests window.localStorage to see if localStorage is supported. If so, saves a value to localStorage with the key “cachetest”.
  • Tests window.applicationCache to see if appcache is supported. If so, loads an iframe that uses appcache. The iframe’s manifest file has a fallback section containing “iframe404.js iframe.js”. Since iframe404.js doesn’t exist, appcache should instead load iframe.js which defines the global variable iframejsNow.
  • Loads an image that takes 5 seconds to return. The image is cacheable for 30 days.
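A rough sketch of how such a page might save this data follows – simplified and with made-up file names, not the actual test code. It assumes the script block sits in the BODY:

<script>
// Set a persistent cookie named "cachetest" (expires in 30 days).
var expires = new Date(Date.now() + 30 * 24 * 60 * 60 * 1000).toUTCString();
document.cookie = "cachetest=1; expires=" + expires + "; path=/";

// Save a value to localStorage if it's supported.
if (window.localStorage) {
  localStorage.setItem("cachetest", "1");
}

// Load an iframe that uses appcache if it's supported.
if (window.applicationCache) {
  var iframe = document.createElement("iframe");
  iframe.src = "appcache-iframe.php";  // hypothetical page with a manifest attribute
  document.body.appendChild(iframe);
}
</script>

<!-- An image that takes 5 seconds to return, cacheable for 30 days. -->
<img src="slow-image.php">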

After this data is saved to the browser, the user is prompted to clear their browser and proceed to the Check Cache page. This page checks to see if the previous items still exist:

  • Looks for the cookie called “cachetest”.
  • Checks localStorage for the key “cachetest”.
  • Loads the iframe again. The iframe’s onload handler checks if iframejsNow is defined in which case appcache was not cleared.
  • Loads the same 5-second image again. The image’s onload handler checks if it takes more than 5 seconds to return, in which case disk cache was cleared.
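The checks on this page might be sketched like this – again simplified and hypothetical:

<script>
// Cookie: look for "cachetest" in document.cookie.
var cookieSurvived = /(^|; )cachetest=/.test(document.cookie);

// localStorage: see if the key is still there.
var lsSurvived = window.localStorage && localStorage.getItem("cachetest") !== null;

// Disk cache: reload the 5-second image and time it. A fast response means it
// came from cache; ~5 seconds means the cache was cleared.
var start = Date.now();
var img = new Image();
img.onload = function() {
  var cacheCleared = (Date.now() - start) > 4000;
  // ...report cookieSurvived, lsSurvived, and cacheCleared to Browserscope
};
img.src = "slow-image.php";
</script>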

My Results

I created a Browserscope user test to store the results. (If you haven’t used this feature you should definitely check it out. Jason Grigsby is glad he did.) This test is different from my other tests because it requires the user to do an action. Because of this I ran the test on various browsers to create a “ground truth” table of results. Green and “1” indicate the data was cleared successfully while red and “0” indicate it wasn’t. Blank means the feature wasn’t supported.

The results vary but are actually more consistent than I expected. Some observations:

  • Chrome 21 doesn’t clear localStorage. This is perhaps an aberrant result due to the structure of the test: Chrome 21 does clear localStorage, but it doesn’t clear it from memory in the current tab. If you switch tabs or restart Chrome the result is cleared. Nevertheless, it would be better to clear it from memory as well. The Chrome team has already fixed this bug as evidenced by the crowdsourced results for Chrome 23.0.1259 and later.
  • Firefox 3.6 doesn’t clear disk cache. The disk cache issue is similar to Chrome 21’s story: the image is cleared from disk cache, but not from memory cache. Ideally both would be cleared and the Firefox team fixed this bug back in 2010.
  • IE 6-7 don’t support appcache nor localStorage.
  • IE 8-9 don’t support appcache.
  • Firefox 3.6, IE 8-9, and Safari 5.0.5 don’t clear localStorage. My hypothesis for this result is that there is no UI attached to localStorage. See the following section on Browser UIs for screenshots from these browsers.

Browser UIs

Before looking at the crowdsourced results, it’s important to see what kind of challenges the browser UI presents to clearing data. This is also a handy, although lengthy, guide to clearing data for various browsers. (These screenshots and menu selections were gathered from both Mac and Windows so you might see something different.)

Chrome

Clicking on the wrench icon -> History -> Clear all browsing data… in Chrome 21 displays this dialog. Checking “Empty the cache” clears the disk cache, and “Delete cookies and other site and plug-in data” clears cookies, localStorage, and appcache.

Firefox

To clear Firefox 3.6 click on Tools -> Clear Recent History… and check Cookies and Cache. To be extra aggressive I also checked Site Preferences, but none of these checkboxes cause localStorage to be cleared.

Firefox 12 fixed this issue by adding a checkbox for Offline Website Data. Firefox 15 has the same choices. As a result localStorage is cleared successfully.

Internet Explorer

It’s a bit more work to clear IE 6. Clicking on Tools -> Internet Options… -> General presents two buttons: Delete Cookies… and Delete Files… Both need to be selected and confirmed to clear the browser. There’s an option to “Delete all offline content” but since appcache and localStorage aren’t supported this isn’t relevant to this experiment.

Clearing IE 7 is done by going to Tools -> Delete Browsing History… There are still separate buttons for deleting files and deleting cookies, but there’s also a “Delete all…” button to accomplish everything in one action.

The clearing UI for IE 8 is reached via Tools -> Internet Options -> General -> Delete… It has one additional checkbox for clearing data compared to IE 7: “InPrivate Filtering data”. (It also has “Preserve Favorites website data” which I’ll discuss in a future post.) LocalStorage is supported in IE 8, but according to the results it’s not cleared and now we see a likely explanation: there’s no checkbox for explicitly clearing offline web data (as there is in Firefox 12+).

IE 9’s clearing UI changes once again. “Download History” is added, and “InPrivate Filtering data” is replaced with “ActiveX Filtering and Tracking Protection data”. Similar to IE 8, localStorage is supported in IE 9, but according to the results it’s not cleared and the likely explanation is that there is not a checkbox for explicitly clearing offline web data (as there is in Firefox 12+).

iPhone

The iPhone (not surprisingly) has the simplest UI for clearing browser data. Going through Settings -> Safari we find a single button: “Clear Cookies and Data”. Test results show that this clears cookies, localStorage, appcache, and disk cache. It’s hard to run my test on the iPhone because you have to leave the browser to get to Settings, so when you return to the browser the test page has been cleared. I solve this by typing in the URL for the next page in the test: https://stevesouders.com/tests/clearbrowser/check.php.

Opera

Opera has the most granularity in what to delete. In Opera 12 going to Tools -> Delete Private Data… displays a dialog box. The key checkboxes for this test are “Delete all cookies”, “Delete entire cache”, and “Delete persistent storage”. There’s a lot to choose from but it’s all in one dialog.

Safari

In Safari 5.0.5 going to gear icon -> Reset Safari… (on Windows) displays a dialog box with many choices. None of them address “offline data” explicitly which is likely why the results show that localStorage is not cleared. (“Clear history” has no effect – I just tried it to see if it would clear localStorage.)

Safari 5.1.7 was confusing for me. At first I chose Safari -> Empty cache… (on Mac) but realized this only affected disk cache. I also saw Safari -> Reset Safari… but this only had “Remove all website data” which seemed too vague and broad. I went searching for more clearing options and found Safari -> Preferences… -> Privacy with a button to “Remove All Website Data…” but this also had no granularity of what to clear. This one button did successfully clear cookies, localStorage, appcache, and disk cache.

Crowdsourced Results

The beauty of Browserscope user tests is that you can crowdsource results. This allows you to gather results for browsers and devices that you don’t have access to. The crowdsourced results for this experiment include ~100 different browsers including webOS, Blackberry, and RockMelt. I extracted results for the same major browsers shown in the previous table.

In writing up the experiment I intentionally used the generic wording “clear browser” in an attempt to see what users would select without prompting. The crowdsourced results match my personal results fairly closely. Since this test required users to take an action without a way to confirm that they actually did try to clear their cache, I extended the table to show the percentage of tests that reflect the majority result.

One difference between my results and the crowdsourced results is iPhone – I think this is due to the fact that clearing the cache requires leaving the browser thus disrupting the experiment. The other main difference is Safari 5. The sample size is small so we shouldn’t draw strong conclusions. It’s possible that having multiple paths for clearing data caused confusion.

The complexity and lack of consistency in UIs for clearing data could be a cause for these less-than-unanimous results. Chrome 21 and Firefox 15 both have a fair number of tests (154 and 46), and yet some of the winning results are only seen 50% or 68% of the time. Firefox might be a special case because it prompts before saving information to localStorage. And this test might be affected by private browsing such as Chrome’s incognito mode. Finally, it’s possible test takers didn’t want to clear their cache but still saved their results to Browserscope.

There are many possible explanations for the varied crowdsourced results, but reviewing the Browser UIs section shows that knowing how to clear browser data varies significantly across browsers, and often changes from one version of a browser to the next. There’s a need for more consistent UI in this regard. What would a consistent UI look like? The simple iPhone UI (a single button) makes it easier to clear everything, but is that what users want and need? I often want to clear my disk cache, but less frequently choose to clear all my cookies. At a minimum, users need a way to clear all of their browser data in every browser, which currently isn’t possible.

 


Cache compressed? or uncompressed?

March 27, 2012 4:05 pm | 10 Comments

My previous blog post, Cache them if you can, suggests that current cache sizes are too small – especially on mobile.

Given this concern about cache size a relevant question is:

If a response is compressed, does the browser save it compressed or uncompressed?

Compression typically reduces responses by 70%. This means that a browser can cache 3x as many compressed responses if they’re saved in their compressed format.

Note that not all responses are compressed. Images make up the largest number of resources but shouldn’t be compressed. On the other hand, HTML documents, scripts, and stylesheets should be compressed and account for 30% of all requests. Being able to save 3x as many of these responses to cache could have a significant impact on cache hit rates.

It’s difficult and time-consuming to determine whether compressed responses are saved in compressed format. I created this Caching Gzip Test page to help determine browser behavior. It has two 200 KB scripts – one is compressed down to ~148 KB and the other is uncompressed. (Note that this file is random strings so the compression savings is only 25% as compared to the typical 70%.) After clearing the cache and loading the test page if the total cache disk size increases ~348 KB it means the browser saves compressed responses as compressed. If the total cache disk size increases ~400 KB it means compressed responses are saved uncompressed.

The challenging part of this experiment is finding where the cache is stored and measuring the response sizes. Firefox, Chrome, and Opera save responses as files and were easy to measure. For IE on Windows I wasn’t able to access the individual cache files (admin permissions?) but was able to measure the sizes based on the properties of the Temporary Internet Files folder. Safari saves all responses in Cache.db. I was able to see the incremental increase by modifying the experiment to be two pages: the compressed response and the uncompressed response. You can see the cache file locations and full details in the Caching Gzip Test Results page.

Here are the results for top desktop browsers:

Browser        Compressed responses cached compressed?   max cache size
Chrome 17      yes                                        320 MB*
Firefox 11     yes                                        850 MB*
IE 8           no                                         50 MB
IE 9           no                                         250 MB
Safari 5.1.2   no                                         unknown
Opera 11       yes                                        20 MB

* Chrome and Firefox cache size is a percentage of available disk space. Chrome is capped at 320 MB. I don’t know what Firefox’s cap is; on my laptop with 50 GB free the cache size is 830 MB.

We see that Chrome 17, Firefox 11, and Opera 11 store compressed responses in compressed format, while IE 8&9 and Safari 5 save them uncompressed. IE 8&9 have smaller cache sizes, so the fact that they uncompress responses before caching further reduces the number of responses that can be cached.

What’s the best choice? It’s possible that reading cached responses is faster if they’re already uncompressed. That would be a good next step to explore. I wouldn’t prejudge IE’s choice when it comes to performance on Windows. But it’s clear that saving compressed responses in compressed format increases the number of responses that can be cached, and this increases cache hit rates. What’s even clearer is that browsers don’t agree on the best answer. Should they?

 


Cache them if you can

March 22, 2012 10:41 pm | 24 Comments

“The fastest HTTP request is the one not made.”

I always smile when I hear a web performance speaker say this. I forget who said it first, but I’ve heard it numerous times at conferences and meetups over the past few years. It’s true! Caching is critical for making web pages faster. I’ve written extensively about caching.

Things are getting better – but not quickly enough. The chart below from the HTTP Archive shows that the percentage of resources that are cacheable has increased 10% during the past year (from 42% to 46%). Over that same time the number of requests per page has increased 12% and total transfer size has increased 24% (chart).

Perhaps it’s hard to make progress on caching because the problem doesn’t belong to a single group – responsibility spans website owners, third party content providers, and browser developers. One thing is certain – we have to do a better job when it comes to caching. 

I’ve gathered some compelling statistics over the past few weeks that illuminate problems with caching and point to some next steps. Here are the highlights:

  • 55% of resources don’t specify a max-age value
  • 46% of the resources without any max-age remained unchanged over a 2 week period
  • some of the most popular resources on the Web are only cacheable for an hour or two
  • 40-60% of daily users to your site don’t have your resources in their cache
  • 30% of users have a full cache
  • for users with a full cache, the median time to fill their cache is 4 hours of active browsing

Read on to understand the full story.

My kingdom for a max-age header

Many of the caching articles I’ve written address issues such as size & space limitations, bugs with less common HTTP headers, and outdated purging logic. These are critical areas to focus on. But the basic function of caching hinges on websites specifying caching headers for their resources. This is typically done using max-age in the Cache-Control response header. This example specifies that a response can be read from cache for 1 year:

Cache-Control: max-age=31536000

Since you’re reading this blog post you probably already use max-age, but the following chart from the HTTP Archive shows that 55% of resources don’t specify a max-age value. This translates to 45 of the average website’s 81 resources needing a HTTP request even for repeat visits.

Missing max-age != dynamic

Why do 55% of resources have no caching information? Having looked at caching headers across thousands of websites my first guess is lack of awareness – many website owners simply don’t know about the benefits of caching. An alternative explanation might be that many resources are dynamic (JSON, ads, beacons, etc.) and shouldn’t be cached. Which is the bigger cause – lack of awareness or dynamic resources? Luckily we can quantify the dynamicness of these uncacheable resources using data from the HTTP Archive.

The HTTP Archive analyzes the world’s top ~50K web pages on the 1st and 15th of the month and records the HTTP headers for every resource. Using this history it’s possible to go back in time and quantify how many of today’s resources without any max-age value were identical in previous crawls. The data for the chart above (showing 55% of resources with no max-age) was gathered on Feb 15 2012. The chart below shows the percentage of those uncacheable resources that were identical in the previous crawl on Feb 1 2012. We can go back even further and see how many were identical in both the Feb 1 2012 and the Jan 15 2012 crawls. (The HTTP Archive doesn’t save response bodies so the determination of “identical” is based on the resource having the exact same URL, Last-Modified, ETag, and Content-Length.)

46% of the resources without any max-age remained unchanged over a 2 week period. This works out to 21 resources per page that could have been read from cache without any HTTP request but weren’t. Over a 1 month period 38% are unchanged – 17 resources per page.

This is a significant missed opportunity. Here are some popular websites and the number of resources that were unchanged for 1 month but did not specify max-age:

Recalling that “the fastest HTTP request is the one not made”, this is a lot of unnecessary HTTP traffic. I can’t prove it, but I strongly believe this is not intentional – it’s just a lack of awareness. The chart below reinforces this belief – it shows the percentage of resources (both cacheable and uncacheable) that remain unchanged starting from Feb 15 2012 and going back for one year.

The percentage of resources that are unchanged is nearly the same when looking at all resources as it is for only uncacheable resources: 44% vs. 46% going back 2 weeks and 35% vs. 38% going back 1 month. Given this similarity in “dynamicness” it’s likely that the absence of max-age has nothing to do with the resources themselves and is instead caused by website owners overlooking this best practice.

3rd party content

If a website owner doesn’t make their resources cacheable, they’re just hurting themselves (and their users). But if a 3rd party content provider doesn’t have good caching behavior it impacts all the websites that embed that content. This is both bad and good. It’s bad in that one uncacheable 3rd party resource can impact multiple sites. The good part is that shifting 3rd party content to adopt good caching practices also has a magnified effect.

So how are we doing when it comes to caching 3rd party content? Below is a list of the top 30 most-used resources according to the HTTP Archive. These are the resources that were used the most across the world’s top 50K web pages. The max-age value (in hours) is also shown.

  1. http://www.google-analytics.com/ga.js (2 hours)
  2. http://ssl.gstatic.com/s2/oz/images/stars/po/Publisher/sprite2.png (8760 hours)
  3. http://pagead2.googlesyndication.com/pagead/js/r20120208/r20110914/show_ads_impl.js (336 hours)
  4. http://pagead2.googlesyndication.com/pagead/render_ads.js (336 hours)
  5. http://pagead2.googlesyndication.com/pagead/show_ads.js (1 hour)
  6. https://apis.google.com/_/apps-static/_/js/gapi/gcm_ppb,googleapis_client,plusone/[…] (720 hours)
  7. http://pagead2.googlesyndication.com/pagead/osd.js (24 hours)
  8. http://pagead2.googlesyndication.com/pagead/expansion_embed.js (24 hours)
  9. https://apis.google.com/js/plusone.js (1 hour)
  10. http://googleads.g.doubleclick.net/pagead/drt/s?safe=on (1 hour)
  11. http://static.ak.fbcdn.net/rsrc.php/v1/y7/r/ql9vukDCc4R.png (3825 hours)
  12. http://connect.facebook.net/rsrc.php/v1/yQ/r/f3KaqM7xIBg.swf (164 hours)
  13. https://ssl.gstatic.com/s2/oz/images/stars/po/Publisher/sprite2.png (8760 hours)
  14. https://apis.google.com/_/apps-static/_/js/gapi/googleapis_client,iframes_styles[…] (720 hours)
  15. http://static.ak.fbcdn.net/rsrc.php/v1/yv/r/ZSM9MGjuEiO.js (8742 hours)
  16. http://static.ak.fbcdn.net/rsrc.php/v1/yx/r/qP7Pvs6bhpP.js (8699 hours)
  17. https://plusone.google.com/_/apps-static/_/ss/plusone/[…] (720 hours)
  18. http://b.scorecardresearch.com/beacon.js (336 hours)
  19. http://static.ak.fbcdn.net/rsrc.php/v1/yx/r/lP_Rtwh3P-S.css (8710 hours)
  20. http://static.ak.fbcdn.net/rsrc.php/v1/yA/r/TSn6F7aukNQ.js (8760 hours)
  21. http://static.ak.fbcdn.net/rsrc.php/v1/yk/r/Wm4bpxemaRU.js (8702 hours)
  22. http://static.ak.fbcdn.net/rsrc.php/v1/yZ/r/TtnIy6IhDUq.js (8699 hours)
  23. http://static.ak.fbcdn.net/rsrc.php/v1/yy/r/0wf7ewMoKC2.css (8699 hours)
  24. http://static.ak.fbcdn.net/rsrc.php/v1/yO/r/H0ip1JFN_jB.js (8760 hours)
  25. http://platform.twitter.com/widgets/hub.1329256447.html (87659 hours)
  26. http://static.ak.fbcdn.net/rsrc.php/v1/yv/r/T9SYP2crSuG.png (8699 hours)
  27. http://platform.twitter.com/widgets.js (1 hour)
  28. https://plusone.google.com/_/apps-static/_/js/plusone/[…] (720 hours)
  29. http://pagead2.googlesyndication.com/pagead/js/graphics.js (24 hours)
  30. http://s0.2mdn.net/879366/flashwrite_1_2.js (720 hours)

There are some interesting patterns.

  • simple URLs have short cache times – Some resources have very short cache times, e.g., ga.js (1), show_ads.js (5), and twitter.com/widgets.js (27). Most of the URLs for these resources are very simple (no querystring or URL “fingerprints”) because these resource URLs are part of the snippet that website owners paste into their page. These “bootstrap” resources are given short cache times because there’s no way for the resource URL to be changed if there’s an emergency fix – instead the cached resource has to expire in order for the emergency update to be retrieved.
  • long URLs have long cache times – Many 3rd party “bootstrap” scripts dynamically load other resources. These code-generated URLs are typically long and complicated because they contain some unique fingerprinting, e.g., http://pagead2.googlesyndication.com/pagead/js/r20120208/r20110914/show_ads_impl.js (3) and http://platform.twitter.com/widgets/hub.1329256447.html (25). If there’s an emergency change to one of these resources, the fingerprint in the bootstrap script can be modified so that a new URL is requested. Therefore, these fingerprinted resources can have long cache times because there’s no need to rev them in the case of an emergency fix.
  • where’s Facebook’s like button? – Facebook’s like.php and likebox.php are also hugely popular but aren’t in this list because the URL contains a querystring that differs across every website. Those resources have an even more aggressive expiration policy compared to other bootstrap resources – they use no-cache, no-store, must-revalidate. Once the like[box] bootstrap resource is loaded, it loads the other required resources: lP_Rtwh3P-S.css (19), TSn6F7aukNQ.js (20), etc. Those resources have long URLs and long cache times because they’re generated by code, as explained in the previous bullet.
  • short caching resources are often async – The fact that bootstrap scripts have short cache times is good for getting emergency updates, but is bad for performance because they generate many Conditional GET requests on subsequent requests. We all know that scripts block pages from loading, so these Conditional GET requests can have a significant impact on the user experience. Luckily, some 3rd party content providers are aware of this and offer async snippets for loading these bootstrap scripts mitigating the impact of their short cache times. This is true for ga.js (1), plusone.js (9), twitter.com/widgets.js (27), and Facebook’s like[box].php.

These extremely popular 3rd party snippets are in pretty good shape, but as we get out of the top widgets we quickly find that these good caching patterns degrade. In addition, more 3rd party providers need to support async snippets.
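For reference, the async snippet pattern these providers use generally looks something like the following – a generic sketch with a made-up URL, not any provider’s actual snippet. Because the bootstrap script is injected asynchronously, the Conditional GET caused by its short cache time doesn’t block the page from rendering:

<script>
(function() {
  var s = document.createElement("script");
  s.src = "http://widgets.example.com/bootstrap.js";  // short max-age bootstrap script
  s.async = true;
  var first = document.getElementsByTagName("script")[0];
  first.parentNode.insertBefore(s, first);
})();
</script>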

Cache sizes are too small

In January 2007 Tenni Theurer and I ran an experiment at Yahoo! to estimate how many users had a primed cache. The methodology was to embed a transparent 1×1 image in the page with an expiration date in the past. If users had the expired image in their cache the browser would issue a Conditional GET request and receive a 304 response (primed cache). Otherwise they’d get a 200 response (empty cache). I was surprised to see that 40-60% of daily users to the site didn’t have the site’s resources in their cache and 20% of page views were done without the site’s resources in the cache.

Numerous factors contribute to this high rate of unique users missing the site’s resources in their cache, but I believe the primary reason is small cache sizes. Browsers have increased the size of their caches since this experiment was run, but not enough. It’s hard to test browser cache size. Blaze.io’s article Understanding Mobile Cache Sizes shows results from their testing. Here are the max cache sizes I found for browsers on my MacBook Air. (Some browsers set the cache size based on available disk space, so let me mention that my drive is 250 GB and has 54 GB available.) I did some testing and searching to find max cache sizes for my mobile devices and IE.

  • Chrome: 320 MB
  • Internet Explorer 9: 250 MB
  • Firefox 11: 830 MB (shown in about:cache)
  • Opera 11: 20 MB (shown in Preferences | Advanced | History)
  • iPhone 4, iOS 5.1: 30-35 MB (based on testing)
  • Galaxy Nexus: 18 MB (based on testing)

I’m surprised that Firefox 11 has such a large cache size – that’s getting close to what I want. All the others are (way) too small. 18-35 MB on my mobile devices?! I have seven movies on my iPhone – I’d gladly trade Iron Man 2 (1.82 GB) for more cache space.

Caching in the real world

In order to justify increasing browser cache sizes we need some statistics on how many real users overflow their cache. This topic came up at last month’s Velocity Summit where we had representatives from Chrome, Internet Explorer, Firefox, Opera, and Silk. (Safari was invited but didn’t show up.) Will Chan from the Chrome team (working on SPDY) followed up with this post on Chromium cache metrics from Windows Chrome. These are the most informative real user cache statistics I’ve ever seen. I strongly encourage you to read his article.

Some of the takeaways include:

  • ~30% of users have a full cache (capped at 320 MB)
  • for users with a full cache, the median time to fill their cache is 4 hours of active browsing (20 hours of clock time)
  • 7% of users clear their cache at least once per week
  • 19% of users experience “fatal cache corruption” at least once per week thus clearing their cache

The last stat about cache corruption is interesting – I appreciate the honesty. The IE 9 team experienced something similar. In IE 7&8 the cache was capped at 50 MB based on tests showing increasing the cache size didn’t improve the cache hit rate. They revisited this surprising result in IE9 and found that larger cache sizes actually did improve the cache hit rate:

In IE9, we took a much closer look at our cache behaviors to better understand our surprising finding that larger caches were rarely improving our hit rate. We found a number of functional problems related to what IE treats as cacheable and how the cache cleanup algorithm works. After fixing these issues, we found larger cache sizes were again resulting in better hit rates, and as a result, we’ve changed our default cache size algorithm to provide a larger default cache.

Will mentions that Chrome’s 320 MB cap should be revisited. 30% seems like a low percentage for full caches, but could be accounted for by users that aren’t very active and active users that only visit a small number of websites (for example, just Gmail and Facebook). If possible I’d like to see these full cache statistics correlated with activity. It’s likely that the users who account for the biggest percentage of web visits are more likely to have a full cache, and thus experience slower page load times.

Next steps

First, much of the data for this post came from the HTTP Archive, so I’d like to thank our sponsors: Google, Mozilla, New Relic, O’Reilly Media, Etsy, Strangeloop, dynaTrace Software, and Torbit.

The data presented here suggest a few areas to focus on:

Website owners need to increase their use of a Cache-Control max-age, and the max-age times need to be longer. 38% of resources were unchanged over a 1 month period, and yet only 11% of resources have a max-age value that high. Most resources, even if they change, can be refreshed by including a fingerprint in the URL specified in the HTML document. Only bootstrap scripts from 3rd parties should have short cache times (hours). Truly dynamic responses (JSON, etc.) should specify must-revalidate. A year from now rather than seeing 55% of resources without any max-age value we should see 55% cacheable for a month or more.
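For example, a fingerprinted URL in the HTML document lets the resource carry a far future max-age, because the URL itself changes whenever the content changes. The file name and hash below are hypothetical:

<!-- In the HTML document: -->
<script src="http://www.example.com/js/main.a1b2c3.js"></script>

<!-- Response headers for main.a1b2c3.js: -->
<!-- Cache-Control: max-age=31536000 -->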

3rd party content providers need wider adoption of the caching and async behavior shown by the top Google, Twitter, and Facebook snippets.

Browser developers stand to bring the biggest improvements to caching. Increasing cache sizes is a likely win, especially for mobile devices. Data correlating cache sizes and user activity is needed. More intelligence around purging algorithms, such as IE 9’s prioritization based on mime type, will help when the cache fills up. More focus on personalization (what are the sites I visit most often?) would also create a faster user experience when users go to their favorite websites.

It’s great that the number of resources with caching headers grew 10% over the last year, but that just isn’t enough progress. We should really expect to double the number of resources that can be read from cache over the coming year. Just think about all those HTTP requests that can be avoided!

 


Frontend SPOF

June 1, 2010 7:49 pm | 9 Comments

My evangelism of high performance web sites started off in the context of quality code and development best practices. It’s easy for a style of coding to permeate throughout a company. Developers switch teams. Code is copied and pasted (especially in the world of web development). If everyone is developing in a high performance way, that’s the style that will characterize how the company codes.

This argument of promoting development best practices gained traction in the engineering quarters of the companies I talked to, but performance improvements continued to get backburnered in favor of new features and content that appealed to the business side of the organization. Improving performance wasn’t considered as important as other changes. Everyone assumed users wanted new features and that’s what got the most attention.

It became clear to me that we needed to show a business case for web performance. That’s why the theme for Velocity 2009 was “the impact of performance on the bottom line”. Since then there have been numerous studies released that have shown that improving performance does improve the bottom line. As a result, I’m seeing the business side of many web companies becoming strong advocates for Web Performance Optimization.

But there are still occasions when I have a hard time convincing a team that focusing on web performance, specifically frontend performance, is important. Shaving off hundreds (or even thousands) of milliseconds just doesn’t seem worthwhile to them. That’s when I pull out the big guns and explain that loading scripts and stylesheets in the typical way creates a frontend single point of failure that can bring down the entire site.

Examples of Frontend SPOF

The thought that simply adding a script or stylesheet to your web page could make the entire site unavailable surprises many people. Rather than focusing on CSS mistakes and JavaScript errors, the key is to think about what happens when a resource request times out. With this clue, it’s easy to create a test case:

<html>
<head>
<script src="http://www.snippet.com/main.js" type="text/javascript">
  </script>
</head>
<body>
Here's my page!
</body>
</html>

This HTML page looks pretty normal, but if snippet.com is overloaded the entire page is blank waiting for main.js to return. This is true in all browsers.

Here are some examples of frontend single points of failure and the browsers they impact. You can click on the Frontend SPOF test links to see the actual test page.

Frontend SPOF test         | Chrome      | Firefox     | IE             | Opera       | Safari
External Script            | blank below | blank below | blank below    | blank below | blank below
Stylesheet                 | flash       | flash       | blank below    | flash       | blank below
inlined @font-face         | delayed     | flash       | flash          | flash       | delayed
Stylesheet with @font-face | delayed     | flash       | totally blank* | flash       | delayed
Script then @font-face     | delayed     | flash       | totally blank* | flash       | delayed

* Internet Explorer 9 does not display a blank page, but does “flash” the element.

The failure cases are the “totally blank” and “blank below” outcomes. Here are the four possible outcomes sorted from worst to best:

  • totally blank – Nothing in the page is rendered – the entire page is blank.
  • blank below – All the DOM elements below the resource in question are not rendered.
  • delayed – Text that uses the @font-face style is invisible until the font file arrives.
  • flash – DOM elements are rendered immediately, and then redrawn if necessary after the stylesheet or font has finished downloading.

Web Performance avoids SPOF

It turns out that there are web performance best practices that, in addition to making your pages faster, also avoid most of these frontend single points of failure. Let’s look at the tests one by one.

External Script 
All browsers block rendering of elements below an external script until the script arrives and is parsed and executed. Since many sites put scripts in the HEAD, this means the entire page is typically blank. That’s why I believe the most important web performance coding pattern for today’s web sites is to load JavaScript asynchronously (see the first sketch below). Not only does this improve performance, but it avoids making external scripts a possible SPOF. 
Stylesheet 
Browsers are split on how they handle stylesheets. Firefox and Opera charge ahead and render the page, and then flash the user if elements have to be redrawn because their styling changed. Chrome, Internet Explorer, and Safari delay rendering the page until the stylesheets have arrived. (Generally they only delay rendering elements below the stylesheet, but in some cases IE will delay rendering everything in the page.) If rendering is blocked and the stylesheet takes a long time to download, or times out, the user is left staring at a blank page. There’s not a lot of advice on loading stylesheets without blocking page rendering, primarily because it would introduce the flash of unstyled content.
inlined @font-face 
I’ve blogged before about the performance implications of using @font-face. When the @font-face style is declared in a STYLE block in the HTML document, the SPOF issues are dramatically reduced. Firefox, Internet Explorer, and Opera avoid making these custom font files a SPOF by rendering the affected text and then redrawing it after the font file arrives. Chrome and Safari don’t render the customized text at all until the font file arrives. These “delayed” cases could make the page hard to use in Chrome and Safari, but most sites only use custom fonts on a subset of the page.
Stylesheet with @font-face 
Inlining your @font-face style is the key to avoiding having font files be a single point of failure (see the second sketch below). If you inline your @font-face styles and the font file takes forever to return or times out, the worst case is the affected text is invisible in Chrome and Safari. But at least the rest of the page is visible, and everything is visible in Firefox, IE, and Opera. Moving the @font-face style to a stylesheet not only slows down your site (by requiring two sequential downloads to render text), but it also creates a special case in Internet Explorer 7 & 8 where the entire page is blocked from rendering. IE 6 is only slightly better – the elements below the stylesheet are blocked from rendering (but if your stylesheet is in the HEAD this is the same outcome).
Script then @font-face 
Inlining your @font-face style isn’t enough to avoid the entire page SPOF that occurs in IE. You also have to make sure the inline STYLE block isn’t preceded by a SCRIPT tag. Otherwise, your entire page is blank in IE waiting for the font file to arrive. If that file is slow to return, your users are left staring at a blank page.
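Here’s a minimal sketch of loading a script asynchronously with a dynamically created script element (the snippet.com URL is carried over from the test page above and is illustrative):

var script = document.createElement('script');
script.src = 'http://www.snippet.com/main.js';    // illustrative URL from the test page
script.async = true;                              // hint for browsers that support the attribute
document.getElementsByTagName('head')[0].appendChild(script);

Because the script is inserted dynamically, the browser keeps parsing and rendering the page; if snippet.com hangs, the rest of the page still shows up.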
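And here’s a sketch of the inlined @font-face pattern (the font name and file are illustrative). The STYLE block goes in the HTML document itself, before any SCRIPT tag, to avoid the IE blocking case described above:

<style>
@font-face {
  font-family: 'MyWebFont';              /* illustrative name */
  src: url('/fonts/mywebfont.woff');     /* illustrative file */
}
h1 { font-family: 'MyWebFont', sans-serif; }
</style>
<!-- any SCRIPT tags come after this inline STYLE block -->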

SPOF is bad

Five years ago most of the attention on web performance was focused on the backend. Since then we’ve learned that 80% of the time users wait for a web page to load is the responsibility of the frontend. I feel this same bias when it comes to identifying and guarding against single points of failure that can bring down a web site – the focus is on the backend and there’s not enough focus on the frontend. For larger web sites, the days of a single server, single router, single data center, and other backend SPOFs are way behind us. And yet, most major web sites include scripts and stylesheets in the typical way that creates a frontend SPOF. Even more worrisome – many of these scripts are from third parties for social widgets, web analytics, and ads.

Look at the scripts, stylesheets, and font files in your web page from a worst case scenario perspective. Ask yourself:

  • Is your web site’s availability dependent on these resources?
  • Is it possible that if one of these resources timed out, users would be blocked from seeing your site?
  • Are any of these single point of failure resources from a third party?
  • Would you rather embed resources in a way that avoids making them a frontend SPOF?

Make sure you’re aware of your frontend SPOFs, track their availability and latency closely, and embed them in your page in a non-blocking way whenever possible.

Update Oct 12: Pat Meenan created a blackhole server that you can use to detect frontend SPOF in webpages.
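For example, one way to see frontend SPOF first-hand is to point a blocking script at the blackhole server and load your page; the hostname below is the one Pat documented (the path is illustrative), so confirm it’s still running before relying on it:

<script src="http://blackhole.webpagetest.org/main.js"></script>

The request never returns, so browsers sit on a partially rendered (or blank) page until the request times out.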


Call to improve browser caching

April 26, 2010 9:14 pm | 38 Comments

Over Christmas break I wrote Santa my browser wishlist. There was one item I neglected to ask for: improvements to the browser disk cache.

In 2007 Tenni Theurer and I ran an experiment to measure browser cache stats from the server side. Tenni’s write up, Browser Cache Usage – Exposed, is the stuff of legend. There she reveals that while 80% of page views were done with a primed cache, 40-60% of unique users hit the site with an empty cache at least once per day. 40-60% seems high, but I’ve heard similar numbers from respected web devs at other major sites.

Why do so many users have an empty cache at least once per day?

I’ve been racking my brain for years trying to answer this question. Here are some answers I’ve come up with:

  • first time users – Yea, but not 40-60%.
  • cleared cache – It’s true: more and more people are likely using anti-virus software that clears the cache between browser sessions. And since we ran that experiment back in 2007 many browsers have added options for clearing the cache frequently (for example, Firefox’s privacy.clearOnShutdown.cache option). But again, this doesn’t account for the 40-60% number.
  • flawed experiment – It turns out there was a flaw in the experiment (browsers ignore caching headers when an image is in memory), but this would only affect the 80% number, not the 40-60% number. And I expect the impact on the 80% number is small, given the fact that other folks have gotten similar numbers. (In a future blog post I’ll share a new experiment design I’ve been working on.)
  • resources got evicted – hmmmmm

OK, let’s talk about eviction for a minute. The two biggest influencers for a resource getting evicted are the size of the cache and the eviction algorithm. It turns out, the amount of disk space used for caching hasn’t kept pace with the size of people’s drives and their use of the Web. Here are the default disk cache sizes for the major browsers:

  • Internet Explorer: 8-50 MB
  • Firefox: 50 MB
  • Safari: everything I found said there isn’t a max size setting (???)
  • Chrome: < 80 MB (varies depending on available disk space)
  • Opera: 20 MB

Those defaults are too small. My disk drive is 150 GB of which 120 GB is free. I’d gladly give up 5 GB or more to raise the odds of web pages loading faster.

Even with more disk space, the cache is eventually going to fill up. When that happens, cached resources need to be evicted to make room for the new ones. Here’s where eviction algorithms come into play. Most eviction algorithms are LRU-based – the resource that was least recently used is evicted. However, our knowledge of performance pain points has grown dramatically in the last few years. Translating this knowledge into eviction algorithm improvements makes sense. For example, we’re all aware how much costlier it is to download a script than an image. (Scripts block other downloads and rendering.) Scripts, therefore, should be given a higher priority when it comes to caching.
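To make the idea concrete, here’s a toy sketch in JavaScript (purely illustrative, not any browser’s actual eviction code) of weighting eviction by resource type so that scripts survive longer than images of the same age:

// Toy example only: not a real browser cache implementation.
var typeWeight = { script: 4, stylesheet: 3, font: 2, image: 1 };

function evictionScore(entry) {
  var age = Date.now() - entry.lastAccess;    // older entries are more evictable
  var weight = typeWeight[entry.type] || 1;   // costlier resource types are less evictable
  return age / weight;                        // highest score gets evicted first
}

function pickVictim(entries) {
  return entries.slice().sort(function (a, b) {
    return evictionScore(b) - evictionScore(a);
  })[0];
}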

It’s hard to get access to browser disk cache stats, so I’m asking people to discover their own settings and share them via the Browser Disk Cache Survey form. I included this in my talks at JSConf and jQueryConf. ~150 folks at those conferences filled out the form. The data shows that 55% of people surveyed have a cache that’s over 90% full. (Caveats: this is a small sample size and the data is self-reported.) It would be great if you would take time to fill out the form. I’ve also started writing instructions for finding your cache settings.

I’m optimistic about the potential speedup that could result from improving browser caching, and fortunately browser vendors seem receptive (for example, the recent Mozilla Caching Summit). I expect we’ll see better default cache sizes and eviction logic in the next major release of each browser. Until then, jack up your defaults as described in the instructions. And please add comments for any browsers I left out or got wrong. Thanks.


Browser Performance Wishlist

February 15, 2010 4:25 pm | 28 Comments

What are the most important changes browsers could make to improve performance?

This document is my answer to that question. This is mainly for browser developers, although web developers will want to track the adoption of these improvements.

Before digging into the list I wanted to mention two items that would actually be at the top of the list if it wasn’t for how new they are: SPDY and FRAG tag. Both of these require industry adoption and possible changes to specifications, so it’s too soon to put them on an implementation wishlist. I hope these ideas gain consensus soon and to facilitate that I describe them here.

SPDY
SPDY is a proposal from Google for making three major improvements to HTTP: compressed headers, multiplexed requests, and prioritized responses. Initial studies showed 25 top sites were loaded 55% faster. Server and client implementations are available, and other organizations and individuals have built their own. The protocol draft has been published for review.
FRAG tag
The idea behind this “document fragment” tag is that it be used to wrap 3rd party content – ads, widgets, and analytics. 3rd party content can have a severe impact on the containing page’s performance due to additional HTTP requests, scripts that block rendering and downloads, and added DOM nodes.

Many of these factors can be mitigated by putting the 3rd party content inside an iframe embedded in the top level HTML document. But iframes have constraints and drawbacks – they typically introduce another HTTP request for the iframe’s HTML document, not all 3rd party code snippets will work inside an iframe without changes (e.g., references to “document” in JavaScript might need to reference the parent document), and some snippets (expando ads, suggest) can’t float over the main page’s elements. Another path to mitigate these issues is to load the JavaScript asynchronously, but many of these widgets use document.write and so must be evaluated synchronously.

A compromise is to place 3rd party content in the top level HTML document wrapped in a FRAG block. This approach degrades nicely – older browsers would ignore the FRAG tag and handle these snippets the same way they do today. Newer browsers would parse the HTML in a separate document fragment. The FRAG content would not block the rendering of the top level document. Snippets containing document.write would work without blocking the top level document. This idea just started getting discussed in January 2010. Much more use case analysis and discussion is needed, culminating in a proposed specification. (Credit to Alex Russell for the idea and name.)
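Since FRAG is only an idea, any markup is hypothetical, but the proposal amounts to something like this (the ad URL is illustrative):

<frag>
  <script type="text/javascript">
    document.write('<script src="http://ads.example.com/ad.js"><\/script>');
  </script>
</frag>

An older browser ignores the unknown tag and behaves as it does today; a newer browser would parse the contents as a separate document fragment so the document.write doesn’t block the top level page.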

The List

The performance wishlist items are sorted highest priority first. The browser icons indicate which browsers need to implement that particular improvement.

download scripts without blocking
In older browsers, once a script started downloading, all subsequent downloads were blocked until the script returned. It’s critical that scripts be evaluated in the order specified, but they can be downloaded in parallel. This significantly improves page load times, especially for pages with multiple scripts. Newer browsers (IE8, Firefox 3.5+, Safari 4, Chrome 2+) incorporated this parallel script loading feature, but it doesn’t work as proactively as it could. Specifically:

  • IE8 – downloading scripts blocks image and iframe downloads
  • Firefox 3.6 – downloading scripts blocks iframe downloads
  • Safari 4 – downloading scripts blocks iframe downloads
  • Chrome 4 – downloading scripts blocks iframe downloads
  • Opera 10.10 – downloading scripts blocks all downloads

(test case, see the four “|| Script [Script|Stylesheet|Image|Iframe]” tests)

SCRIPT attributes
The HTML5 specification describes the ASYNC and DEFER attributes for the SCRIPT tag, but the implementation behavior is not specified. Here’s how the SCRIPT attributes should work (a markup sketch follows the list).

  • DEFER – The HTTP request for a SCRIPT with the DEFER attribute is not made until all other resources in the page on the same domain have already been sent. This is so that it doesn’t occupy one of the limited number of connections that are opened for a single server. Deferred scripts are downloaded in parallel, but are executed in the order they occur in the HTML document, regardless of what order the responses arrive in. The window’s onload event fires after all deferred scripts are downloaded and executed.
  • ASYNC – The HTTP request for a SCRIPT with the ASYNC attribute is made immediately. Async scripts are executed as soon as the response is received, regardless of the order they occur in the HTML document. The window’s onload event fires after all async scripts are downloaded and executed.
  • POSTONLOAD – This is a new attribute I’m proposing. Postonload scripts don’t start downloading until after the window’s onload event has fired. By default, postonload scripts are evaluated in the order they occur in the HTML document. POSTONLOAD and ASYNC can be used in combination to cause postonload scripts to be evaluated as soon as the response is received, regardless of the order they occur in the HTML document.
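A markup sketch of how these attributes would be used (a.js, b.js, and c.js are placeholder URLs; POSTONLOAD is only the attribute proposed above, not something browsers implement):

<script src="a.js" defer></script>               <!-- downloads in parallel, executes in document order -->
<script src="b.js" async></script>               <!-- executes as soon as the response arrives -->
<script src="c.js" postonload async></script>    <!-- proposed: not even requested until after onload -->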

resource packages
Each HTTP request has some overhead cost. Workarounds include concatenating scripts, concatenating stylesheets, and creating image sprites. But this still results in multiple HTTP requests. And sprites are especially difficult to create and maintain. Alexander Limi (Mozilla) has proposed using zip files to create resource packages. It’s a good idea because of its simplicity and graceful degradation.
border-radius
Creating rounded corners leads to code bloat and excessive HTTP requests. Border-radius reduces this to a simple CSS style. The only major browser that doesn’t support border-radius is IE. It has already been announced that IE9 will support border-radius, but I wanted to include it nevertheless.
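For comparison, the CSS is a few lines per element (the vendor-prefixed forms cover the browsers of this era; the 8px radius is illustrative):

.rounded {
  -moz-border-radius: 8px;      /* Firefox */
  -webkit-border-radius: 8px;   /* Safari, Chrome */
  border-radius: 8px;           /* CSS3 */
}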
cache redirects
Redirects are costly from a performance perspective, especially for users with high latency. Although the HTTP spec says 301 and 302 responses (with the proper HTTP headers) are cacheable, most browsers don’t support this (a cacheable redirect response is sketched below).

  • IE8 – doesn’t cache redirects for the main page and for resources
  • Safari 4 – doesn’t cache redirects for the main page
  • Opera 10.10 – doesn’t cache redirects for the main page

(test case)
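As a sketch, this is the kind of redirect response the spec allows browsers to cache (the URL and one-day max-age are illustrative):

HTTP/1.1 301 Moved Permanently
Location: http://www.example.com/new-page
Cache-Control: max-age=86400

With the redirect cached, a repeat visitor goes straight to the new URL without paying the extra round trip.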

link prefetch
To improve page load times, developers prefetch resources that are likely or certain to be used later in the user’s session. This typically involves writing JavaScript code that executes after the onload event. When prefetching scripts and stylesheets, an iframe must be used to avoid conflict with the JavaScript and CSS in the main page. Using an iframe makes this prefetching code more complex. A final burden is the processing required to parse prefetched scripts and stylesheets. The browser UI can freeze while prefetched scripts and stylesheets are parsed, even though this is unnecessary as they’re not going to be used in the current page. A simple alternative solution is to use LINK PREFETCH. Firefox is the only major browser that supports this feature (since 1.0). Wider support of LINK PREFETCH would give developers an easy way to accelerate their web pages. (test case)
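The markup is a single tag (the URL is illustrative):

<link rel="prefetch" href="http://www.example.com/js/next-page.js">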
Web Timing spec
In order for web developers to improve the performance of their web sites, they need to be able to measure that performance – specifically their page load times. There’s debate on the endpoint for measuring page load times (window onload event, first paint event, onDomReady), but most people agree that the starting point is when the web page is requested by the user. And yet, there is no reliable way for the owner of the web page to measure from this starting point. To address this, Google has submitted the Web Timing proposal draft for built-in browser support for measuring page load times.
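This proposal later became the Navigation Timing API; as a sketch using that interface, page load time can be measured from when the user requested the page rather than from a script-based start marker:

window.addEventListener('load', function () {
  var t = performance.timing;
  var pageLoadTime = t.loadEventStart - t.navigationStart;   // ms from the user's request to onload
  console.log('page load time: ' + pageLoadTime + 'ms');
}, false);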
remote JS debugging
Developers strive to make their web apps fast across all major browsers, but this requires installing and learning a different toolset for each browser. In order to get cross-browser web development tools, browsers need to support remote JavaScript debugging. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. Agreement on the preferred protocol and support in the major browsers would go a long way to getting faster web apps for all users, and reducing the work for developers to maintain cross-browser web app performance.
Web Sockets
HTML5 Web Sockets provide built-in support for two-way communications between the client and server. The communication channel is accessible via JavaScript. Web Sockets are superior to comet and Ajax, especially in their compatibility with proxies and firewalls, and provide a path for building web apps with a high degree of communication between the browser and server.
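A minimal sketch of the JavaScript API (the endpoint URL is illustrative):

var socket = new WebSocket('ws://www.example.com/updates');   // illustrative endpoint
socket.onopen = function () {
  socket.send('hello');                         // two-way: the client can push, too
};
socket.onmessage = function (event) {
  console.log('server pushed: ' + event.data);
};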
History
HTML5 specifies implementation for History.pushState and History.replaceState. With these, web developers can dynamically change the URL to reflect the web application state without having to perform a page transition. This is important for Web 2.0 applications that modify the state of the web page using Ajax. Being able to avoid fetching a new HTML document to reflect these application changes results in a faster user experience.
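A sketch of the API (the state object and URL are illustrative):

// Reflect an Ajax-driven state change in the URL without a page transition.
history.pushState({ photoId: 42 }, '', '/photos/42');

// Restore the earlier state when the user navigates back.
window.onpopstate = function (event) {
  console.log('restore view for state:', event.state);   // the app would re-render its view here
};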
anchor ping
The ping attribute for anchors provides a more performant way to track links. This is a controversial feature because of the association with “tracking” users. However, links are tracked today, it’s just done in a way that hurts the user experience. For example, redirects, synchronous XHR, and tight loops in unload handlers are some of the techniques used to ensure clicks are properly recorded. All of these create a slower user experience.
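The markup is just an extra attribute on the link (both URLs are illustrative):

<a href="http://www.example.com/article" ping="http://www.example.com/track">Read the article</a>

The browser sends the ping in the background, so the click isn’t delayed by tracking code.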
progressive XHR
The draft spec for XMLHttpRequest details how XHRs are to support progressive response handling. This is important for web apps that use data with varied response times as well as comet-style applications. (more information)
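A sketch of progressive response handling (the /stream URL is illustrative); in browsers that support streaming responses, readyState 3 fires repeatedly as data arrives:

var xhr = new XMLHttpRequest();
var seen = 0;
xhr.open('GET', '/stream', true);                    // illustrative endpoint
xhr.onreadystatechange = function () {
  if (xhr.readyState === 3 || xhr.readyState === 4) {
    var chunk = xhr.responseText.substring(seen);    // just the newly arrived bytes
    seen = xhr.responseText.length;
    if (chunk) console.log('got chunk: ' + chunk);
  }
};
xhr.send();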
stylesheet & inline JS
When a stylesheet is followed by an inline script, resources that follow are blocked until the stylesheet is downloaded and the inline script is evaluated. Browsers should instead look ahead in their parsing and start downloading subsequent resources in parallel with the stylesheet. These resources of course would not be rendered, parsed, or evaluated until after the stylesheet was parsed and the inline script was evaluated. (test case, see “|| CSS + Inline Script”; looks like this just landed in Firefox 3.6!)
SCRIPT DEFER for inline scripts
The benefit of the SCRIPT DEFER attribute for external scripts is discussed above. But DEFER is also useful for inline scripts that can be executed after the page has been parsed. Currently, IE8 supports this behavior. (test case)
@import improvements
@import is a popular alternative to the LINK tag for loading stylesheets, but it has several performance problems in IE:

  • LINK @import – If the first stylesheet is loaded using LINK and the second one uses @import, they are loaded sequentially instead of in parallel. (test case)
  • LINK blocks @import – If the first stylesheet is loaded using LINK, and the second stylesheet is loaded using LINK that contains @import, that @import stylesheet is blocked from downloading until the first stylesheet response is received. It would be better to start downloading the @import stylesheet immediately. (test case)
  • many @imports – Using @import can change the download sequence of resources. In this test case, multiple stylesheets loaded with @import are followed by a script. Even though the script is listed last in the HTML document, it gets downloaded first. If the script takes a long time to download, it can cause the stylesheet downloads to be delayed, which can cause rendering to be delayed. It would be better to follow the order specified in the HTML document. (test case)

(more information)

@font-face improvements
In IE8, if a script occurs before a style that uses @font-face, the page is blocked from rendering until the font file is done downloading. It would be better to render the rest of the page without waiting for the font file. (test case, blog post)
stylesheets & iframes
When an iframe is preceded by an external stylesheet, it blocks iframe downloads. In IE, the iframe is blocked from downloading until the stylesheet response is received. In Firefox, the iframe’s resources are blocked from downloading until the stylesheet response is received. There’s no dependency between the parent’s stylesheet and the iframe’s HTML document, so this blocking behavior should be removed. (test case)
paint events
As the amount of DOM elements and CSS grows, it’s becoming more important to be able to measure the performance of painting the page. Firefox 3.5 added the MozAfterPaint event which opened the door for add-ons like Firebug Paint Events (although early Firefox documentation noted that the “event might fire before the actual repainting happens”). Support for accurate paint events will allow developers to capture these metrics.
missing schema, double downloads
In IE7&8, if the “http:” schema is missing from a stylesheet’s URL, the stylesheet is downloaded twice. This makes the page render more slowly. Not including “http://” in URLs is not pervasive, but it’s getting more widely adopted because it reduces download size and resolves to “http://” or “https://” as appropriate. (test case)
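A schema-less (protocol-relative) URL looks like this (the domain is illustrative); the caveat above is that IE7&8 download a stylesheet referenced this way twice:

<link rel="stylesheet" href="//www.example.com/styles.css">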


5e speculative background images

February 12, 2010 6:09 pm | 13 Comments

This is the fifth of five quick posts about some browser quirks that have come up in the last few weeks.

Chrome and Safari start downloading background images before all styles are available. If a background image style gets overwritten this may cause wasteful downloads.

Background images are used everywhere: buttons, background wallpaper, rounded corners, etc. You specify a background image in CSS like so:

.bgimage { background-image: url("/images/button1.gif"); }

Downloading resources is an area for optimizing performance, so it’s important to understand what causes CSS background images to get downloaded. See if you can answer the following questions about button1.gif:

  1. Suppose no elements in the page use the class “bgimage”. Is button1.gif downloaded?
  2. Suppose an element in the page has the class “bgimage” but also has “display: none” or “visibility: hidden”. Is button1.gif downloaded?
  3. Suppose later in the page a stylesheet gets downloaded and redefines the “bgimage” class like this:
    .bgimage { background-image: url("/images/button2.gif"); }

    Is button1.gif downloaded?

Ready?

The answer to question #1 is “no”. If no elements in the page use the rule, then the background image is not downloaded. This is true in all browsers that I’ve tested.

The answer to question #2 is “depends on the browser”. This might be surprising. Firefox 3.6 and Opera 10.10 do not download button1.gif, but the background image is downloaded in IE 8, Safari 4, and Chrome 4. I don’t have an explanation for this, but I do have a test page: hidden background images. If you have elements with background images that are hidden initially, you should hold off on creating them until after the visible content in the page is rendered.
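One way to hold off, as a sketch (the element and the bgimage class from above are illustrative), is to create the hidden element after onload so button1.gif doesn’t compete with the visible content:

window.onload = function () {
  var panel = document.createElement('div');
  panel.className = 'bgimage';           // the class defined earlier
  panel.style.display = 'none';          // still hidden until it's needed
  document.body.appendChild(panel);      // its background image now downloads after the visible content
};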

The answer to question #3 is “depends on the browser”. I find this to be the most interesting behavior to investigate. According to the cascading behavior of CSS, the latter definition of the “bgimage” class should cause the background-image style to use button2.gif. And in all the major browsers this is exactly what happens. But Safari 4 and Chrome 4 are a little more aggressive about fetching background images. They download button1.gif on the speculation that the background-image property won’t be overwritten, and then later download button2.gif when it is overwritten. Here’s the test page: speculative background images.

When my officemate, Steve Lamm, pointed out this behavior to me, my first reaction was “that’s wasteful!” I love prefetching, but I’m not a big fan of most prefetching implementations because they’re too aggressive – they err too far on the side of downloading resources that never get used. After my initial reaction, I thought about this some more. How frequently would this speculative background image downloading be wasteful? I went on a search and couldn’t find any popular web site that overwrote the background-image style. Not one. I’m not saying pages like this don’t exist, I’m just saying it’s very atypical.

On the other hand, this speculative downloading of background images can really help performance and the user’s perception of page speed. Many web sites have multiple stylesheets. If background images don’t start downloading until all stylesheets are done loading, the page takes longer to render. Safari and Chrome’s behavior of downloading a background image as soon as an element needs it, even if one or more stylesheets are still downloading, is a nice performance optimization.

That’s a nice way to finish the week. Next week: my Browser Performance Wishlist.



5b document.write scripts block in Firefox

February 10, 2010 5:58 pm | 9 Comments

This is the second of five quick posts about some browser quirks that have come up in the last few weeks.

Scripts loaded using document.write block other downloads in Firefox.

Unfortunately, document.write was invented. That problem was made a bzillion times worse when ads decided to use document.write to insert scripts into the content publisher’s page. It’s one line of code:

document.write('<script src="http://www.adnetwork.com/main.js"><\/script>');

Fortunately, most of today’s newer browsers load scripts in parallel, including scripts added via document.write. But a few weeks ago I noticed that Firefox 3.6 had some weird blocking behavior in a page with ads, and tracked it down to a script added using document.write.

The document.write scripts test page demonstrates the problem. It has four scripts. The first and second are inserted using document.write. The third and fourth are loaded the normal way (via HTML using SCRIPT SRC). All four scripts are configured to take 4 seconds to download. In IE8, Chrome 4, Safari 4, and Opera 10.10, the total page load time is ~4 seconds. All the scripts, even the ones inserted using document.write, are loaded in parallel. In Firefox, the total page load time is 12 seconds (tested on 2.0, 3.0, and 3.6). The first document.write script loads from 1-4 seconds, the second document.write script loads from 5-8 seconds, and the final two normal scripts are loaded in parallel from 9-12 seconds.

The issues with document.write are becoming better known. Some 3rd party code snippets (including Google Analytics) are switching away from document.write. But most 3rd party snippets still use document.write to insert their code into the publisher’s page. Here’s one more reason to avoid document.write.

