Call to improve browser caching
Over Christmas break I wrote Santa my browser wishlist. There was one item I neglected to ask for: improvements to the browser disk cache.
In 2007 Tenni Theurer and I ran an experiment to measure browser cache stats from the server side. Tenni’s write-up, Browser Cache Usage – Exposed, is the stuff of legend. There she reveals that while 80% of page views were done with a primed cache, 40-60% of unique users hit the site with an empty cache at least once per day. 40-60% seems high, but I’ve heard similar numbers from respected web devs at other major sites.
Why do so many users have an empty cache at least once per day?
I’ve been racking my brain for years trying to answer this question. Here are some answers I’ve come up with:
- first time users – Yeah, but not 40-60%.
- cleared cache – It’s true: more and more people are likely using anti-virus software that clears the cache between browser sessions. And since we ran that experiment back in 2007 many browsers have added options for clearing the cache frequently (for example, Firefox’s privacy.clearOnShutdown.cache option). But again, this doesn’t account for the 40-60% number.
- flawed experiment – It turns out there was a flaw in the experiment (browsers ignore caching headers when an image is in memory), but this would only affect the 80% number, not the 40-60% number. And I expect the impact on the 80% number is small, given the fact that other folks have gotten similar numbers. (In a future blog post I’ll share a new experiment design I’ve been working on.)
- resources got evicted – hmmmmm
OK, let’s talk about eviction for a minute. The two biggest influencers for a resource getting evicted are the size of the cache and the eviction algorithm. It turns out, the amount of disk space used for caching hasn’t kept pace with the size of people’s drives and their use of the Web. Here are the default disk cache sizes for the major browsers:
- Internet Explorer: 8-50 MB
- Firefox: 50 MB
- Safari: everything I found said there isn’t a max size setting (???)
- Chrome: < 80 MB (varies depending on available disk space)
- Opera: 20 MB
Those defaults are too small. My disk drive is 150 GB of which 120 GB is free. I’d gladly give up 5 GB or more to raise the odds of web pages loading faster.
Even with more disk space, the cache is eventually going to fill up. When that happens, cached resources need to be evicted to make room for the new ones. Here’s where eviction algorithms come into play. Most eviction algorithms are LRU-based – the resource that was least recently used is evicted. However, our knowledge of performance pain points has grown dramatically in the last few years. Translating this knowledge into eviction algorithm improvements makes sense. For example, we’re all aware how much costlier it is to download a script than an image. (Scripts block other downloads and rendering.) Scripts, therefore, should be given a higher priority when it comes to caching.
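To make the idea concrete, here is a minimal sketch of an eviction policy that still tracks recency but gives scripts a higher priority. It is not any browser's actual algorithm: the `WeightedLruCache` class, the `TYPE_WEIGHT` values, and the age-divided-by-weight score are all assumptions chosen for illustration.

```javascript
// Hypothetical weights: a higher weight means the type is costlier to
// re-download, so it should survive longer in the cache.
const TYPE_WEIGHT = { script: 3, stylesheet: 2, image: 1 };

class WeightedLruCache {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.usedBytes = 0;
    this.clock = 0; // logical time, bumped on every insert or hit
    this.entries = new Map(); // url -> { type, size, lastUsed }
  }

  // Record a use of a cached resource. Returns true on a cache hit.
  touch(url) {
    const entry = this.entries.get(url);
    if (!entry) return false;
    entry.lastUsed = ++this.clock;
    return true;
  }

  // Insert a resource, evicting others until it fits.
  put(url, type, size) {
    if (size > this.maxBytes) return false; // can never fit
    this.remove(url); // replace any older copy
    while (this.usedBytes + size > this.maxBytes) {
      this.remove(this.pickVictim());
    }
    this.entries.set(url, { type, size, lastUsed: ++this.clock });
    this.usedBytes += size;
    return true;
  }

  remove(url) {
    const entry = this.entries.get(url);
    if (!entry) return;
    this.usedBytes -= entry.size;
    this.entries.delete(url);
  }

  // Plain LRU evicts the entry with the largest age. Here the age is
  // divided by the type weight, so a script has to sit unused three times
  // as long as an image before it becomes the victim.
  pickVictim() {
    let victim = null;
    let worstScore = -Infinity;
    for (const [url, entry] of this.entries) {
      const weight = TYPE_WEIGHT[entry.type] || 1;
      const score = (this.clock - entry.lastUsed) / weight;
      if (score > worstScore) {
        worstScore = score;
        victim = url;
      }
    }
    return victim;
  }
}

const cache = new WeightedLruCache(100);
cache.put("a.js", "script", 40);
cache.put("b.gif", "image", 40);
cache.put("c.gif", "image", 10);
cache.touch("c.gif");
cache.put("d.css", "stylesheet", 40); // over budget, so something must go
console.log([...cache.entries.keys()]); // a.js, c.gif, d.css
```

In this run a.js is the least recently used entry, so plain LRU would have evicted it. With the weighting, the older image b.gif goes instead and the script stays cached.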
It’s hard to gather browser disk cache stats directly, so I’m asking people to discover their own settings and share them via the Browser Disk Cache Survey form. I included this in my talks at JSConf and jQueryConf. ~150 folks at those conferences filled out the form. The data shows that 55% of people surveyed have a cache that’s over 90% full. (Caveats: this is a small sample size and the data is self-reported.) It would be great if you would take time to fill out the form. I’ve also started writing instructions for finding your cache settings.
I’m optimistic about the potential speedup that could result from improving browser caching, and fortunately browser vendors seem receptive (for example, the recent Mozilla Caching Summit). I expect we’ll see better default cache sizes and eviction logic in the next major release of each browser. Until then, jack up your defaults as described in the instructions. And please add comments for any browsers I left out or got wrong. Thanks.
Browser Performance Wishlist
What are the most important changes browsers could make to improve performance?
This document is my answer to that question. This is mainly for browser developers, although web developers will want to track the adoption of these improvements.
- download scripts without blocking
- SCRIPT attributes
- resource packages
- border-radius
- cache redirects
- link prefetch
- Web Timing spec
- remote JS debugging
- Web Sockets
- History
- anchor ping
- progressive XHR
- stylesheet & inline JS
- SCRIPT DEFER for inline scripts
- @import improvements
- @font-face improvements
- stylesheets & iframes
- paint events
- missing schema, double downloads
Before digging into the list I wanted to mention two items that would actually be at the top of the list if it weren’t for how new they are: SPDY and the FRAG tag. Both of these require industry adoption and possible changes to specifications, so it’s too soon to put them on an implementation wishlist. I hope these ideas gain consensus soon, and to facilitate that I describe them here.
- SPDY
- SPDY is a proposal from Google for making three major improvements to HTTP: compressed headers, multiplexed requests, and prioritized responses. Initial studies showed 25 top sites were loaded 55% faster. Server and client implementations are available, and some other organizations and individuals have completed server and client implementations. The protocol draft has been published for review.
- FRAG tag
- The idea behind this “document fragment” tag is that it be used to wrap 3rd party content – ads, widgets, and analytics. 3rd party content can have a severe impact on the containing page’s performance due to additional HTTP requests, scripts that block rendering and downloads, and added DOM nodes. Many of these factors can be mitigated by putting the 3rd party content inside an iframe embedded in the top level HTML document. But iframes have constraints and drawbacks – they typically introduce another HTTP request for the iframe’s HTML document, not all 3rd party code snippets will work inside an iframe without changes (e.g., references to “document” in JavaScript might need to reference the parent document), and some snippets (expando ads, suggest) can’t float over the main page’s elements. Another path to mitigate these issues is to load the JavaScript asynchronously, but many of these widgets use document.write and so must be evaluated synchronously.
A compromise is to place 3rd party content in the top level HTML document wrapped in a FRAG block. This approach degrades nicely – older browsers would ignore the FRAG tag and handle these snippets the same way they do today. Newer browsers would parse the HTML in a separate document fragment. The FRAG content would not block the rendering of the top level document. Snippets containing document.write would work without blocking the top level document. This idea just started getting discussed in January 2010. Much more use case analysis and discussion is needed, culminating in a proposed specification. (Credit to Alex Russell for the idea and name.)
The List
The performance wishlist items are sorted highest priority first. The browser icons indicate which browsers need to implement that particular improvement.
- download scripts without blocking
- In older browsers, once a script started downloading all subsequent downloads were blocked until the script returned. It’s critical that scripts be evaluated in the order specified, but they can be downloaded in parallel. Downloading in parallel significantly improves page load times, especially for pages with multiple scripts. Newer browsers (IE8, Firefox 3.5+, Safari 4, Chrome 2+) incorporated this parallel script loading feature, but it doesn’t work as proactively as it could. Specifically:
- IE8 – downloading scripts blocks image and iframe downloads
- Firefox 3.6 – downloading scripts blocks iframe downloads
- Safari 4 – downloading scripts blocks iframe downloads
- Chrome 4 – downloading scripts blocks iframe downloads
- Opera 10.10 – downloading scripts blocks all downloads
(test case, see the four “|| Script [Script|Stylesheet|Image|Iframe]” tests)
- SCRIPT attributes
- The HTML5 specification describes the ASYNC and DEFER attributes for the SCRIPT tag, but the implementation behavior is not specified. Here’s how the SCRIPT attributes should work.
- DEFER – The HTTP request for a SCRIPT with the DEFER attribute is not made until the requests for all other resources in the page on the same domain have been sent. This is so that it doesn’t occupy one of the limited number of connections that are opened for a single server. Deferred scripts are downloaded in parallel, but are executed in the order they occur in the HTML document, regardless of what order the responses arrive in. The window’s onload event fires after all deferred scripts are downloaded and executed.
- ASYNC – The HTTP request for a SCRIPT with the ASYNC attribute is made immediately. Async scripts are executed as soon as the response is received, regardless of the order they occur in the HTML document. The window’s onload event fires after all async scripts are downloaded and executed.
- POSTONLOAD – This is a new attribute I’m proposing. Postonload scripts don’t start downloading until after the window’s onload event has fired. By default, postonload scripts are evaluated in the order they occur in the HTML document. POSTONLOAD and ASYNC can be used in combination to cause postonload scripts to be evaluated as soon as the response is received, regardless of the order they occur in the HTML document.
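The ordering rules proposed for these three attributes can be summarized in a small simulation. This is only a sketch: `executionPlan`, the `attr` field, and `arrivalMs` are hypothetical names, and the POSTONLOAD plus ASYNC combination is not modeled.

```javascript
// Each script has a source, one attribute, and the time its response arrives.
// The array order is the order the SCRIPT tags occur in the HTML document.
function executionPlan(scripts) {
  const withAttr = (attr) => scripts.filter((s) => s.attr === attr);
  return {
    // ASYNC: run as soon as the response arrives, so sort by arrival time.
    async: withAttr("async")
      .sort((a, b) => a.arrivalMs - b.arrivalMs)
      .map((s) => s.src),
    // DEFER: run in document order, whatever order the responses arrive in.
    defer: withAttr("defer").map((s) => s.src),
    // POSTONLOAD: fetched after onload, then run in document order by default.
    postonload: withAttr("postonload").map((s) => s.src),
  };
}

const plan = executionPlan([
  { src: "a.js", attr: "defer", arrivalMs: 300 },
  { src: "b.js", attr: "async", arrivalMs: 200 },
  { src: "c.js", attr: "defer", arrivalMs: 100 },
  { src: "d.js", attr: "async", arrivalMs: 50 },
  { src: "e.js", attr: "postonload", arrivalMs: 400 },
]);
console.log(plan.defer); // a.js before c.js, although c.js arrived first
console.log(plan.async); // d.js before b.js, because it arrived first
```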
- resource packages
- Each HTTP request has some overhead cost. Workarounds include concatenating scripts, concatenating stylesheets, and creating image sprites. But this still results in multiple HTTP requests. And sprites are especially difficult to create and maintain. Alexander Limi (Mozilla) has proposed using zip files to create resource packages. It’s a good idea because of its simplicity and graceful degradation.
- border-radius
- Creating rounded corners leads to code bloat and excessive HTTP requests. Border-radius reduces this to a simple CSS style. The only major browser that doesn’t support border-radius is IE. It has already been announced that IE9 will support border-radius, but I wanted to include it nevertheless.
- cache redirects
- Redirects are costly from a performance perspective, especially for users with high latency. Although the HTTP spec says 301 and 302 responses (with the proper HTTP headers) are cacheable, most browsers don’t support this.
- IE8 – doesn’t cache redirects for the main page and for resources
- Safari 4 – doesn’t cache redirects for the main page
- Opera 10.10 – doesn’t cache redirects for the main page
- link prefetch
- To improve page load times, developers prefetch resources that are likely or certain to be used later in the user’s session. This typically involves writing JavaScript code that executes after the onload event. When prefetching scripts and stylesheets, an iframe must be used to avoid conflict with the JavaScript and CSS in the main page. Using an iframe makes this prefetching code more complex. A final burden is the processing required to parse prefetched scripts and stylesheets. The browser UI can freeze while prefetched scripts and stylesheets are parsed, even though this is unnecessary as they’re not going to be used in the current page. A simple alternative solution is to use LINK PREFETCH. Firefox is the only major browser that supports this feature (since 1.0). Wider support of LINK PREFETCH would give developers an easy way to accelerate their web pages. (test case)
- Web Timing spec
- In order for web developers to improve the performance of their web sites, they need to be able to measure their performance – specifically their page load times. There’s debate on the endpoint for measuring page load times (window onload event, first paint event, onDomReady), but most people agree that the starting point is when the web page is requested by the user. And yet, there is no reliable way for the owner of the web page to measure from this starting point. Google has submitted the Web Timing proposal draft for browser builtin support for measuring page load times to address these issues.
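As a rough sketch of what built-in timing support makes possible, the function below computes load times from a timing record. The field names (`navigationStart`, `domContentLoadedEventStart`, `loadEventStart`) follow the `window.performance.timing` object as it was later standardized in the W3C Navigation Timing API; they are an assumption here and may not match the draft proposal. `pageLoadMetrics` is a hypothetical helper.

```javascript
// `timing` is any object shaped like window.performance.timing, with
// millisecond timestamps. navigationStart is the point at which the user
// requested the page, which is the starting point page owners could not
// measure from before.
function pageLoadMetrics(timing) {
  const start = timing.navigationStart;
  return {
    domReadyMs: timing.domContentLoadedEventStart - start,
    onloadMs: timing.loadEventStart - start,
  };
}

// Example with made-up timestamps. In a supporting browser one would pass
// window.performance.timing after the load event has fired.
const sampleTiming = {
  navigationStart: 1000,
  domContentLoadedEventStart: 1850,
  loadEventStart: 2400,
};
console.log(pageLoadMetrics(sampleTiming)); // domReadyMs: 850, onloadMs: 1400
```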
- remote JS debugging
- Developers strive to make their web apps fast across all major browsers, but this requires installing and learning a different toolset for each browser. In order to get cross-browser web development tools, browsers need to support remote JavaScript debugging. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. Agreement on the preferred protocol and support in the major browsers would go a long way to getting faster web apps for all users, and reducing the work for developers to maintain cross-browser web app performance.
- Web Sockets
- HTML5 Web Sockets provide built-in support for two-way communications between the client and server. The communication channel is accessible via JavaScript. Web Sockets are superior to comet and Ajax, especially in their compatibility with proxies and firewalls, and provide a path for building web apps with a high degree of communication between the browser and server.
- History
- HTML5 specifies implementation for History.pushState and History.replaceState. With these, web developers can dynamically change the URL to reflect the web application state without having to perform a page transition. This is important for Web 2.0 applications that modify the state of the web page using Ajax. Being able to avoid fetching a new HTML document to reflect these application changes results in a faster user experience.
- anchor ping
- The ping attribute for anchors provides a more performant way to track links. This is a controversial feature because of the association with “tracking” users. However, links are tracked today; it’s just done in a way that hurts the user experience. For example, redirects, synchronous XHR, and tight loops in unload handlers are some of the techniques used to ensure clicks are properly recorded. All of these create a slower user experience.
- progressive XHR
- The draft spec for XMLHttpRequest details how XHRs are to support progressive response handling. This is important for web apps that use data with varied response times as well as comet-style applications. (more information)
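Progressive handling usually means reading only the part of the response that has arrived since the last event. A minimal sketch of that bookkeeping follows, assuming the browser exposes the partial body through `responseText` while `readyState` is 3; `makeChunkReader` is a hypothetical helper, not part of the spec.

```javascript
// Returns a function that, given everything received so far, yields only
// the portion that has not been handed out yet.
function makeChunkReader() {
  let offset = 0;
  return function readNew(responseText) {
    const chunk = responseText.slice(offset);
    offset = responseText.length;
    return chunk;
  };
}

// In a browser this could be wired up roughly like so:
//   const readNew = makeChunkReader();
//   xhr.onreadystatechange = function () {
//     if (xhr.readyState === 3 || xhr.readyState === 4) {
//       handleChunk(readNew(xhr.responseText)); // handleChunk is app code
//     }
//   };

// Simulated here with strings standing in for a growing responseText.
const readNewChunk = makeChunkReader();
const firstChunk = readNewChunk("msg1;");
const secondChunk = readNewChunk("msg1;msg2;");
const thirdChunk = readNewChunk("msg1;msg2;"); // nothing new arrived
console.log([firstChunk, secondChunk, thirdChunk]);
```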
- stylesheet & inline JS
- When a stylesheet is followed by an inline script, resources that follow are blocked until the stylesheet is downloaded and the inline script is evaluated. Browsers should instead lookahead in their parsing and start downloading subsequent resources in parallel with the stylesheet. These resources of course would not be rendered, parsed, or evaluated until after the stylesheet was parsed and the inline script was evaluated. (test case see “|| CSS + Inline Script”; looks like this just landed in Firefox 3.6!)
- SCRIPT DEFER for inline scripts
- The benefit of the SCRIPT DEFER attribute for external scripts is discussed above. But DEFER is also useful for inline scripts that can be executed after the page has been parsed. Currently, IE8 supports this behavior. (test case)
- @import improvements
- @import is a popular alternative to the LINK tag for loading stylesheets, but it has several performance problems in IE:
- LINK @import – If the first stylesheet is loaded using LINK and the second one uses @import, they are loaded sequentially instead of in parallel. (test case)
- LINK blocks @import – If the first stylesheet is loaded using LINK, and the second stylesheet is loaded using LINK that contains @import, that @import stylesheet is blocked from downloading until the first stylesheet response is received. It would be better to start downloading the @import stylesheet immediately. (test case)
- many @imports – Using @import can change the download sequence of resources. In this test case, multiple stylesheets loaded with @import are followed by a script. Even though the script is listed last in the HTML document, it gets downloaded first. If the script takes a long time to download, it can cause the stylesheet downloads to be delayed, which can cause rendering to be delayed. It would be better to follow the order specified in the HTML document. (test case)
- @font-face improvements
- In IE8, if a script occurs before a style that uses @font-face, the page is blocked from rendering until the font file is done downloading. It would be better to render the rest of the page without waiting for the font file. (test case, blog post)
- stylesheets & iframes
- When an iframe is preceded by an external stylesheet, the stylesheet blocks iframe downloads. In IE, the iframe is blocked from downloading until the stylesheet response is received. In Firefox, the iframe’s resources are blocked from downloading until the stylesheet response is received. There’s no dependency between the parent’s stylesheet and the iframe’s HTML document, so this blocking behavior should be removed. (test case)
- paint events
- As the number of DOM elements and the amount of CSS grow, it’s becoming more important to be able to measure the performance of painting the page. Firefox 3.5 added the MozAfterPaint event which opened the door for add-ons like Firebug Paint Events (although early Firefox documentation noted that the “event might fire before the actual repainting happens”). Support for accurate paint events will allow developers to capture these metrics.
- missing schema, double downloads
- In IE7&8, if the “http:” schema is missing from a stylesheet’s URL, the stylesheet is downloaded twice. This makes the page render more slowly. Not including “http://” in URLs is not pervasive, but it’s getting more widely adopted because it reduces download size and resolves to “http://” or “https://” as appropriate. (test case)
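To show how a URL without “http:” picks up its scheme from the containing page, here is a small sketch using the standard `URL` constructor. The hostnames are placeholders and `resolveAgainstPage` is a hypothetical helper.

```javascript
// Resolve a resource URL the way a browser does: relative to the page URL.
// A URL starting with "//" keeps its own host but inherits the page's scheme.
function resolveAgainstPage(resourceUrl, pageUrl) {
  return new URL(resourceUrl, pageUrl).href;
}

const stylesheetUrl = "//static.example.com/main.css";
const onHttp = resolveAgainstPage(stylesheetUrl, "http://www.example.com/index.html");
const onHttps = resolveAgainstPage(stylesheetUrl, "https://www.example.com/index.html");
console.log(onHttp);  // http://static.example.com/main.css
console.log(onHttps); // https://static.example.com/main.css
```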
5e speculative background images
This is the fifth of five quick posts about some browser quirks that have come up in the last few weeks.
Chrome and Safari start downloading background images before all styles are available. If a background image style gets overwritten this may cause wasteful downloads.
Background images are used everywhere: buttons, background wallpaper, rounded corners, etc. You specify a background image in CSS like so:
.bgimage { background-image: url("/images/button1.gif"); }
Downloading resources is an area for optimizing performance, so it’s important to understand what causes CSS background images to get downloaded. See if you can answer the following questions about button1.gif:
- Suppose no elements in the page use the class “bgimage”. Is button1.gif downloaded?
- Suppose an element in the page has the class “bgimage” but also has “display: none” or “visibility: hidden”. Is button1.gif downloaded?
- Suppose later in the page a stylesheet gets downloaded and redefines the “bgimage” class like this:
.bgimage { background-image: url("/images/button2.gif"); }
Is button1.gif downloaded?
Ready?
The answer to question #1 is “no”. If no elements in the page use the rule, then the background image is not downloaded. This is true in all browsers that I’ve tested.
The answer to question #2 is “depends on the browser”. This might be surprising. Firefox 3.6 and Opera 10.10 do not download button1.gif, but the background image is downloaded in IE 8, Safari 4, and Chrome 4. I don’t have an explanation for this, but I do have a test page: hidden background images. If you have elements with background images that are hidden initially, you should hold off on creating them until after the visible content in the page is rendered.
The answer to question #3 is “depends on the browser”. I find this to be the most interesting behavior to investigate. According to the cascading behavior of CSS, the latter definition of the “bgimage” class should cause the background-image style to use button2.gif. And in all the major browsers this is exactly what happens. But Safari 4 and Chrome 4 are a little more aggressive about fetching background images. They download button1.gif on the speculation that the background-image property won’t be overwritten, and then later download button2.gif when it is overwritten. Here’s the test page: speculative background images.
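The difference between the two strategies can be shown with a toy model. This is not how any rendering engine is implemented: `imagesFetched` is a hypothetical function that takes the background-image URLs declared for one element, in the order the browser learns about them.

```javascript
// rules: background-image URLs for one element, in stylesheet arrival order.
// speculative: true models fetching as soon as a rule is seen, false models
// waiting until the cascade has settled on a winner.
function imagesFetched(rules, speculative) {
  if (rules.length === 0) return [];
  if (!speculative) {
    // Only the last declaration wins the cascade, so only it is fetched.
    return [rules[rules.length - 1]];
  }
  // Fetch every distinct image as its rule shows up.
  return rules.filter((url, i) => rules.indexOf(url) === i);
}

const bgRules = ["/images/button1.gif", "/images/button2.gif"];
console.log(imagesFetched(bgRules, true));  // both images are requested
console.log(imagesFetched(bgRules, false)); // only /images/button2.gif
```

The speculative strategy costs one wasted request in this example, but it can start fetching the first image before the second stylesheet has finished downloading.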
When my officemate, Steve Lamm, pointed out this behavior to me, my first reaction was “that’s wasteful!” I love prefetching, but I’m not a big fan of most prefetching implementations because they’re too aggressive – they err too far on the side of downloading resources that never get used. After my initial reaction, I thought about this some more. How frequently would this speculative background image downloading be wasteful? I went on a search and couldn’t find any popular web site that overwrote the background-image style. Not one. I’m not saying pages like this don’t exist, I’m just saying it’s very atypical.
On the other hand, this speculative downloading of background images can really help performance and the user’s perception of page speed. Many web sites have multiple stylesheets. If background images don’t start downloading until all stylesheets are done loading, the page takes longer to render. Safari and Chrome’s behavior of downloading a background image as soon as an element needs it, even if one or more stylesheets are still downloading, is a nice performance optimization.
That’s a nice way to finish the week. Next week: my Browser Performance Wishlist.
Browser script loading roundup
How are browsers doing when it comes to parallel script loading?
Back in the days of IE7 and Firefox 2.0, no browser loaded scripts in parallel with other resources. Instead, these older browsers would block all subsequent resource requests until the script was received, parsed, and executed. Here’s how the HTTP requests look when this blocking occurs in older browsers:

The test page that generated this waterfall chart has six HTTP requests:
- the HTML document
- the 1st script – 2 seconds to download, 2 seconds to execute
- the 2nd script – 2 seconds to download, 2 seconds to execute
- an image – 1 second to download
- a stylesheet – 1 second to download
- an iframe – 1 second to download
The figure above shows how the scripts block each other and block the image, stylesheet, and iframe, as well. The image, stylesheet, and iframe download in parallel with each other, but not until the scripts are finished downloading sequentially.
The likely reason scripts were downloaded sequentially in older browsers was to preserve execution order. This is critical when code in the 2nd script depends on symbols defined in the 1st script. Preserving execution order avoids undefined symbol errors. But the missed opportunity is obvious – while the browser is downloading the first script and guaranteeing to execute it first, it could be downloading the other four resources in parallel.
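The size of that missed opportunity can be estimated with a back-of-the-envelope model of the test page. The sketch below is a simplification: it ignores the HTML document itself, connection limits, and parsing time, and the function names are hypothetical.

```javascript
// Resource timings from the test page, in seconds.
const testScripts = [
  { download: 2, execute: 2 },
  { download: 2, execute: 2 },
];
const otherDownloads = [1, 1, 1]; // image, stylesheet, iframe

// Older browsers: each script is downloaded and executed before anything
// else is requested, then the remaining resources download in parallel.
function blockingLoadTime(scripts, others) {
  let t = 0;
  for (const s of scripts) t += s.download + s.execute;
  return t + Math.max(0, ...others);
}

// Ideal case: every download starts at time zero, but scripts still
// execute one at a time in document order.
function parallelLoadTime(scripts, others) {
  let engineFree = 0; // when the JS engine can start the next script
  for (const s of scripts) {
    const start = Math.max(s.download, engineFree);
    engineFree = start + s.execute;
  }
  return Math.max(engineFree, 0, ...others);
}

console.log(blockingLoadTime(testScripts, otherDownloads)); // 9
console.log(parallelLoadTime(testScripts, otherDownloads)); // 6
```

Under these assumptions, parallel downloading with ordered execution saves three of the nine seconds without changing the order in which the scripts run.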
Thankfully, newer browsers now load scripts in parallel!
This is a big win for today’s web apps that often contain 100K+ of JavaScript split across multiple files. Loading the same test page in IE8, Firefox 3.6, Chrome 4, and Safari 4 produces an HTTP waterfall chart like this:

Things look a lot better, but not as good as they should be. In this case, IE8 loads the two scripts and stylesheet in parallel, but the image and iframe are blocked. All of the newer browsers have similar limitations with regard to the extent to which they load scripts in parallel with other types of resources. This table from Browserscope shows where we are and the progress made to get to this point. The “Compare” button recently added to Browserscope made it easy to generate this historical view.
While downloading scripts, IE8 still blocks on images and iframes. Chrome 4, Firefox 3.6, and Safari 4 block on iframes. Opera 10.10 blocks on all resource types. I’m confident parallel script loading will continue to improve based on the great progress made in the last batch of browsers. Let’s keep our eyes on the next browsers to see if things improve even more.
Speed Tracer – visibility into the browser
Is it just me, or does anyone else think Google’s on fire lately, lighting up the world of web performance? Quick review of news from the past two weeks:
- timeline and heap profiler added to Chrome Dev Tools
- Google Analytics publishes async script loading pattern
- latency and Page Speed recommendations added to Webmaster Tools
- deep dive into what makes Chrome (and browsers in general) fast
- Google Public DNS launched
- and now… the release of Speed Tracer
Speed Tracer was my highlight from last night’s Google Campfire One. The event celebrated the release of GWT 2.0. Performance and “faster” were emphasized again and again throughout the evening’s presentations (I love that). GWT’s new code splitting capabilities are great for performance, but Speed Tracer easily wowed the audience – including me. In this post, I’ll describe what I like about Speed Tracer, what I hope to see added next, and then I’ll step back and talk about the state of performance profilers.
Getting started with Speed Tracer
Some quick notes about Speed Tracer:
- It’s a Chrome extension, so it only runs in Chrome. (Chrome extensions is yet another announcement this week.)
- It’s written in GWT 2.0.
- It works on all web sites, even sites that don’t use GWT.
The Speed Tracer getting started page provides the details for installation. You have to be on the Chrome dev channel. Installing Speed Tracer adds a green stopwatch to the toolbar. Clicking on the icon starts Speed Tracer in a separate Chrome window. As you surf sites in the original window, the performance information is shown in the Speed Tracer window.

Beautiful visibility
When it comes to optimizing performance, developers have long been working in the dark. Without the ability to measure JavaScript execution, page layout, reflows, and HTML parsing, it’s not possible to optimize the pain points of today’s web apps. Speed Tracer gives developers visibility into these parts of page loading via the Sluggishness view, as shown here. Not only is this kind of visibility great, but the display is just, well, beautiful. Good UI and dev tools don’t often intersect, but when they do it makes development that much easier and more enjoyable.
Speed Tracer also has a Network view, with the requisite waterfall chart of HTTP requests. Performance hints are built into the tool flagging issues such as bad cache headers, exceedingly long responses, Mozilla cache hash collision, too many reflows, and uncompressed responses. Speed Tracer also supports saving and reloading the profiled information. This is extremely useful when working on bugs or analyzing performance with other team members.
Feature requests
I’m definitely going to be using Speed Tracer. For a first version, it’s extremely feature rich and robust. There are a few enhancements that will make it even stronger:
- overall pie chart – The “breakdown by time” for phases like script evaluation and layout are available for segments within a page load. As a starting point, I’d like to see the breakdown for the entire page. When drilling down on a specific load segment, this detail is great. But having overall stats will give developers a clue where they should focus most of their attention.
- network timing – Similar to the issues I discovered in Firebug Net Panel, long-executing JavaScript in the main page blocks the network monitor from accurately measuring the duration of HTTP requests. This will likely require changes to WebKit to record event times in the events themselves, as was done in the fix for Firefox.
- .HAR support – Being able to save Speed Tracer’s data to file and share it is great. Recently, Firebug, HttpWatch, and DebugBar have all launched support for the HTTP Archive file format I helped create. The format is extensible, so I hope to see Speed Tracer support the .HAR file format soon. Being able to share performance information across tools and browsers is a necessary next step. That’s a good segue…
Developers need more
Three years ago, there was only one tool for profiling web pages: Firebug. Developers love working in Firefox, but sometimes you just have to profile in Internet Explorer. Luckily, over the last year we’ve seen some good profilers come out for IE including MSFast, AOL Pagetest, WebPagetest.org, and dynaTrace Ajax Edition. DynaTrace’s tool is the most recent addition, and has great visibility similar to Speed Tracer, as well as JavaScript debugging capabilities. There have been great enhancements to Web Inspector, and the Chrome team has built on top of that adding timeline and memory profiling to Chrome. And now Speed Tracer is out and bubbling to the top of the heap.
The obvious question is:
Which tool should a developer choose?
But the more important question is:
Why should a developer have to choose?
There are eight performance profilers listed here. None of them work in more than a single browser. I realize web developers are exceedingly intelligent and hardworking, but no one enjoys having to use two different tools for the same task. But that’s exactly what developers are being asked to do. To be a good developer, you have to be profiling your web site in multiple browsers. By definition, that means you have to install, learn, and update multiple tools. In addition, there are numerous quirks to keep in mind when going from one tool to another. And the features offered are not consistent across tools. It’s a real challenge to verify that your web app performs well across the major browsers. When pressed, rock star web developers I ask admit they only use one or two profilers – it’s just too hard to stay on top of a separate tool for each browser.
This week at Add-on-Con, Doug Crockford’s closing keynote is about the Future of the Web Browser. He’s assembled a panel of representatives from Chrome, Opera, Firefox, and IE. (Safari declined to attend.) My hope is they’ll discuss the need for a cross-browser extension model. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. My hope for 2010 is that we see cross-browser convergence on standards for extensions and remote debugging, so that developers will have a slightly easier path for ensuring their apps are high performance on all browsers.
(down)Loading JavaScript as strings
The Gmail mobile team and Charles Jolley from SproutCore have recently published some interesting techniques for loading JavaScript in a deferred manner. Anyone building performant web apps is familiar with the pain inflicted when loading JavaScript. These new techniques are great patterns. Let me expand on how they work and the context for using them. FYI – Charles is presenting this technique at tomorrow’s Velocity Online Conference. Check that out if you’re interested in finding out more and asking him questions.
When to defer JavaScript loading
I’ve spent much of the last two years researching and evangelizing techniques for loading scripts without blocking. These techniques address the situation where you need to load external scripts to render the initial page. But not all JavaScript is necessary for loading the initial page. Most Web 2.0 apps include JavaScript that’s only used later in the session, depending on what the user clicks on (dropdown menus, popup DIVs, Ajax actions, etc.). In fact, the Alexa top ten only use 25% of the downloaded JavaScript to load the initial page (see Splitting the Initial Payload).
The performance optimization resulting from this observation is clear – defer the loading of JavaScript that’s not part of initial page rendering. But how?
Deferred loading is certainly achievable using the non-blocking techniques I’ve researched – but my techniques might not be the best choice for this yet-to-be-used JavaScript code. Here’s why: Suppose you have 300K of JavaScript that can be deferred (it’s not used to render the initial page). When you load this script later using my techniques, the UI locks up while the browser parses and executes that 300K of code. We’ve all experienced this in certain web apps. After the web app initially loads, clicking on a link doesn’t do anything. In extreme situations, the browser’s tab icon stops animating. Not a great user experience.
If you’re certain that code is going to be used, then so be it – parse and execute the code when it’s downloaded using my techniques. But in many situations, the user may never exercise all of this deferred code. She might not click on any of the optional features, or she might only use a subset of them.
Is there a way to download this code in a deferred way, without locking up the browser UI?
Deferred loading without locking up the UI
I recently blogged about a great optimization used in mobile Gmail for loading JavaScript in a deferred manner: Mobile Gmail and async script loading. That team was acutely aware of how loading JavaScript in the background locked up mobile browsers. The technique they came up with was to wrap the JavaScript in comments. This allows the code to be downloaded, but avoids the CPU lockup for parsing and execution. Later, when the user clicks on a feature that needs code, a cool dynamic technique is used to extract the code from the comments and eval it.
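Here’s a minimal sketch of the comment-wrapping idea, not the Gmail team’s actual code. The payload string, the greet function, and the helper names extractCommentedCode and activate are all hypothetical. In a real page the commented text would be read out of a script element or iframe; here it is a plain string so the sketch runs anywhere. I’ve also used new Function in place of a bare eval so the evaluated code stays in its own scope.

```javascript
// Hypothetical payload: the real code sits inside a block comment, so the
// engine treats it as a comment at load time and never parses it as code.
const commentedPayload = `/*
function greet(name) {
  return "Hello, " + name;
}
*/`;

// Recover the source text by stripping the outer comment delimiters.
function extractCommentedCode(text) {
  const start = text.indexOf("/*");
  const end = text.lastIndexOf("*/");
  if (start === -1 || end === -1 || end < start + 2) {
    throw new Error("no block comment found");
  }
  return text.slice(start + 2, end);
}

// Parse and execute the recovered source only when a feature needs it,
// then hand back the binding named by exportName.
function activate(text, exportName) {
  const source = extractCommentedCode(text);
  // new Function evaluates the source in a fresh function scope
  // (an assumption of this sketch; the original technique uses eval).
  return new Function(source + "\nreturn " + exportName + ";")();
}

// Simulates the user clicking a feature that needs the deferred code.
const greet = activate(commentedPayload, "greet");
console.log(greet("Ada"));
```

Note that the wrapped code must not contain the closing comment delimiter itself, or the comment would end early.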
This technique has many benefits. It gets the download delays out of the way, so the code is already in the client if and when the user needs it. It also avoids the CPU load for parsing and executing the code – this can be significant given the size of JavaScript payloads in today’s web apps. One downside of this technique results from cross-site scripting restrictions (the same-origin policy) – the commented-out code must be in the main page or in an iframe.
This is where Charles Jolley (from the SproutCore team) started his investigation. He wanted a technique that was more flexible and worked across domains. He presents his new technique (along with results from experiments) in two blog posts: Faster Loading Through Eval() and Cut Your JavaScript Load Time 90% with Deferred Evaluation. This new technique is to capture the deferred JavaScript as strings which can be downloaded with negligible parsing time. Later, when the user triggers a feature, the relevant code strings are eval’ed.
His experiment includes three scenarios for loading jQuery:
- Baseline – load jQuery like normal via script tag. jQuery is parsed and executed immediately on load.
- Closure – load jQuery in a closure but don’t actually execute the closure until after the onload event fires. This essentially means the jQuery code will be parsed but not executed until later.
- String – load jQuery as a giant string. After the onload event fires, eval() the string to actually make jQuery ready for use.
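The Closure and String scenarios above can be sketched roughly as follows. This is an illustration under my own assumptions, not Charles’s test harness: a tiny made-up library with one add function stands in for jQuery, and the names calls, closureModule, stringModule, and setup are hypothetical.

```javascript
// Records when each module body actually runs.
const calls = [];

// Closure scenario: the function body is parsed when the script loads,
// but nothing inside it executes until the function is invoked.
const closureModule = function () {
  calls.push("closure executed");
  return { add: function (a, b) { return a + b; } };
};

// String scenario: the library is just a string literal at load time,
// so it is neither parsed as code nor executed until eval is called.
const stringModule =
  'calls.push("string executed"); ({ add: function (a, b) { return a + b; } })';

// Run later (for example after onload) to make both libraries usable.
function setup() {
  const fromClosure = closureModule();
  // Direct eval can see `calls`; its result is the value of the last expression.
  const fromString = eval(stringModule);
  return { fromClosure: fromClosure, fromString: fromString };
}

const executedBeforeSetup = calls.length; // nothing has run yet
const libs = setup();
console.log(executedBeforeSetup, calls.length);
console.log(libs.fromClosure.add(2, 3), libs.fromString.add(2, 3));
```

The Baseline scenario has no counterpart here because it is simply the library code written at the top level, where it is parsed and executed as soon as the script loads.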
The results are promising and somewhat surprising – in a good way. (Note: results for IE are TBD.)
Charles reports two time measurements.
- The load time (blue) is how long it takes for the onload event to fire. No surprise – avoiding execution (“Closure”) results in a faster load time than normal script loading, and avoiding parsing and execution (“String”) allows the page to load even faster.
- The interesting and promising stat is the setup time (green) – how long it takes for the deferred code to be fully parsed and executed. The importance of this measurement is to see if using eval has penalties compared to the normal way of loading scripts. It turns out that in WebKit, Firefox, and iPhone there isn’t a significant cost for doing eval. Chrome is a different story and needs further investigation.
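If you want to take a rough setup-time measurement of your own, something like the following might be a starting point. It is a sketch under stated assumptions, not the code behind Charles’s numbers: deferredSource is a made-up stand-in for a real library, timeSetup is a hypothetical helper, and Date.now() only has millisecond resolution, so small payloads will often report 0.

```javascript
// Hypothetical deferred source; a real payload would be a full library.
const deferredSource = "(function () { return { version: '0.0-demo' }; })()";

// Measure how long it takes to parse and execute a source string.
function timeSetup(source) {
  const start = Date.now();
  // Indirect eval runs the code in the global scope rather than this function's.
  const value = (0, eval)(source);
  return { value: value, elapsedMs: Date.now() - start };
}

const result = timeSetup(deferredSource);
console.log("setup took " + result.elapsedMs + " ms");
```

Comparing this number against the time for the same code loaded via a normal script tag is what reveals whether eval carries a penalty in a given browser.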
These techniques for deferred loading of JavaScript are great additions to have for optimizing web site performance. The results for IE are still to come from Charles, and will be the most important for gauging the applicability of this technique. Charles is presenting this technique at tomorrow’s Velocity Online Conference. I’m hoping he’ll have the IE results to give us the full picture on how this technique performs.
How browsers work
My initial work on the Web was on the backend – C++, Java, databases, Apache, etc. In 2005, I started focusing on web performance. To get a better idea of what made web sites slow, I surfed numerous sites with a packet sniffer open. That’s when I discovered that the bulk of the time spent loading a web site occurs on the frontend, after the HTML document arrives at the browser.
Not knowing much about how the frontend worked, I spent a week searching for anything that could explain what was going on in the browser. The gem that I found was David Hyatt’s blog post entitled Testing Page Load Speed. His article opened my eyes to the complexity of what the browser does, and launched my foray into finding ways to optimize page load times, resulting in things like YSlow and High Performance Web Sites.
Today’s post on the Chromium Blog (Technically speaking, what makes Google Chrome fast?) contains a similar gem. Mike Belshe, Chrome developer and co-creator of SPDY, talks about the performance optimizations inside of Chrome. But in so doing, he also reveals insights into how all browsers work and the challenges they face. For example, until I saw this, I didn’t have a real appreciation for the performance impact of DOM bindings – the connections between the JavaScript that modifies web pages and the C++ that implements the browser. He also talks about garbage collection, concurrent connections, lookahead parsing and downloading, domain sharding, and multiple processes.
Take 16.5 minutes and watch Mike’s video. It’s well worth it.


