Frontend SPOF
My evangelism of high performance web sites started off in the context of quality code and development best practices. It’s easy for a style of coding to permeate throughout a company. Developers switch teams. Code is copied and pasted (especially in the world of web development). If everyone is developing in a high performance way, that’s the style that will characterize how the company codes.
This argument of promoting development best practices gained traction in the engineering quarters of the companies I talked to, but performance improvements continued to get backburnered in favor of new features and content that appealed to the business side of the organization. Improving performance wasn’t considered as important as other changes. Everyone assumed users wanted new features and that’s what got the most attention.
It became clear to me that we needed to show a business case for web performance. That’s why the theme for Velocity 2009 was “the impact of performance on the bottom line”. Since then there have been numerous studies released that have shown that improving performance does improve the bottom line. As a result, I’m seeing the business side of many web companies becoming strong advocates for Web Performance Optimization.
But there are still occasions when I have a hard time convincing a team that focusing on web performance, specifically frontend performance, is important. Shaving off hundreds (or even thousands) of milliseconds just doesn’t seem worthwhile to them. That’s when I pull out the big guns and explain that loading scripts and stylesheets in the typical way creates a frontend single point of failure that can bring down the entire site.
Examples of Frontend SPOF
The thought that simply adding a script or stylesheet to your web page could make the entire site unavailable surprises many people. Rather than focusing on CSS mistakes and JavaScript errors, the key is to think about what happens when a resource request times out. With this clue, it’s easy to create a test case:
<html> <head> <script src="http://www.snippet.com/main.js" type="text/javascript"> </script> </head> <body> Here's my page! </body> </html>
This HTML page looks pretty normal, but if snippet.com is overloaded the entire page is blank waiting for main.js to return. This is true in all browsers.
Here are some examples of frontend single points of failure and the browsers they impact. You can click on the Frontend SPOF test links to see the actual test page.
| Frontend SPOF test | Chrome | Firefox | IE | Opera | Safari |
|---|---|---|---|---|---|
| External Script | blank below | blank below | blank below | blank below | blank below |
| Stylesheet | flash | flash | blank below | flash | blank below |
| inlined @font-face | delayed | flash | flash | flash | delayed |
| Stylesheet with @font-face | delayed | flash | totally blank* | flash | delayed |
| Script then @font-face | delayed | flash | totally blank* | flash | delayed |
* Internet Explorer 9 does not display a blank page, but does “flash” the element.
The failure cases are highlighted in red. Here are the four possible outcomes sorted from worst to best:
- totally blank – Nothing in the page is rendered – the entire page is blank.
- blank below – All the DOM elements below the resource in question are not rendered.
- delayed – Text that uses the @font-face style is invisible until the font file arrives.
- flash – DOM elements are rendered immediately, and then redrawn if necessary after the stylesheet or font has finished downloading.
Web Performance avoids SPOF
It turns out that there are web performance best practices that, in addition to making your pages faster, also avoid most of these frontend single points of failure. Let’s look at the tests one by one.
- External Script
- All browsers block rendering of elements below an external script until the script arrives and is parsed and executed. Since many sites put scripts in the HEAD, this means the entire page is typically blank. That’s why I believe the most important web performance coding pattern for today’s web sites is to load JavaScript asynchronously. Not only does this improve performance, but it avoids making external scripts a possible SPOF.
- Stylesheet
- Browsers are split on how they handle stylesheets. Firefox and Opera charge ahead and render the page, and then flash the user if elements have to be redrawn because their styling changed. Chrome, Internet Explorer, and Safari delay rendering the page until the stylesheets have arrived. (Generally they only delay rendering elements below the stylesheet, but in some cases IE will delay rendering everything in the page.) If rendering is blocked and the stylesheet takes a long time to download, or times out, the user is left staring at a blank page. There’s not a lot of advice on loading stylesheets without blocking page rendering, primarily because it would introduce the flash of unstyled content.
- inlined @font-face
- I’ve blogged before about the performance implications of using @font-face. When the @font-face style is declared in a STYLE block in the HTML document, the SPOF issues are dramatically reduced. Firefox, Internet Explorer, and Opera avoid making these custom font files a SPOF by rendering the affected text and then redrawing it after the font file arrives. Chrome and Safari don’t render the customized text at all until the font file arrives. I’ve drawn these cells in yellow since it could cause the page to be unusable for users using these browsers, but most sites only use custom fonts on a subset of the page.
- Stylesheet with @font-face
- Inlining your @font-face style is the key to avoiding having font files be a single point of failure. If you inline your @font-face styles and the font file takes forever to return or times out, the worst case is the affected text is invisible in Chrome and Safari. But at least the rest of the page is visible, and everything is visible in Firefox, IE, and Opera. Moving the @font-face style to a stylesheet not only slows down your site (by requiring two sequential downloads to render text), but it also creates a special case in Internet Explorer 7 & 8 where the entire page is blocked from rendering. IE 6 is only slightly better – the elements below the stylesheet are blocked from rendering (but if your stylesheet is in the HEAD this is the same outcome).
- Script then @font-face
- Inlining your @font-face style isn’t enough to avoid the entire page SPOF that occurs in IE. You also have to make sure the inline STYLE block isn’t preceded by a SCRIPT tag. Otherwise, your entire page is blank in IE waiting for the font file to arrive. If that file is slow to return, your users are left staring at a blank page.
SPOF is bad
Five years ago most of the attention on web performance was focused on the backend. Since then we’ve learned that 80% of the time users wait for a web page to load is the responsibility of the frontend. I feel this same bias when it comes to identifying and guarding against single points of failure that can bring down a web site – the focus is on the backend and there’s not enough focus on the frontend. For larger web sites, the days of a single server, single router, single data center, and other backend SPOFs are way behind us. And yet, most major web sites include scripts and stylesheets in the typical way that creates a frontend SPOF. Even more worrisome – many of these scripts are from third parties for social widgets, web analytics, and ads.
Look at the scripts, stylesheets, and font files in your web page from a worst case scenario perspective. Ask yourself:
- Is your web site’s availability dependent on these resources?
- Is it possible that if one of these resources timed out, users would be blocked from seeing your site?
- Are any of these single point of failure resources from a third party?
- Would you rather embed resources in a way that avoids making them a frontend SPOF?
Make sure you’re aware of your frontend SPOFs, track their availability and latency closely, and embed them in your page in a non-blocking way whenever possible.
Browser Performance Wishlist
What are the most important changes browsers could make to improve performance?
This document is my answer to that question. This is mainly for browser developers, although web developers will want to track the adoption of these improvements.
- download scripts without blocking
- SCRIPT attributes
- resource packages
- border-radius
- cache redirects
- link prefetch
- Web Timing spec
- remote JS debugging
- Web Sockets
- History
- anchor ping
- progressive XHR
- stylesheet & inline JS
- SCRIPT DEFER for inline scripts
- @import improvements
- @font-face improvements
- stylesheets & iframes
- paint events
- missing schema, double downloads
Before digging into the list I wanted to mention two items that would actually be at the top of the list if it wasn’t for how new they are: SPDY and FRAG tag. Both of these require industry adoption and possible changes to specifications, so it’s too soon to put them on an implementation wishlist. I hope these ideas gain consensus soon and to facilitate that I describe them here.
- SPDY
- SPDY is a proposal from Google for making three major improvements to HTTP: compressed headers, multiplexed requests, and prioritized responses. Initial studies showed 25 top sites were loaded 55% faster. Server and client implementations are available, and some other organizations and individuals have completed server and client implementations. The protocol draft has been published for review.
- FRAG tag
- The idea behind this “document fragment” tag is that it be used to wrap 3rd party content – ads, widgets, and analytics. 3rd party content can have a severe impact on the containing page’s performance due to additional HTTP requests, scripts that block rendering and downloads, and added DOM nodes. Many of these factors can be mitigated by putting the 3rd party content inside an iframe embedded in the top level HTML document. But iframes have constraints and drawbacks – they typically introduce another HTTP request for the iframe’s HTML document, not all 3rd party code snippets will work inside an iframe without changes (e.g., references to “document” in JavaScript might need to reference the parent document), and some snippets (expando ads, suggest) can’t float over the main page’s elements. Another path to mitigate these issues is to load the JavaScript asynchronously, but many of these widgets use document.write and so must be evaluated synchronously.
A compromise is to place 3rd party content in the top level HTML document wrapped in a FRAG block. This approach degrades nicely – older browsers would ignore the FRAG tag and handle these snippets the same way they do today. Newer browsers would parse the HTML in a separate document fragment. The FRAG content would not block the rendering of the top level document. Snippets containing document.write would work without blocking the top level document. This idea just started getting discussed in January 2010. Much more use case analysis and discussion is needed, culminating in a proposed specification. (Credit to Alex Russell for the idea and name.)
The List
The performance wishlist items are sorted highest priority first. The browser icons indicate which browsers need to implement that particular improvement.
- download scripts without blocking
- In older browsers, once a script started downloading all subsequent downloads were blocked until the script returned. It’s critical that scripts be evaluated in the order specified, but they can be downloaded in parallel. This has a significant improvement on page load times, especially for pages with multiple scripts. Newer browsers (IE8, Firefox 3.5+, Safari 4, Chrome 2+) incorporated this parallel script loading feature, but it doesn’t work as proactively as it could. Specifically:
- IE8 – downloading scripts blocks image and iframe downloads
- Firefox 3.6 – downloading scripts blocks iframe downloads
- Safari 4 – downloading scripts blocks iframe downloads
- Chrome 4 – downloading scripts blocks iframe downloads
- Opera 10.10 – downloading scripts blocks all downloads
(test case, see the four “|| Script [Script|Stylesheet|Image|Iframe]” tests)
- SCRIPT attributes
- The HTML5 specification describes the ASYNC and DEFER attributes for the SCRIPT tag, but the implementation behavior is not specified. Here’s how the SCRIPT attributes should work.
- DEFER – The HTTP request for a SCRIPT with the DEFER attribute is not made until all other resources in the page on the same domain have already been sent. This is so that it doesn’t occupy one of the limited number of connections that are opened for a single server. Deferred scripts are downloaded in parallel, but are executed in the order they occur in the HTML document, regardless of what order the responses arrive in. The window’s onload event fires after all deferred scripts are downloaded and executed.
- ASYNC – The HTTP request for a SCRIPT with the ASYNC attribute is made immediately. Async scripts are executed as soon as the response is received, regardless of the order they occur in the HTML document. The window’s onload event fires after all async scripts are downloaded and executed.
- POSTONLOAD – This is a new attribute I’m proposing. Postonload scripts don’t start downloading until after the window’s onload event has fired. By default, postonload scripts are evaluated in the order they occur in the HTML document. POSTONLOAD and ASYNC can be used in combination to cause postonload scripts to be evaluated as soon as the response is received, regardless of the order they occur in the HTML document.
- resource packages
- Each HTTP request has some overhead cost. Workarounds include concatenating scripts, concatenating stylesheets, and creating image sprites. But this still results in multiple HTTP requests. And sprites are especially difficult to create and maintain. Alexander Limi (Mozilla) has proposed using zip files to create resource packages. It’s a good idea because of its simplicity and graceful degradation.
- border-radius
- Creating rounded corners leads to code bloat and excessive HTTP requests. Border-radius reduces this to a simple CSS style. The only major browser that doesn’t support border-radius is IE. It has already been announced that IE9 will support border-radius, but I wanted to include it nevertheless.
- cache redirects
- Redirects are costly from a performance perspective, especially for users with high latency. Although the HTTP spec says 301 and 302 responses (with the proper HTTP headers) are cacheable, most browsers don’t support this.
- IE8 – doesn’t cache redirects for the main page and for resources
- Safari 4 – doesn’t cache redirects for the main page
- Opera 10.10 – doesn’t cache redirects for the main page
- link prefetch
- To improve page load times, developers prefetch resources that are likely or certain to be used later in the user’s session. This typically involves writing JavaScript code that executes after the onload event. When prefetching scripts and stylesheets, an iframe must be used to avoid conflict with the JavaScript and CSS in the main page. Using an iframe makes this prefetching code more complex. A final burden is the processing required to parse prefetched scripts and stylesheets. The browser UI can freeze while prefetched scripts and stylesheets are parsed, even though this is unnecessary as they’re not going to be used in the current page. A simple alternative solution is to use LINK PREFETCH. Firefox is the only major browser that supports this feature (since 1.0). Wider support of LINK PREFETCH would give developers an easy way to accelerate their web pages. (test case)
- Web Timing spec
- In order for web developers to improve the performance of their web sites, they need to be able to measure their performance – specifically their page load times. There’s debate on the endpoint for measuring page load times (window onload event, first paint event, onDomReady), but most people agree that the starting point is when the web page is requested by the user. And yet, there is no reliable way for the owner of the web page to measure from this starting point. Google has submitted the Web Timing proposal draft for browser builtin support for measuring page load times to address these issues.
- remote JS debugging
- Developers strive to make their web apps fast across all major browsers, but this requires installing and learning a different toolset for each browser. In order to get cross-browser web development tools, browsers need to support remote JavaScript debugging. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. Agreement on the preferred protocol and support in the major browsers would go a long way to getting faster web apps for all users, and reducing the work for developers to maintain cross-browser web app performance.
- Web Sockets
- HTML5 Web Sockets provide built-in support for two-way communications between the client and server. The communication channel is accessible via JavaScript. Web Sockets are superior to comet and Ajax, especially in their compatibility with proxies and firewalls, and provide a path for building web apps with a high degree of communication between the browser and server.
- History
- HTML5 specifies implementation for History.pushState and History.replaceState. With these, web developers can dynamically change the URL to reflect the web application state without having to perform a page transition. This is important for Web 2.0 applications that modify the state of the web page using Ajax. Being able to avoid fetching a new HTML document to reflect these application changes results in a faster user experience.
- anchor ping
- The ping attribute for anchors provides a more performant way to track links. This is a controversial feature because of the association with “tracking” users. However, links are tracked today, it’s just done in a way that hurts the user experience. For example, redirects, synchronous XHR, and tight loops in unload handlers are some of the techniques used to ensure clicks are properly recorded. All of these create a slower user experience.
- progressive XHR
- The draft spec for XMLHttpRequest details how XHRs are to support progressive response handling. This is important for web apps that use data with varied response times as well as comet-style applications. (more information)
- stylesheet & inline JS
- When a stylesheet is followed by an inline script, resources that follow are blocked until the stylesheet is downloaded and the inline script is evaluated. Browsers should instead lookahead in their parsing and start downloading subsequent resources in parallel with the stylesheet. These resources of course would not be rendered, parsed, or evaluated until after the stylesheet was parsed and the inline script was evaluated. (test case see “|| CSS + Inline Script”; looks like this just landed in Firefox 3.6!)
- SCRIPT DEFER for inline scripts
- The benefit of the SCRIPT DEFER attribute for external scripts is discussed above. But DEFER is also useful for inline scripts that can be executed after the page has been parsed. Currently, IE8 supports this behavior. (test case)
- @import improvements
- @import is a popular alternative to the LINK tag for loading stylesheets, but it has several performance problems in IE:
- LINK @import – If the first stylesheet is loaded using LINK and the second one uses @import, they are loaded sequentially instead of in parallel. (test case)
- LINK blocks @import – If the first stylesheet is loaded using LINK, and the second stylesheet is loaded using LINK that contains @import, that @import stylesheet is blocked from downloading until the first stylesheet response is received. It would be better to start downloading the @import stylesheet immediately. (test case)
- many @imports – Using @import can change the download sequence of resources. In this test case, multiple stylesheets loaded with @import are followed by a script. Even though the script is listed last in the HTML document, it gets downloaded first. If the script takes a long time to download, it can causes the stylesheet downloads to be delayed, which can cause rendering to be delayed. It would be better to follow the order specified in the HTML document. (test case)
- @font-face improvements
- In IE8, if a script occurs before a style that uses @font-face, the page is blocked from rendering until the font file is done downloading. It would be better to render the rest of the page without waiting for the font file. (test case, blog post)
- stylesheets & iframes
- When an iframe is preceded by an external stylesheet, it blocks iframe downloads. In IE, the iframe is blocked from downloading until the stylesheet response is received. In Firefox, the iframe’s resources are blocked from downloading until the stylesheet response is received. There’s no dependency between the parent’s stylesheet and the iframe’s HTML document, so this blocking behavior should be removed. (test case)
- paint events
- As the amount of DOM elements and CSS grows, it’s becoming more important to be able to measure the performance of painting the page. Firefox 3.5 added the MozAfterPaint event which opened the door for add-ons like Firebug Paint Events (although early Firefox documentation noted that the “event might fire before the actual repainting happens“). Support for accurate paint events will allow developers to capture these metrics.
- missing schema, double downloads
- In IE7&8, if the “http:” schema is missing from a stylesheet’s URL, the stylesheet is downloaded twice. This makes the page render more slowly. Not including “http://” in URLs is not pervasive, but it’s getting more widely adopted because it reduces download size and resolves to “http://” or “https://” as appropriate. (test case)
5e speculative background images
This is the fifth of five quick posts about some browser quirks that have come up in the last few weeks.
Chrome and Safari start downloading background images before all styles are available. If a background image style gets overwritten this may cause wasteful downloads.
Background images are used everywhere: buttons, background wallpaper, rounded corners, etc. You specify a background image in CSS like so:
.bgimage { background-image: url("/images/button1.gif"); }
Downloading resources is an area for optimizing performance, so it’s important to understand what causes CSS background images to get downloaded. See if you can answer the following questions about button1.gif:
- Suppose no elements in the page use the class “bgimage”. Is button1.gif downloaded?
- Suppose an element in the page has the class “bgimage” but also has “display: none” or “visibility: hidden”. Is button1.gif downloaded?
- Suppose later in the page a stylesheet gets downloaded and redefines the “bgimage” class like this:
.bgimage { background-image: url("/images/button2.gif"); }Is button1.gif downloaded?
Ready?
The answer to question #1 is “no”. If no elements in the page use the rule, then the background image is not downloaded. This is true in all browsers that I’ve tested.
The answer to question #2 is “depends on the browser”. This might be surprising. Firefox 3.6 and Opera 10.10 do not download button1.gif, but the background image is downloaded in IE 8, Safari 4, and Chrome 4. I don’t have an explanation for this, but I do have a test page: hidden background images. If you have elements with background images that are hidden initially, you should hold off on creating them until after the visible content in the page is rendered.
The answer to question #3 is “depends on the browser”. I find this to be the most interesting behavior to investigate. According to the cascading behavior of CSS, the latter definition of the “bgimage” class should cause the background-image style to use button2.gif. And in all the major browsers this is exactly what happens. But Safari 4 and Chrome 4 are a little more aggressive about fetching background images. They download button1.gif on the speculation that the background-image property won’t be overwritten, and then later download button2.gif when it is overwritten. Here’s the test page: speculative background images.
When my officemate, Steve Lamm, pointed out this behavior to me, my first reaction was “that’s wasteful!” I love prefetching, but I’m not a big fan of most prefetching implementations because they’re too aggressive – they err too far on the side of downloading resources that never get used. After my initial reaction, I thought about this some more. How frequently would this speculative background image downloading be wasteful? I went on a search and couldn’t find any popular web site that overwrote the background-image style. Not one. I’m not saying pages like this don’t exist, I’m just saying it’s very atypical.
On the other hand, this speculative downloading of background images can really help performance and the user’s perception of page speed. Many web sites have multiple stylesheets. If background images don’t start downloading until all stylesheets are done loading, the page takes longer to render. Safari and Chrome’s behavior of downloading a background image as soon as an element needs it, even if one or more stylesheets are still downloading, is a nice performance optimization.
That’s a nice way to finish the week. Next week: my Browser Performance Wishlist.
The five posts in this series are:
5d dynamic stylesheets
This is the fourth of five quick posts about some browser quirks that have come up in the last few weeks.
You can avoid blocking rendering in IE if you load stylesheets using DHTML and setTimeout.
A few weeks ago I had a meeting with a company that makes a popular widget. One technique they used to reduce their widget’s impact on the main page was to load a stylesheet dynamically, something like this:
var link = document.createElement('link');
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = '/main.css';
document.getElementsByTagName('head')[0].appendChild(link);
Most of my attention for the past year has been on loading scripts dynamically to avoid blocking downloads. I haven’t focused on loading stylesheets dynamically. When it comes to stylesheets, blocking downloads isn’t an issue – stylesheets don’t block downloads (except in Firefox 2.0). The thing to worry about when downloading stylesheets is that IE blocks rendering until all stylesheets are downloaded1, and other browsers might experience a Flash Of Unstyled Content (FOUC).
FOUC isn’t a concern for this widget – the rules in the dynamically-loaded stylesheet only apply to the widget, and the widget hasn’t been created yet so nothing can flash. If the point of loading the stylesheet dynamically is to not mess with the containing page, we have to make sure dynamic stylesheets don’t block the page from rendering in IE.
I created the DHTML stylesheet example to show what happens. The page loads a stylesheet dynamically. The stylesheet is configured to take 4 seconds to download. If you load the page in Internet Explorer the page is blank for 4 seconds. In order to decouple the stylesheet load from page rendering, the DHTML code has to be invoked using setTimeout. That’s what I do in the DHTML + setTimeout stylesheet test page. This works. The page renders immediately while the stylesheet is downloaded in the background.
This technique is applicable when you have stylesheets that you want to load in the page but the stylesheet’s rules don’t apply to any DOM elements in the page currently. This is a pretty small use case. It makes sense for widgets or pages that have DHTML features that aren’t invoked until after the page has loaded. If you find yourself in that situation, you can use this technique to avoid the blank white screen in IE.
The five posts in this series are:
- 5a Missing schema double download
- 5b document.write scripts block in Firefox
- 5c media=print stylesheets
- 5d dynamic stylesheets
- 5e speculative background images
| 1 | Simple test pages may not reproduce this problem. My testing shows that you need a script (inline or external) above the stylesheet, or two or more stylesheets for rendering to be blocked. If your page has only one stylesheet and no SCRIPT tags, you might not experience this issue. |
5c media=print stylesheets
This is the third of five quick posts about some browser quirks that have come up in the last few weeks.
Stylesheets set with media=”print” still block rendering in Internet Explorer.
A few weeks ago a friend at a top web company pinged me about a possible bug in Page Speed and YSlow. Both tools were complaining about stylesheets he placed at the bottom of his page, an obvious violation of my put stylesheets at the top rule from High Performance Web Sites. The reasoning behind this rule is that Internet Explorer won’t start rendering the page until all stylesheets are downloaded1, and other browsers might produce the Flash Of Unstyled Content (FOUC). It’s best to put stylesheets at the top so they get downloaded as soon as possible.
His reason for putting these stylesheets at the bottom was that they were specified with media="print". Since these stylesheets weren’t going to be used to render the current page, he wanted to load them last so that other more important resources could get downloaded sooner. Going back to the reasons for the “put stylesheets at the top” rule, he wouldn’t have to worry about FOUC (the stylesheets wouldn’t be applied to the current page). But would he have to worry about IE blocking the page from rendering? Time for a test page.
The media=print stylesheets test page contains one stylesheet at the bottom with media="print". This stylesheet is configured to take 4 seconds to download. If you view this page in Internet Explorer you’ll see that rendering is indeed blocked for 4 seconds (tested on IE 6, 7, & 8).
I’m surprised browsers haven’t gotten to the point where they skip downloading stylesheets for a different media type than the current one. I’ve asked some web devs but no one can think of a good reason for doing this. In the meantime, even if you have stylesheets with media="print" you might want to follow the advice of Page Speed and YSlow and put them in the document HEAD. Or you could try loading them dynamically. That’s the topic I’ll cover in my next blog post.
The five posts in this series are:
- 5a Missing schema double download
- 5b document.write scripts block in Firefox
- 5c media=print stylesheets
- 5d dynamic stylesheets
- 5e speculative background images
| 1 | Simple test pages may not reproduce this problem. My testing shows that you need a script (inline or external) above the stylesheet, or two or more stylesheets for rendering to be blocked. If your page has only one stylesheet and no SCRIPT tags, you might not experience this issue. |
5a Missing schema double download
This is the first of five quick posts about some browser quirks that have come up in the last few weeks.
Internet Explorer 7 & 8 will download stylesheets twice if the http(s) protocol is missing.
If you have an HTTPS page that loads resources with “http://” in the URL, IE halts the download and displays an error dialog. This is called mixed content and should be avoided. How should developers code their URLs to avoid this problem? You could do it on the backend in your HTML template language. But a practice that is getting wider adoption is protocol relative URLs.
A protocol relative URL doesn’t contain a protocol. For example,
http://stevesouders.com/images/book-84x110.jpgbecomes
//stevesouders.com/images/book-84x110.jpgBrowsers substitute the protocol of the page itself for the resource’s missing protocol. Problem solved! In fact, today’s HttpWatch Blog posted about this: Using Protocol Relative URLs to Switch between HTTP and HTTPS.
However, if you try this in Internet Explorer 7 and 8 you’ll see that stylesheets specified with a protocol relative URL are downloaded twice. Hard to believe, but true. My officemate, Steve Lamm, discovered this when looking at the new Nexus One Phone page. That page fetches a stylesheet like this:
<link type="text/css" rel="stylesheet" href="//www.google.com/phone/static/2496921881-SiteCss.css">Notice there’s no protocol. If you load this page in Internet Explorer 7 and 8 the waterfall chart (nicely generated by HttpWatch) looks like this:

Notice 2496921881-SiteCss.css is downloaded twice, and each time it’s a 200 response, so it’s not being read from cache.
It turns out this only happens with stylesheets. The Missing schema, double download test page I created contains a stylesheet, an image, and a script that all have protocol relative URLs pointing to 1.cuzillion.com. The stylesheet is downloaded twice, but the image and script are only downloaded once. I added another stylesheet from 2.cuzillion.com that has a full URL (i.e., it starts with “http:”). This stylesheet is only downloaded once.
Developers should avoid using protocol relative URLs for stylesheets if they want their pages to be as fast as possible in Internet Explorer 7 & 8.
The five posts in this series are:
Browser script loading roundup
How are browsers doing when it comes to parallel script loading?
Back in the days of IE7 and Firefox 2.0, no browser loaded scripts in parallel with other resources. Instead, these older browsers would block all subsequent resource requests until the script was received, parsed, and executed. Here’s how the HTTP requests look when this blocking occurs in older browsers:

The test page that generated this waterfall chart has six HTTP requests:
- the HTML document
- the 1st script – 2 seconds to download, 2 seconds to execute
- the 2nd script – 2 seconds to download, 2 seconds to execute
- an image – 1 second to download
- a stylesheet- 1 second to download
- an iframe – 1 second to download
The figure above shows how the scripts block each other and block the image, stylesheet, and iframe, as well. The image, stylesheet, and iframe download in parallel with each other, but not until the scripts are finished downloading sequentially.
The likely reason scripts were downloaded sequentially in older browsers was to preserve execution order. This is critical when code in the 2nd script depends on symbols defined in the 1st script. Preserving execution order avoids undefined symbol errors. But the missed opportunity is obvious – while the browser is downloading the first script and guaranteeing to execute it first, it could be downloading the other four resources in parallel.
Thankfully, newer browsers now load scripts in parallel!
This is a big win for today’s web apps that often contain 100K+ of JavaScript split across multiple files. Loading the same test page in IE8, Firefox 3.6, Chrome 4, and Safari 4 produces an HTTP waterfall chart like this:

Things look a lot better, but not as good as they should be. In this case, IE8 loads the two scripts and stylesheet in parallel, but the image and iframe are blocked. All of the newer browsers have similar limitations with regard to the extent to which they load scripts in parallel with other types of resources. This table from Browserscope shows where we are and the progress made to get to this point. The recently added “Compare” button added to Browserscope made it easy to generate this historical view.
While downloading scripts, IE8 still blocks on images and iframes. Chrome 4, Firefox 3.6, and Safari 4 block on iframes. Opera 10.10 blocks on all resource types. I’m confident parallel script loading will continue to improve based on the great progress made in the last batch of browsers. Let’s keep our eyes on the next browsers to see if things improve even more.
Loading Scripts Without Blocking
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
As more and more sites evolve into “Web 2.0″ apps, the amount of JavaScript increases. This is a performance concern because scripts have a negative impact on page performance. Mainstream browsers (i.e., IE 6 and 7) block in two ways:
- Resources in the page are blocked from downloading if they are below the script.
- Elements are blocked from rendering if they are below the script.
The Scripts Block Downloads example demonstrates this. It contains two external scripts followed by an image, a stylesheet, and an iframe. The HTTP waterfall chart from loading this example in IE7 shows that the first script blocks all downloads, then the second script blocks all downloads, and finally the image, stylesheet, and iframe all download in parallel. Watching the page render, you’ll notice that the paragraph of text above the script renders immediately. However, the rest of the text in the HTML document is blocked from rendering until all the scripts are done loading.

Scripts block downloads in IE6&7, Firefox 2&3.0, Safari 3, Chrome 1, and Opera
Browsers are single threaded, so it’s understandable that while a script is executing the browser is unable to start other downloads. But there’s no reason that while the script is downloading the browser can’t start downloading other resources. And that’s exactly what newer browsers, including Internet Explorer 8, Safari 4, and Chrome 2, have done. The HTTP waterfall chart for the Scripts Block Downloads example in IE8 shows the scripts do indeed download in parallel, and the stylesheet is included in that parallel download. But the image and iframe are still blocked. Safari 4 and Chrome 2 behave in a similar way. Parallel downloading improves, but is still not as much as it could be.

Scripts still block, even in IE8, Safari 4, and Chrome 2
Fortunately, there are ways to get scripts to download without blocking any other resources in the page, even in older browsers. Unfortunately, it’s up to the web developer to do the heavy lifting.
There are six main techniques for downloading scripts without blocking:
- XHR Eval – Download the script via XHR and
eval()the responseText. - XHR Injection – Download the script via XHR and inject it into the page by creating a script element and setting its
textproperty to the responseText. - Script in Iframe – Wrap your script in an HTML page and download it as an iframe.
- Script DOM Element – Create a script element and set its
srcproperty to the script’s URL. - Script Defer – Add the script tag’s
deferattribute. This used to only work in IE, but is now in Firefox 3.1. document.writeScript Tag – Write the<script src="">HTML into the page usingdocument.write. This only loads script without blocking in IE.
You can see an example of each technique using Cuzillion. It turns out that these techniques have several important differences, as shown in the following table. Most of them provide parallel downloads, although Script Defer and document.write Script Tag are mixed. Some of the techniques can’t be used on cross-site scripts, and some require slight modifications to your existing scripts to get them to work. An area of differentiation that’s not widely discussed is whether the technique triggers the browser’s busy indicators (status bar, progress bar, tab icon, and cursor). If you’re loading multiple scripts that depend on each other, you’ll need a technique that preserves execution order.
| Technique | Parallel Downloads | Domains can Differ | Existing Scripts | Busy Indicators | Ensures Order | Size (bytes) |
|---|---|---|---|---|---|---|
| XHR Eval | IE, FF, Saf, Chr, Op | no | no | Saf, Chr | - | ~500 |
| XHR Injection | IE, FF, Saf, Chr, Op | no | yes | Saf, Chr | - | ~500 |
| Script in Iframe | IE, FF, Saf, Chr, Op | no | no | IE, FF, Saf, Chr | - | ~50 |
| Script DOM Element | IE, FF, Saf, Chr, Op | yes | yes | FF, Saf, Chr | FF, Op | ~200 |
| Script Defer | IE, Saf4, Chr2, FF3.1 | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~50 |
| document.write Script Tag | IE, Saf4, Chr2, Op | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~100 |
The question is: Which is the best technique? The optimal technique depends on your situation. This decision tree should be used as a guide. It’s not as complex as it looks. Only three variables determine the outcome: is the script on the same domain as the main page, is it necessary to preserve execution order, and should the busy indicators be triggered.
Ideally, the logic in this decision tree would be encapsulated in popular HTML templating languages (PHP, Python, Perl, etc.) so that the web developer could just call a function and be assured that their script gets loaded using the optimal technique.
In many situations, the Script DOM Element is a good choice. It works in all browsers, doesn’t have any cross-site scripting restrictions, is fairly simple to implement, and is well understood. The one catch is that it doesn’t preserve execution order across all browsers. If you have multiple scripts that depend on each other, you’ll need to concatenate them or use a different technique. If you have an inline script that depends on the external script, you’ll need to synchronize them. I call this “coupling” and present several ways to do this in Coupling Asynchronous Scripts.
Performance Impact of CSS Selectors
A few months back there were some posts about the performance impact of inefficient CSS selectors. I was intrigued – this is the kind of browser idiosyncratic behavior that I live for. On further investigation, I’m not so sure that it’s worth the time to make CSS selectors more efficient. I’ll go even farther and say I don’t think anyone would notice if we woke up tomorrow and every web page’s CSS selectors were magically optimized.
The first post that caught my eye was about CSS Qualified Selectors by Shaun Inman. This post wasn’t actually about CSS performance, but in one of the comments David Hyatt (architect for Safari and WebKit, also worked on Mozilla, Camino, and Firefox) dropped this bomb:
The sad truth about CSS3 selectors is that they really shouldn’t be used at all if you care about page performance. Decorating your markup with classes and ids and matching purely on those while avoiding all uses of sibling, descendant and child selectors will actually make a page perform significantly better in all browsers.
Wow. Let me say that again. Wow.
The next posts were amazing. It was a series on Testing CSS Performance from Jon Sykes in three parts: part 1, part 2, and part 3. It’s fun to see how his tests evolve, so part 3 is really the one to read. This had me convinced that optimizing CSS selectors was a key step to fast pages.
But there were two things about the tests that troubled me. First, the large number of DOM elements and rules worried me. The pages contain 60,000 DOM elements and 20,000 CSS rules. This is an order of magnitude more than most pages. Pages this large make browsers behave in unusual ways (we’ll get back to that later). The table below has some stats from the top ten U.S. web sites for comparison.
| Web Site | # CSS Rules |
#DOM Elements |
| AOL | 2289 | 1628 |
| eBay | 305 | 588 |
| 2882 | 1966 | |
| 92 | 552 | |
| Live Search | 376 | 449 |
| MSN | 1038 | 886 |
| MySpace | 932 | 444 |
| Wikipedia | 795 | 1333 |
| Yahoo! | 800 | 564 |
| YouTube | 821 | 817 |
| average | 1033 | 923 |
The second thing that concerned me was how small the baseline test page was, compared to the more complex pages. The main question I want to answer is “do inefficient CSS selectors slow down pages?” All five test pages contain 20,000 anchor elements (nested inside P, DIV, DIV, DIV). What changes is their CSS: baseline (no CSS), tag selector (one rule for the A tag), 20,000 class selectors, 20,000 child selectors, and finally 20,000 descendant selectors. The last three pages top out at over 3 megabytes in size. But the baseline page and tag selector page, with little or no CSS, are only 1.8 megabytes. These pages answer the question “how much faster would my page be if I eliminated all CSS?” But not many of us are going to eliminate all CSS from our pages.
I revised the test as follows:
- 2000 anchors and 2000 rules (instead of 20,000) – this actually results in ~6000 DOM elements because of all the nesting in P, DIV, DIV, DIV
- the baseline page and tag selector page have 2000 rules just like all the other pages, but these are simple class rules that don’t match any classes in the page
I ran these tests on 12 browsers. Page render time was measured with a script block at the top and bottom of the page. (I loaded the page from local disk to avoid possible impact from chunked encoding.) The results are shown in the chart below. (I don’t show Opera 9.63 – it was way too slow – but you can download all the data as csv. You can also see the test pages.)

Performance varies across browsers; strangely, two new browsers, IE 8 and Firefox 3.1, are the slowest but comparisons should not be made from one browser to another. Although all the tests for a given browser were conducted on a single PC, different browsers might have been tested on different PCs with different performance characteristics. The goal of this experiment is not to compare browser performance – it’s to see how browsers handle progressively more complex CSS selectors.
[Revision: On further inspection comparing Firefox 3.0 and 3.1, I discovered that the test PC I used for testing Firefox 3.1 and IE 8 was slower than the other test PCs used in this experiment. I subsequently re-ran those tests as well as Firefox 3.0 and IE 7 on PCs that were more consistent and updated the chart above. Even with this re-run, because of possible differences in test hardware, do not use this data to compare one browser to another.]
Not surprisingly, the more complex pages (child selectors and descendant selectors) usually perform the worst. The biggest surprise is how small the delta is from the baseline to the most complex, worst performing test page. The average slowdown across all browsers is 50 ms, and if we look at the big ones (IE 6&7, FF3), the average delta is just 20 ms. For 70% or more of today’s users, improving these CSS selectors would only make a 20 ms improvement.
Keep in mind – these test pages are close to worst case. The 2000 anchors wrapped in P, DIV, DIV, DIV result in 6000 DOM elements – that’s twice as big as the max in the top ten sites. And the complex pages have 2000 extremely inefficient rules – a typical site has around one third of their rules that are complex child or descendant selectors. Facebook, for example, with the maximum number of rules at 2882 only has 750 that are these extremely inefficient rules.
Why do the results from my tests suggest something different from what’s been said lately? One difference comes from looking at things at such a large scale. It’s okay to exaggerate test cases if the results are proportional to common use cases. But in this case, browsers behave differently when confronted with a 3 megabyte page with 60,000 elements and 20,000 rules. I especially noticed that my results were much different for IE 6&7. I wondered if there was a hockey stick in how IE handled CSS selectors. To investigate this I loaded the child selector and descendant selector pages with increasing number of anchors and rules, from 1000 to 20,000. The results, shown in the chart below, reveal that IE hits a cliff around 18,000 rules. But when IE 6&7 work on a page that is closer to reality, as in my tests, they’re actually the fastest performers.

Based on these tests I have the following hypothesis: For most web sites, the possible performance gains from optimizing CSS selectors will be small, and are not worth the costs. There are some types of CSS rules and interactions with JavaScript that can make a page noticeably slower. This is where the focus should be. So I’m starting to collect real world examples of small CSS style-related issues (offsetWidth, :hover) that put the hurt on performance. If you have some, send them my way. I’m speaking at SXSW this weekend. If you’re there, and want to discuss CSS selectors, please find me. It’s important that we’re all focusing on the performance improvements that our users will really notice.
John Resig: Drop-in JavaScript Performance
I wrote a post on the Google Code Blog about John Resig’s tech talk “Drop-in JavaScript Performance.” The video and slides are now available.
In this talk, John starts off highlighting why performance will improve in the next generation of browsers, thanks to advances in JavaScript engines and new features such as process per tab and parallel script loading. He digs deeper into JavaScript performance, touching on shaping, tracing, just-in-time compilation, and the various benchmarks (SunSpider, Dromaeo, and V8 benchmark). John plugs my UA Profiler, with its tests for simultaneous connections, parallel script loading, and link prefetching. He wraps up with a collection of many other advanced features in the areas of communiction, DOM, styling, data, and measurements.

