Sharding Dominant Domains
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
Rule 9 from High Performance Web Sites says reducing DNS lookups makes web pages faster. This is true, but in some situations, it’s worthwhile to take a bunch of resources that are being downloaded on a single domain and split them across multiple domains. I call this domain sharding. Doing this allows more resources to be downloaded in parallel, reducing the overall page load time.
To determine if domain sharding makes sense, you have to see if the page has a dominant domain being used for downloading resources. The following HTTP waterfall chart, for Yahoo.com, is indicative of a site being slowed down by a dominant domain. This waterfall chart shows the page being downloaded in Internet Explorer 7, which only allows two parallel downloads per hostname. The vertical lines show that typically there are only two simultaneous downloads at any given time. Looking at the resource URLs, we see that almost all of them are served from “l.yimg.com.” Sharding these resources across two domains, such as “l1.yimg.com” and “l2.yimg.com”, would cause the resources to download in about half the time.
Most of the U.S. top ten web sites do domain sharding. YouTube uses i1.ytimg.com, i2.ytimg.com, i3.ytimg.com, and i4.ytimg.com. Live Search uses ts1.images.live.com, ts2.images.live.com, ts3.images.live.com, and ts4.images.live.com. Both of these sites are sharding across four domains. What’s the optimal number? Yahoo! released a study that recommends sharding across at least two, but no more than four, domains. Above four, performance actually degrades.
Not all browsers are restricted to just two parallel downloads per hostname. Opera 9+ and Safari 3+ do four downloads per hostname. Internet Explorer 8, Firefox 3, and Chrome 1+ do six downloads per hostname. Sharding across two domains is a good compromise that improves performance in all browsers.
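To make the idea concrete, here is a minimal sketch of how a page template might assign each resource to one of two shard hostnames. The hostnames and the hashing scheme are illustrative, not a prescription; the important property is that a given filename always maps to the same shard, so each resource stays cacheable under a single URL.

```javascript
// Illustrative sharding helper. The SHARDS hostnames and the hash are made up for this sketch.
var SHARDS = ['http://l1.example.com', 'http://l2.example.com'];

function shardUrl(filename) {
  var hash = 0;
  for (var i = 0; i < filename.length; i++) {
    // Keep the hash small and stable so the same file always lands on the same shard.
    hash = (hash * 31 + filename.charCodeAt(i)) % SHARDS.length;
  }
  return SHARDS[hash] + '/' + filename;
}

// e.g. write image URLs with shardUrl('logo.gif') instead of a single hostname
```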
Positioning Inline Scripts
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
My first three chapters in Even Faster Web Sites (Splitting the Initial Payload, Loading Scripts Without Blocking, and Coupling Asynchronous Scripts) focus on external scripts. But inline scripts block downloads and rendering just like external scripts do. The Inline Scripts Block example contains two images with an inline script between them. The inline script takes five seconds to execute. Looking at the HTTP waterfall chart, we see that the second image (which occurs after the inline script) is blocked from downloading until the inline script finishes executing.
Inline scripts block rendering, just like external scripts, but are worse. External scripts only block the rendering of elements below them in the page. Inline scripts block the rendering of everything in the page. You can see this by loading the Inline Scripts Block example in a new window and counting off the five seconds before anything is displayed.
Here are three strategies for avoiding, or at least mitigating, the blocking behavior of inline scripts.
- move inline scripts to the bottom of the page – Although this still blocks rendering, resources in the page aren’t blocked from downloading.
- use setTimeout to kick off long-executing code – If you have a function that takes a long time to execute, launching it via setTimeout allows the page to render and resources to download without being blocked (see the sketch after this list).
- use DEFER – In Internet Explorer and Firefox 3.1 or greater, adding the DEFER attribute to the inline SCRIPT tag avoids the blocking behavior of inline scripts.
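For example, here is a minimal sketch of the setTimeout approach; longInit is a hypothetical slow function standing in for whatever long-executing code the inline script needs to run.

```javascript
// Hypothetical slow initialization that would otherwise run inline
// and block rendering and downloads until it returned.
function longInit() {
  // ... expensive work ...
}

// Scheduling it lets the parser continue, so the rest of the page can
// render and its resources can download before the work runs.
setTimeout(longInit, 0);
```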
It’s especially important to avoid placing inline scripts between a stylesheet and any other resources in the page. Doing so makes it as if the stylesheet were blocking the resources that follow it from downloading. The reason for this behavior is that all major browsers preserve the order of CSS and JavaScript. The stylesheet has to be fully downloaded, parsed, and applied before the inline script is executed. And the inline script must be executed before the remaining resources can be downloaded. Therefore, resources that follow a stylesheet and an inline script are blocked from downloading.
This is easier to see with an example. Below is the HTTP waterfall chart for eBay.com. Normally, the stylesheets and the first script would be downloaded in parallel. But after the stylesheet, there’s an inline script (with just one line of code). This causes the external script to be blocked from downloading until the stylesheet is received.

Stylesheet followed by inline script blocks downloads on eBay
This unnecessary blocking also happens on MSN, MySpace, and Wikipedia. The workaround is to move inline scripts above stylesheets or below other resources, as sketched below. This increases parallel downloads, resulting in a faster page.
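As a rough sketch (file names are made up), the problematic pattern and the reordered version look like this:

```html
<!-- Problematic: the inline script sits between the stylesheet and the
     external script, so the external script can't start downloading
     until the stylesheet arrives and the inline script runs. -->
<link rel="stylesheet" href="styles.css">
<script>var page = {};</script>
<script src="app.js"></script>

<!-- Better: move the inline script above the stylesheet so the
     stylesheet and the external script download in parallel. -->
<script>var page = {};</script>
<link rel="stylesheet" href="styles.css">
<script src="app.js"></script>
```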
Web Exponents
I’m fortunate to have gotten to know so many tech gurus over the last few years from speaking at conferences and co-chairing Velocity. As if programming wasn’t complex enough, there are all the choices that have to be made with regard to language, design, performance, scalability, tools, deployment, and more. It’s incredibly valuable to hear what these experts have to say. I always come away with new insights and ideas.
I wanted to find a way that my fellow Googlers and the web development community at large could share in this benefit, as well. A few months ago I started inviting these industry leaders to give tech talks at Google. This has gone very well. The talks are popular, and there’s an even bigger benefit from posting the videos and slides afterward.
I decided to give this speaker series a name: Web Exponents, where an “exponent” is someone who “actively supports or favors a cause”. There’s a great tagline, too: “raising web technology to a higher power”. I blogged about this on the Google Code Blog:
- Web Exponents – announcing the speaker series
- John Resig – Drop-in JavaScript Performance
- Doug Crockford – JavaScript: The Good Parts
- me – Life’s Too Short, Write Fast Code (part 2)
- PPK – The Open Web Goes Mobile
All the videos are available in the Web Exponents playlist on YouTube. This week I’ll be blogging about Rob Campbell’s talk on Firebug. You can subscribe to the playlist to make sure you catch his and all future videos from these Web Exponents.
Loading Scripts Without Blocking
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
As more and more sites evolve into “Web 2.0” apps, the amount of JavaScript increases. This is a performance concern because scripts have a negative impact on page performance. Mainstream browsers (i.e., IE 6 and 7) block in two ways:
- Resources in the page are blocked from downloading if they are below the script.
- Elements are blocked from rendering if they are below the script.
The Scripts Block Downloads example demonstrates this. It contains two external scripts followed by an image, a stylesheet, and an iframe. The HTTP waterfall chart from loading this example in IE7 shows that the first script blocks all downloads, then the second script blocks all downloads, and finally the image, stylesheet, and iframe all download in parallel. Watching the page render, you’ll notice that the paragraph of text above the script renders immediately. However, the rest of the text in the HTML document is blocked from rendering until all the scripts are done loading.

Scripts block downloads in IE6&7, Firefox 2&3.0, Safari 3, Chrome 1, and Opera
Browsers are single threaded, so it’s understandable that while a script is executing the browser is unable to start other downloads. But there’s no reason that while the script is downloading the browser can’t start downloading other resources. And that’s exactly what newer browsers, including Internet Explorer 8, Safari 4, and Chrome 2, have done. The HTTP waterfall chart for the Scripts Block Downloads example in IE8 shows the scripts do indeed download in parallel, and the stylesheet is included in that parallel download. But the image and iframe are still blocked. Safari 4 and Chrome 2 behave in a similar way. Parallel downloading improves, but is still not as good as it could be.

Scripts still block, even in IE8, Safari 4, and Chrome 2
Fortunately, there are ways to get scripts to download without blocking any other resources in the page, even in older browsers. Unfortunately, it’s up to the web developer to do the heavy lifting.
There are six main techniques for downloading scripts without blocking:
- XHR Eval – Download the script via XHR and eval() the responseText.
- XHR Injection – Download the script via XHR and inject it into the page by creating a script element and setting its text property to the responseText (a sketch follows this list).
- Script in Iframe – Wrap your script in an HTML page and download it as an iframe.
- Script DOM Element – Create a script element and set its src property to the script’s URL.
- Script Defer – Add the script tag’s defer attribute. This used to only work in IE, but is now in Firefox 3.1.
- document.write Script Tag – Write the <script src=""> HTML into the page using document.write. This only loads the script without blocking in IE.
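As an example, here is a minimal sketch of the XHR Injection technique, assuming the script lives at /js/a.js on the same domain as the page (XHR can't be used for cross-site scripts, and older IE would need an ActiveXObject fallback, omitted here):

```javascript
// XHR Injection sketch: download the script via XHR, then inject it by
// setting a script element's text property (eval()'ing the responseText
// instead would be the XHR Eval technique). The path is illustrative.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/js/a.js', true);
xhr.onreadystatechange = function () {
  if (xhr.readyState === 4 && xhr.status === 200) {
    var script = document.createElement('script');
    script.text = xhr.responseText;
    document.getElementsByTagName('head')[0].appendChild(script);
  }
};
xhr.send(null);
```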
You can see an example of each technique using Cuzillion. It turns out that these techniques have several important differences, as shown in the following table. Most of them provide parallel downloads, although Script Defer and document.write Script Tag are mixed. Some of the techniques can’t be used on cross-site scripts, and some require slight modifications to your existing scripts to get them to work. An area of differentiation that’s not widely discussed is whether the technique triggers the browser’s busy indicators (status bar, progress bar, tab icon, and cursor). If you’re loading multiple scripts that depend on each other, you’ll need a technique that preserves execution order.
Technique | Parallel Downloads | Domains can Differ | Existing Scripts | Busy Indicators | Ensures Order | Size (bytes) |
---|---|---|---|---|---|---|
XHR Eval | IE, FF, Saf, Chr, Op | no | no | Saf, Chr | – | ~500 |
XHR Injection | IE, FF, Saf, Chr, Op | no | yes | Saf, Chr | – | ~500 |
Script in Iframe | IE, FF, Saf, Chr, Op | no | no | IE, FF, Saf, Chr | – | ~50 |
Script DOM Element | IE, FF, Saf, Chr, Op | yes | yes | FF, Saf, Chr | FF, Op | ~200 |
Script Defer | IE, Saf4, Chr2, FF3.1 | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~50 |
document.write Script Tag | IE, Saf4, Chr2, Op | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~100 |
The question is: Which is the best technique? The optimal technique depends on your situation. This decision tree should be used as a guide. It’s not as complex as it looks. Only three variables determine the outcome: is the script on the same domain as the main page, is it necessary to preserve execution order, and should the busy indicators be triggered.
Ideally, the logic in this decision tree would be encapsulated in popular HTML templating languages (PHP, Python, Perl, etc.) so that the web developer could just call a function and be assured that their script gets loaded using the optimal technique.
In many situations, the Script DOM Element is a good choice. It works in all browsers, doesn’t have any cross-site scripting restrictions, is fairly simple to implement, and is well understood. The one catch is that it doesn’t preserve execution order across all browsers. If you have multiple scripts that depend on each other, you’ll need to concatenate them or use a different technique. If you have an inline script that depends on the external script, you’ll need to synchronize them. I call this “coupling” and present several ways to do this in Coupling Asynchronous Scripts.
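For reference, a minimal sketch of the Script DOM Element technique looks like this; the URL is a placeholder and can point at any domain:

```javascript
// Script DOM Element sketch: the script downloads without blocking
// other resources in the page. The URL below is illustrative.
var script = document.createElement('script');
script.src = 'http://anydomain.example.com/a.js';
document.getElementsByTagName('head')[0].appendChild(script);
```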
Even Faster Web Sites
This post introduces Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
Last April, I blogged about starting a follow-up to my first book, High Performance Web Sites. Last week I sent in the first round of final edits. Although there will likely be one or two more rounds of edits, they should be small. So, I’m feeling pretty much done. It’s a huge weight off my shoulders. I’ve been working on this book for more than a year. The performance best practices I present required more research than HPWS. I also expanded my testing from just IE and Firefox (as I did in HPWS) to IE, Firefox, Safari, Chrome, and Opera (including multiple versions of each).
The title of this new book is Even Faster Web Sites. It will be published in June, and is available for pre-order now on Amazon and O’Reilly. The cover of HPWS was a greyhound. EFWS’ cover is the Blackbuck Antelope – it can hit 50 mph, which puts it in the top five for land animals. (The fastest is the cheetah, but that cover is already taken by Programming the Perl DBI.)
The most exciting thing about EFWS is that it includes six chapters from contributing authors. This came about because I wanted to have best practices for JavaScript performance. I’m a pretty good JavaScript programmer, but not nearly as good as the JavaScript luminaries out there who are writing books and teaching workshops. I also wanted a chapter on image optimization, where Stoyan Stefanov and Nicole Sullivan are the experts. I reached out to folks in these and other areas to contribute performance best practices that they had accumulated. The resulting chapters are listed below. I’ve indicated the contributing authors where appropriate; otherwise, the chapter is written by me.
- Understanding Ajax Performance – Doug Crockford
- Creating Responsive Web Applications – Ben Galbraith and Dion Almaer
- Splitting the Initial Payload
- Loading Scripts Without Blocking
- Coupling Asynchronous Scripts
- Positioning Inline Scripts
- Writing Efficient JavaScript – Nicholas C. Zakas
- Scaling with Comet – Dylan Schiemann
- Going Beyond Gzipping – Tony Gentilcore
- Optimizing Images – Stoyan Stefanov and Nicole Sullivan
- Sharding Dominant Domains
- Flushing the Document Early
- Using Iframes Sparingly
- Simplifying CSS Selectors
- Performance Tools
Between now and when the book comes out, I’ll write a blog post about each of my chapters. I wrote the first one of these, Split the Initial Payload, back in May. Now that I have more time on my hands, I’ll catch up and finish the rest.
If you’re just beginning the process of improving your web site’s performance, you should start with High Performance Web Sites. But as Web 2.0 gains wider adoption and the amount of content on web pages continues to grow, the best practices in Even Faster Web Sites are key to making today’s web sites fast(er).
don’t use @import
In Chapter 5 of High Performance Web Sites, I briefly mention that @import has a negative impact on web page performance. I dug into this deeper for my talk at Web 2.0 Expo, creating several test pages and HTTP waterfall charts, all shown below. The bottom line is: use LINK instead of @import if you want stylesheets to download in parallel, resulting in a faster page.
LINK vs. @import
There are two ways to include a stylesheet in your web page. You can use the LINK tag:
<link rel='stylesheet' href='a.css'>
Or you can use the @import rule:
<style> @import url('a.css'); </style>
I prefer using LINK for simplicity: with @import you have to remember to put it at the top of the style block or else it won’t work. It turns out that avoiding @import is better for performance, too.
@import @import
I’m going to walk through the different ways LINK and @import can be used. In these examples, there are two stylesheets: a.css and b.css. Each stylesheet is configured to take two seconds to download to make it easier to see the performance impact. The first example uses @import to pull in these two stylesheets. In this example, called @import @import, the HTML document contains the following style block:
<style> @import url('a.css'); @import url('b.css'); </style>
If you always use @import in this way, there are no performance problems, although we’ll see below it could result in JavaScript errors due to race conditions. The two stylesheets are downloaded in parallel, as shown in Figure 1. (The first tiny request is the HTML document.) The problems arise when @import is embedded in other stylesheets or is used in combination with LINK.

Figure 1. always using @import is okay
LINK @import
The LINK @import example uses LINK for a.css, and @import for b.css:
<link rel='stylesheet' type='text/css' href='a.css'> <style> @import url('b.css'); </style>
In IE (tested on 6, 7, and 8), this causes the stylesheets to be downloaded sequentially, as shown in Figure 2. Downloading resources in parallel is key to a faster page. As shown here, this behavior in IE causes the page to take a longer time to finish.

Figure 2. link mixed with @import breaks parallel downloads in IE
LINK with @import
In the LINK with @import example, a.css is inserted using LINK, and a.css has an @import rule to pull in b.css:
<link rel='stylesheet' type='text/css' href='a.css'>
@import url('b.css');
This pattern also prevents the stylesheets from loading in parallel, but this time it happens on all browsers. When we stop and think about it, we shouldn’t be too surprised. The browser has to download a.css and parse it. At that point, the browser sees the @import rule and starts to fetch b.css.

Figure 3. using @import from within a LINKed stylesheet breaks parallel downloads in all browsers
LINK blocks @import
A slight variation on the previous example with surprising results in IE: LINK is used for a.css and for a new stylesheet called proxy.css. proxy.css is configured to return immediately; it contains an @import rule for b.css.
<link rel='stylesheet' type='text/css' href='a.css'> <link rel='stylesheet' type='text/css' href='proxy.css'>
@import url('b.css');
The results of this example in IE, LINK blocks @import, are shown in Figure 4. The first request is the HTML document. The second request is a.css (two seconds). The third (tiny) request is proxy.css. The fourth request is b.css (two seconds). Surprisingly, IE won’t start downloading b.css until a.css finishes. In all other browsers, this blocking issue doesn’t occur, resulting in a faster page as shown in Figure 5.

Figure 4. LINK blocks @import embedded in other stylesheets in IE

Figure 5. LINK doesn't block @import embedded stylesheets in browsers other than IE
many @imports
The many @imports example shows that using @import in IE causes resources to be downloaded in a different order than specified. This example has six stylesheets (each takes two seconds to download) followed by a script (a four second download).
<style> @import url('a.css'); @import url('b.css'); @import url('c.css'); @import url('d.css'); @import url('e.css'); @import url('f.css'); </style> <script src='one.js' type='text/javascript'></script>
Looking at Figure 6, the longest bar is the four second script. Even though it was listed last, it gets downloaded first in IE. If the script contains code that depends on the styles applied from the stylesheets (a la getElementsByClassName, etc.), then unexpected results may occur because the script is loaded before the stylesheets, despite the developer listing it last.

Figure 6. @import causes resources to be downloaded out-of-order in IE
LINK LINK
It’s simpler and safer to use LINK to pull in stylesheets:
<link rel='stylesheet' type='text/css' href='a.css'> <link rel='stylesheet' type='text/css' href='b.css'>
Using LINK ensures that stylesheets will be downloaded in parallel across all browsers. The LINK LINK example demonstrates this, as shown in Figure 7. Using LINK also guarantees resources are downloaded in the order specified by the developer.

Figure 7. using link ensures parallel downloads across all browsers
These issues need to be addressed in IE. It’s especially bad that resources can end up getting downloaded in a different order. All browsers should implement a small lookahead when downloading stylesheets to extract any @import rules and start those downloads immediately. Until browsers make these changes, I recommend avoiding @import and instead using LINK for inserting stylesheets.
Update: April 10, 2009 1:07 PM
Based on questions from the comments, I added two more tests: LINK with @imports and Many LINKs. Each of these inserts four stylesheets into the HTML document. LINK with @imports uses LINK to load proxy.css; proxy.css then uses @import to load the four stylesheets. Many LINKs has four LINK tags in the HTML document to pull in the four stylesheets (my recommended approach). The HTTP waterfall charts are shown in Figure 8 and Figure 9.

Figure 8. LINK with @imports

Figure 9. Many LINKs
Looking at LINK with @imports, the first problem is that the four stylesheets don’t start downloading until after proxy.css returns. This happens in all browsers. On the other hand, Many LINKs starts downloading the stylesheets immediately.
The second problem is that IE changes the download order. I added a 10 second script (the really long bar) at the very bottom of the page. In all other browsers, the @import stylesheets (from proxy.css) get downloaded first, and the script is last, exactly the order specified. In IE, however, the script gets inserted before the @import stylesheets, as shown by LINK with @imports in Figure 8. This causes the stylesheets to take longer to download since the long script is using up one of only two connections available in IE 6&7. Since IE won’t render anything in the page until all stylesheets are downloaded, using @import in this way causes the page to be blank for 12 seconds. Using LINK instead of @import preserves the load order, as shown by Many LINKs in Figure 9. Thus, the page renders in 4 seconds.
The load times of these resources are exaggerated to make it easy to see what’s happening. But for people with slow connections, especially those in some of the world’s emerging markets, these response times may not be that far from reality. The takeaways are:
- Using @import within a stylesheet adds one more roundtrip to the overall download time of the page.
- Using @import in IE causes the download order to be altered. This may cause stylesheets to take longer to download, which hinders progressive rendering and makes the page feel slower.
Performance Impact of CSS Selectors
A few months back there were some posts about the performance impact of inefficient CSS selectors. I was intrigued – this is the kind of browser idiosyncratic behavior that I live for. On further investigation, I’m not so sure that it’s worth the time to make CSS selectors more efficient. I’ll go even farther and say I don’t think anyone would notice if we woke up tomorrow and every web page’s CSS selectors were magically optimized.
The first post that caught my eye was about CSS Qualified Selectors by Shaun Inman. This post wasn’t actually about CSS performance, but in one of the comments David Hyatt (architect for Safari and WebKit, also worked on Mozilla, Camino, and Firefox) dropped this bomb:
The sad truth about CSS3 selectors is that they really shouldn’t be used at all if you care about page performance. Decorating your markup with classes and ids and matching purely on those while avoiding all uses of sibling, descendant and child selectors will actually make a page perform significantly better in all browsers.
Wow. Let me say that again. Wow.
The next posts were amazing. It was a series on Testing CSS Performance from Jon Sykes in three parts: part 1, part 2, and part 3. It’s fun to see how his tests evolve, so part 3 is really the one to read. This had me convinced that optimizing CSS selectors was a key step to fast pages.
But there were two things about the tests that troubled me. First, the large number of DOM elements and rules worried me. The pages contain 60,000 DOM elements and 20,000 CSS rules. This is an order of magnitude more than most pages. Pages this large make browsers behave in unusual ways (we’ll get back to that later). The table below has some stats from the top ten U.S. web sites for comparison.
Web Site | # CSS Rules | # DOM Elements |
---|---|---|
AOL | 2289 | 1628 |
eBay | 305 | 588 |
Facebook | 2882 | 1966 |
Google | 92 | 552 |
Live Search | 376 | 449 |
MSN | 1038 | 886 |
MySpace | 932 | 444 |
Wikipedia | 795 | 1333 |
Yahoo! | 800 | 564 |
YouTube | 821 | 817 |
average | 1033 | 923 |
The second thing that concerned me was how small the baseline test page was, compared to the more complex pages. The main question I want to answer is “do inefficient CSS selectors slow down pages?” All five test pages contain 20,000 anchor elements (nested inside P, DIV, DIV, DIV). What changes is their CSS: baseline (no CSS), tag selector (one rule for the A tag), 20,000 class selectors, 20,000 child selectors, and finally 20,000 descendant selectors. The last three pages top out at over 3 megabytes in size. But the baseline page and tag selector page, with little or no CSS, are only 1.8 megabytes. These pages answer the question “how much faster would my page be if I eliminated all CSS?” But not many of us are going to eliminate all CSS from our pages.
I revised the test as follows:
- 2000 anchors and 2000 rules (instead of 20,000) – this actually results in ~6000 DOM elements because of all the nesting in P, DIV, DIV, DIV
- the baseline page and tag selector page have 2000 rules just like all the other pages, but these are simple class rules that don’t match any classes in the page
I ran these tests on 12 browsers. Page render time was measured with a script block at the top and bottom of the page. (I loaded the page from local disk to avoid possible impact from chunked encoding.) The results are shown in the chart below. (I don’t show Opera 9.63 – it was way too slow – but you can download all the data as csv. You can also see the test pages.)
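The measurement technique was roughly the following; this is a sketch of the idea, not the exact test code, and the variable names and the way the result is displayed are made up.

```html
<!-- One script block at the top of the page and one at the bottom;
     the delta approximates how long the page took to render. -->
<script>var tStart = new Date().getTime();</script>

<!-- ... the 6000 DOM elements and 2000 CSS rules go here ... -->

<script>
  var tEnd = new Date().getTime();
  document.title = 'render time: ' + (tEnd - tStart) + ' ms';
</script>
```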
Performance varies across browsers; strangely, two new browsers, IE 8 and Firefox 3.1, are the slowest. But comparisons should not be made from one browser to another. Although all the tests for a given browser were conducted on a single PC, different browsers might have been tested on different PCs with different performance characteristics. The goal of this experiment is not to compare browser performance – it’s to see how browsers handle progressively more complex CSS selectors.
[Revision: On further inspection comparing Firefox 3.0 and 3.1, I discovered that the test PC I used for testing Firefox 3.1 and IE 8 was slower than the other test PCs used in this experiment. I subsequently re-ran those tests as well as Firefox 3.0 and IE 7 on PCs that were more consistent and updated the chart above. Even with this re-run, because of possible differences in test hardware, do not use this data to compare one browser to another.]
Not surprisingly, the more complex pages (child selectors and descendant selectors) usually perform the worst. The biggest surprise is how small the delta is from the baseline to the most complex, worst performing test page. The average slowdown across all browsers is 50 ms, and if we look at the big ones (IE 6&7, FF3), the average delta is just 20 ms. For 70% or more of today’s users, improving these CSS selectors would only make a 20 ms improvement.
Keep in mind that these test pages are close to worst case. The 2000 anchors wrapped in P, DIV, DIV, DIV result in 6000 DOM elements – twice as many as the max in the top ten sites. And the complex pages have 2000 extremely inefficient rules – in a typical site, only around one third of the rules are complex child or descendant selectors. Facebook, for example, with the maximum number of rules at 2882, has only 750 of these extremely inefficient rules.
Why do the results from my tests suggest something different from what’s been said lately? One difference comes from looking at things at such a large scale. It’s okay to exaggerate test cases if the results are proportional to common use cases. But in this case, browsers behave differently when confronted with a 3 megabyte page with 60,000 elements and 20,000 rules. I especially noticed that my results were much different for IE 6&7. I wondered if there was a hockey stick in how IE handled CSS selectors. To investigate this, I loaded the child selector and descendant selector pages with an increasing number of anchors and rules, from 1000 to 20,000. The results, shown in the chart below, reveal that IE hits a cliff around 18,000 rules. But when IE 6&7 work on a page that is closer to reality, as in my tests, they’re actually the fastest performers.
Based on these tests I have the following hypothesis: For most web sites, the possible performance gains from optimizing CSS selectors will be small, and are not worth the costs. There are some types of CSS rules and interactions with JavaScript that can make a page noticeably slower. This is where the focus should be. So I’m starting to collect real world examples of small CSS style-related issues (offsetWidth, :hover) that put the hurt on performance. If you have some, send them my way. I’m speaking at SXSW this weekend. If you’re there, and want to discuss CSS selectors, please find me. It’s important that we’re all focusing on the performance improvements that our users will really notice.
O’Reilly Master Class
O’Reilly, my publisher, has launched a new initiative to bring a deeper level of information and engagement around their titles and technology focus areas. I’m excited to be a part of this by leading a one day workshop on Creating Higher Performance Web Sites. The workshop (or “Master Class” as they call it) is March 30, 9am-5pm, at the Mission Bay Conference Center in SF. The cost is $600 ($550 if you register before March 15).
This is going to be an engaging and fact-filled day for developers who care about web performance. I’m going to go over the best practices from my first book, High Performance Web Sites, which are also captured in YSlow. But I’m also going to touch on the chapters from my next book including loading scripts without blocking, flushing the document early, and using iframes sparingly. I’m just wrapping up these chapters now, so these are new insights “hot off the presses”. Since it’s a workshop, O’Reilly wants a fair amount of audience involvement, so I’m working up a few exercises to give attendees experience analyzing web pages to find performance bottlenecks.
There are also workshops from Doug Crockford (JavaScript: The Good Parts), Scott Berkun (Leading and Managing Breakthrough Projects), and Jonathan Zdziarski (iPhone Forensics: Recovering Evidence, Personal Data, and Corporate Assets). These workshops come right before Web 2.0 Expo in SF, so it’s a great doubleheader. I hope to see you there.
Fronteers 2009
I’m psyched to be speaking at Fronteers in November – and not just because it’s one of the best conference names ever. And not just because it’s in Amsterdam – although that is a huge plus. The main reason I’m psyched is because I missed last year’s conference and regretted it. The feedback I got was that the speakers were great and so were the attendees. PPK is active in his advocacy for frontend engineering, and (from what I heard) that was apparent in the level of knowledge and participation shown throughout the talks.
Last year’s speakers included Stuart Langridge, Christian Heilmann, and Pete LePage (check out the links to their talks on YDN). PPK has announced Nate Koechley and me as speakers for 2009, and some other web gurus I know have said they’re speaking there as well. It’s going to be another great set of speakers and sessions. I’m so glad that I’ll be there to experience it, and I hope you can make it, too.