Performance Impact of CSS Selectors
A few months back there were some posts about the performance impact of inefficient CSS selectors. I was intrigued – this is the kind of browser idiosyncratic behavior that I live for. On further investigation, I’m not so sure that it’s worth the time to make CSS selectors more efficient. I’ll go even farther and say I don’t think anyone would notice if we woke up tomorrow and every web page’s CSS selectors were magically optimized.
The first post that caught my eye was about CSS Qualified Selectors by Shaun Inman. This post wasn’t actually about CSS performance, but in one of the comments David Hyatt (architect for Safari and WebKit, also worked on Mozilla, Camino, and Firefox) dropped this bomb:
The sad truth about CSS3 selectors is that they really shouldn’t be used at all if you care about page performance. Decorating your markup with classes and ids and matching purely on those while avoiding all uses of sibling, descendant and child selectors will actually make a page perform significantly better in all browsers.
Wow. Let me say that again. Wow.
The next posts were amazing. It was a series on Testing CSS Performance from Jon Sykes in three parts: part 1, part 2, and part 3. It’s fun to see how his tests evolve, so part 3 is really the one to read. This had me convinced that optimizing CSS selectors was a key step to fast pages.
But there were two things about the tests that troubled me. First, the large number of DOM elements and rules worried me. The pages contain 60,000 DOM elements and 20,000 CSS rules. This is an order of magnitude more than most pages. Pages this large make browsers behave in unusual ways (we’ll get back to that later). The table below has some stats from the top ten U.S. web sites for comparison.
|Web Site||# CSS Rules
The second thing that concerned me was how small the baseline test page was, compared to the more complex pages. The main question I want to answer is “do inefficient CSS selectors slow down pages?” All five test pages contain 20,000 anchor elements (nested inside P, DIV, DIV, DIV). What changes is their CSS: baseline (no CSS), tag selector (one rule for the A tag), 20,000 class selectors, 20,000 child selectors, and finally 20,000 descendant selectors. The last three pages top out at over 3 megabytes in size. But the baseline page and tag selector page, with little or no CSS, are only 1.8 megabytes. These pages answer the question “how much faster would my page be if I eliminated all CSS?” But not many of us are going to eliminate all CSS from our pages.
I revised the test as follows:
- 2000 anchors and 2000 rules (instead of 20,000) – this actually results in ~6000 DOM elements because of all the nesting in P, DIV, DIV, DIV
- the baseline page and tag selector page have 2000 rules just like all the other pages, but these are simple class rules that don’t match any classes in the page
I ran these tests on 12 browsers. Page render time was measured with a script block at the top and bottom of the page. (I loaded the page from local disk to avoid possible impact from chunked encoding.) The results are shown in the chart below. (I don’t show Opera 9.63 – it was way too slow – but you can download all the data as csv. You can also see the test pages.)
Performance varies across browsers; strangely, two new browsers, IE 8 and Firefox 3.1, are the slowest but comparisons should not be made from one browser to another. Although all the tests for a given browser were conducted on a single PC, different browsers might have been tested on different PCs with different performance characteristics. The goal of this experiment is not to compare browser performance – it’s to see how browsers handle progressively more complex CSS selectors.
[Revision: On further inspection comparing Firefox 3.0 and 3.1, I discovered that the test PC I used for testing Firefox 3.1 and IE 8 was slower than the other test PCs used in this experiment. I subsequently re-ran those tests as well as Firefox 3.0 and IE 7 on PCs that were more consistent and updated the chart above. Even with this re-run, because of possible differences in test hardware, do not use this data to compare one browser to another.]
Not surprisingly, the more complex pages (child selectors and descendant selectors) usually perform the worst. The biggest surprise is how small the delta is from the baseline to the most complex, worst performing test page. The average slowdown across all browsers is 50 ms, and if we look at the big ones (IE 6&7, FF3), the average delta is just 20 ms. For 70% or more of today’s users, improving these CSS selectors would only make a 20 ms improvement.
Keep in mind – these test pages are close to worst case. The 2000 anchors wrapped in P, DIV, DIV, DIV result in 6000 DOM elements – that’s twice as big as the max in the top ten sites. And the complex pages have 2000 extremely inefficient rules – a typical site has around one third of their rules that are complex child or descendant selectors. Facebook, for example, with the maximum number of rules at 2882 only has 750 that are these extremely inefficient rules.
Why do the results from my tests suggest something different from what’s been said lately? One difference comes from looking at things at such a large scale. It’s okay to exaggerate test cases if the results are proportional to common use cases. But in this case, browsers behave differently when confronted with a 3 megabyte page with 60,000 elements and 20,000 rules. I especially noticed that my results were much different for IE 6&7. I wondered if there was a hockey stick in how IE handled CSS selectors. To investigate this I loaded the child selector and descendant selector pages with increasing number of anchors and rules, from 1000 to 20,000. The results, shown in the chart below, reveal that IE hits a cliff around 18,000 rules. But when IE 6&7 work on a page that is closer to reality, as in my tests, they’re actually the fastest performers.