Render first. JS second.
Let me start with the takeaway point:
The key to creating a fast user experience in today’s web sites is to render the page as quickly as possible. To achieve this, JavaScript loading and execution have to be deferred.
I’m in the middle of several big projects so my blogging rate is down. But I got an email today about asynchronous JavaScript loading and execution. I started to type up my lengthy response and remembered one of those tips for being more productive: “type shorter emails – no one reads long emails anyway”. That just doesn’t resonate with me. I like typing long emails. I love going into the details. But, I agree that an email response that only a few people might read is not the best investment of time. So I’m writing up my response here.
It took me months to research and write the “Loading Scripts Without Blocking” chapter from Even Faster Web Sites. Months for a single chapter! I wasn’t the first person to do async script loading – I noticed it on MSN way before I started that chapter – but that work paid off. There has been more research on async script loading from folks like Google, Facebook and Meebo. Most JavaScript frameworks have async script loading features – two examples are YUI and LABjs. And 8 of today’s Alexa Top 10 US sites use advanced techniques to load scripts without blocking: Google, Facebook, Yahoo!, YouTube, Amazon, Twitter, Craigslist(!), and Bing. Yay!
The downside is – although web sites are doing a better job of downloading scripts without blocking, once those scripts arrive their execution still blocks the page from rendering. Getting the content in front of the user as quickly as possible is the goal. If asynchronous scripts arrive while the page is loading, the browser has to stop rendering in order to parse and execute those scripts. This is the biggest obstacle to creating a fast user experience. I don’t have scientific results that I can cite to substantiate this claim (that’s part of the big projects I’m working on). But anyone who disables JavaScript in their browser can attest that sites feel twice as fast.
My #1 goal right now is to figure out ways that web sites can defer all JavaScript execution until after the page has rendered. Achieving this goal is going to involve advances from multiple camps – changes to browsers, new web development techniques, and new pieces of infrastructure. I’ve been talking this up for a year or so. When I mention this idea these are the typical arguments I hear for why this won’t work:
In response to this argument I point to Opera’s Delayed Script Execution feature. I encourage you to turn it on, surf around, and try to find a site that breaks. Even sites like Gmail and Facebook work! I’m sure there are some sites that have problems (perhaps that’s why this feature is off by default). But if some sites do have problems, how many sites are we talking about? And what’s the severity of the problems? We definitely don’t want errors, rendering problems, or loss of ad revenue. Even though Opera has had this feature for over two years (!), I haven’t heard much discussion about it. Imagine what could happen if significant resources focused on this problem.
What are the next steps?
- Browsers should look at Opera’s behavior and implement the SCRIPT ASYNC and DEFER attributes.
- Developers should adopt asynchronous script loading techniques and avoid rendering the initial page view with JavaScript on the client.
- Third party snippet providers, most notably ads, need to move away from document.write.
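For the second item, here is a minimal sketch of one common way to load a script without blocking: create a SCRIPT element from JavaScript and insert it into the page. The function name `loadScriptAsync`, the `doc` parameter, and the callback are illustrative names of my own, not an API from any particular library. Passing the document in as an argument is only a convenience so the function can be exercised outside a browser.

```javascript
// Sketch: load a script without blocking by inserting a SCRIPT element.
// `doc` is the page's document object; `src` is the script URL;
// `onDone` is an optional callback (all three names are hypothetical).
function loadScriptAsync(doc, src, onDone) {
  var script = doc.createElement('script');
  script.src = src;
  script.async = true; // hint for browsers that support the ASYNC attribute

  var done = false;
  // onload covers most browsers; onreadystatechange covers older IE.
  script.onload = script.onreadystatechange = function () {
    var state = script.readyState;
    if (done || (state && state !== 'loaded' && state !== 'complete')) {
      return;
    }
    done = true;
    script.onload = script.onreadystatechange = null; // avoid double calls
    if (onDone) {
      onDone(src);
    }
  };

  // Insert before the first script already in the page.
  var first = doc.getElementsByTagName('script')[0];
  first.parentNode.insertBefore(script, first);
  return script;
}

// In a browser this might be called as:
//   loadScriptAsync(document, '/js/app.js', function () { /* init */ });
```

Note that this only addresses the download. Once the script arrives, its execution can still block rendering, which is the larger problem described above.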
newtwitter performance analysis
Among the exciting launches last week was newtwitter – the amazing revamp of the Twitter UI. I use Twitter a lot and was all over the new release, and so was stoked to see this tweet from Twitter developer Ben Cherry:

The new Twitter UI looks good, but how does it score when it comes to performance? I spent a few hours investigating. I always start with HTTP waterfall charts, typically generated by HttpWatch. I look at Firefox and IE because they’re the most popular browsers (and I use a lot of Firefox tools). Here’s the waterfall chart for Firefox 3.6.10:

I used to look at IE7 but its market share is dropping, so now I start with IE8. Here’s the waterfall chart for IE8:

From the waterfall charts I generate a summary to get an idea of the overall page size and potential for problems.
| | Firefox 3.6.10 | IE8 |
|---|---|---|
| main content rendered | ~4 secs | ~5 secs |
| window onload | ~2 secs | ~7 secs |
| network activity done | ~5 secs | ~7 secs |
| # requests | 53 | 52 |
| total bytes downloaded | 428 kB | 442 kB |
| JS bytes downloaded | 181 kB | 181 kB |
| CSS bytes downloaded | 21 kB | 21 kB |
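The request and byte counts in a summary like this can be computed mechanically if the waterfall is exported as a HAR file. The sketch below assumes HAR-style input (`log.entries[].response.bodySize` and `response.content.mimeType`). The function name `summarizeHar` and the sample entries are made up for illustration and are not Twitter's data.

```javascript
// Sketch: summarize HAR entries into request count and bytes by type.
function summarizeHar(har) {
  var summary = { requests: 0, totalBytes: 0, jsBytes: 0, cssBytes: 0 };
  har.log.entries.forEach(function (entry) {
    var res = entry.response;
    // HAR uses -1 for "size not available"; count those as 0 bytes.
    var bytes = Math.max(res.bodySize, 0);
    var mime = (res.content && res.content.mimeType) || '';
    summary.requests += 1;
    summary.totalBytes += bytes;
    if (/javascript|ecmascript/i.test(mime)) {
      summary.jsBytes += bytes;
    } else if (/^text\/css/i.test(mime)) {
      summary.cssBytes += bytes;
    }
  });
  return summary;
}

// Made-up sample input, only to show the expected shape.
var sampleHar = {
  log: {
    entries: [
      { response: { bodySize: 1200, content: { mimeType: 'text/html' } } },
      { response: { bodySize: 5000, content: { mimeType: 'application/javascript' } } },
      { response: { bodySize: 800, content: { mimeType: 'text/css' } } }
    ]
  }
};
console.log(summarizeHar(sampleHar));
```

Timing rows such as "main content rendered" still need to be read off the waterfall chart by eye; this only covers the size and count rows.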
I study the waterfall charts looking for problems, primarily focused on places where parallel downloading stops and where there are white gaps. Then I run Page Speed and YSlow for more recommendations. Overall newtwitter does well, scoring 90 on Page Speed and 86 on YSlow. Combining all of this investigation results in this list of performance suggestions:
- Script Loading – The most important thing to do for performance is get JavaScript out of the way. “Out of the way” has two parts:
  - blocking downloads – Twitter is now using the new Google Analytics async snippet (yay!) so ga.js isn’t blocking. However, base.bundle.js is loaded using normal SCRIPT SRC, so it blocks. In newer browsers there will be some parallel downloads, but even IE9 will block images until the script is done downloading. In both Firefox and IE it’s clear this is causing a break in parallel downloads. phoenix.bundle.js and api.bundle.js are loaded using LABjs in conservative mode. In both Firefox and IE there’s a big block after these two scripts. It could be LABjs missing an opportunity, or it could be that the JavaScript is executing. But it’s worth investigating why all the resources lower in the page (~40 of them) don’t start downloading until after these scripts are fetched. It’d be better to get more parallelized downloads.
  - blocked rendering – The main part of the page takes 4-5 seconds to render. In some cases this is because the browser won’t render anything until all the JavaScript is downloaded. However, in this page the bulk of the content is generated by JS. (I concluded this after seeing that many of the image resources are specified in JS.) Rather than download a bunch of scripts to dynamically create the DOM, it’d be better to do this on the server side as HTML as part of the main HTML document. This can be a lot of work, but the page will never be fast for users if they have to wait for JavaScript to draw the main content in the page. After the HTML is rendered, the scripts can be downloaded in the background to attach dynamic behavior to the page.
- scattered inline scripts – Fewer inline scripts are better. The main reason is that a stylesheet followed by an inline script blocks subsequent downloads. (See Positioning Inline Scripts.) In this page, phoenix.bundle.css is followed by the Google Analytics inline script. This will cause the resources below that point to be blocked – in this case the images. It’d be better to move the GA snippet to the SCRIPT tag right above the stylesheet.
- Optimize Images – This one image alone could be optimized to save over 40K (out of 52K): http://s.twimg.com/a/1285108869/phoenix/img/tweet-dogear.png.
- Expires header is missing – Strangely, some images are missing both the Expires and Cache-Control headers, for example, http://a2.twimg.com/profile_images/30575362/48_normal.png. My guess is this is an origin server push problem.
- Cache-Control header is missing – Scripts from Amazon S3 have an Expires header, but no Cache-Control header (eg http://a2.twimg.com/a/1285097693/javascripts/base.bundle.js). This isn’t terrible, but it’d be good to include Cache-Control: max-age. The reason is that Cache-Control: max-age is relative (“# of seconds from right now”) whereas Expires is absolute (“Wed, 21 Sep 2011 20:44:22 GMT”). If the client has a skewed clock the actual cache time could be different than expected. In reality this happens infrequently.
- redirect – http://www.twitter.com/ redirects to http://twitter.com/. I love using twitter.com instead of www.twitter.com, but for people who use “www.” it would be better not to redirect them. You can check your logs and see how often this happens. If it’s small (< 1% of unique users per day) then it’s not a big problem.
- two spinners – Could one of these spinners be eliminated: http://twitter.com/images/spinner.gif and http://twitter.com/phoenix/img/loader.gif ?
- mini before profile – In IE the “mini” images are downloaded before the normal “profile” images. I think the profile images are more important, and I wonder why the order in IE is different than Firefox.
- CSS cleanup – Page Speed reports that there are > 70 very inefficient CSS selectors.
- Minify – Minifying the HTML document would save ~5K. Not big, but it’s three roundtrips for people on slow connections.
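To make the Cache-Control item above concrete, here is a small sketch of why a relative `max-age` is more robust to clock skew than an absolute `Expires` date. It models a naive client that compares `Expires` directly against its own clock. That is a simplification I'm assuming for illustration; real HTTP caches apply more involved age calculations. The function name and the header object shape are hypothetical.

```javascript
// Simplified model: how many seconds a client would treat a response as fresh.
// `headers` uses lower-case header names; `clientNowMs` is the client's
// (possibly skewed) clock in milliseconds since the epoch.
function cacheSecondsSeenByClient(headers, clientNowMs) {
  var cacheControl = headers['cache-control'] || '';
  var match = /max-age=(\d+)/i.exec(cacheControl);
  if (match) {
    // Relative: "# of seconds from right now", independent of the clock.
    return parseInt(match[1], 10);
  }
  if (headers['expires']) {
    // Absolute: depends on what the client believes "now" is.
    var expiresMs = Date.parse(headers['expires']);
    return Math.max(0, Math.round((expiresMs - clientNowMs) / 1000));
  }
  return 0; // no caching headers in this simplified model
}

// The server sends the response exactly one year before the Expires date.
var serverNowMs = Date.UTC(2010, 8, 21, 20, 44, 22);
var oneDayMs = 24 * 60 * 60 * 1000;
var expiresOnly = { 'expires': 'Wed, 21 Sep 2011 20:44:22 GMT' };
var withMaxAge = { 'cache-control': 'max-age=31536000' };

console.log(cacheSecondsSeenByClient(expiresOnly, serverNowMs));
console.log(cacheSecondsSeenByClient(expiresOnly, serverNowMs + oneDayMs));
console.log(cacheSecondsSeenByClient(withMaxAge, serverNowMs + oneDayMs));
```

With only `Expires`, a client whose clock runs a day ahead caches the resource for a day less than intended. With `max-age` the lifetime is the same regardless of the client's clock.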
Overall newtwitter is beautiful and fast. Most of the Top 10 web sites are using advanced techniques for loading JavaScript. That’s the key to making today’s web apps fast.
Twitter vs blogging
My rate of blogging has dropped dramatically since I saw how @dalmaer was able to get so much news out via Twitter. I was slow to adopt Twitter, but now I love it. If you follow my blog but aren’t following me on Twitter, you should start following me: @souders. Here are some of the important tweets I’ve made in the last few days:
- IE 9 Beta is now available in http://WebPagetest.org – let’s hear it for @patmeenan!
- IE blog about the need for realistic benchmarks http://bit.ly/a3WR1c; Mozilla announces “Kraken focuses on realistic workloads” http://bit.ly/9wMGYn
- VCs tend to back companies with fast web sites (via @joshuabixby) – http://bit.ly/aLXaIf
- Do you use <meta name="viewport" content="width=device-width"> in your mobile web app? You should (via @ppk). http://bit.ly/9IuhzT
- Browserscope shows IE9 beta’s network behavior is improved. I’m surprised scripts & images can’t download in parallel. http://bit.ly/dpileW
- IE6 is now available on http://WebPagetest.org for Dulles, VA. Awesome work by Pat Meenan.
- Great survey of how YSlow score relates to page load time from Yottaa: http://blog.yottaa.com/2010/09/how-important-is-my-yslow-score/
- Be careful using Web Inspector for performance analysis; JS execution can cause network events to be mistimed. Speed Tracer has the same problem.
- Had to file a FF bug and forgot the URL. Then remembered this Browser Resources page from Browserscope: http://www.browserscope.org/browsers
- About to start the first meeting of the W3C Web Performance Working Group: http://www.w3.org/2010/webperf/
You get the idea. I won’t do this blog-retweeting again, but if you care about web performance I hope you’ll follow me on Twitter.
WebPagetest.org and Page Speed
Pat Meenan just blogged that Page Speed results are now available in WebPagetest. This is a great step toward greater consistency in the world of web performance, something that benefits developers and ultimately benefits web users.
I’ve been spreading the meme about WPO – the industry emerging around web performance optimization. I contend that in the early evolution of a new technology industry it’s better to strive for standardization and worry about differentiation later once the technology movement is well established.
One area where WPO needs more standardization is performance analysis. There are numerous performance analysis tools available, including Page Speed, YSlow, AOL Pagetest, MSFast, VRTA, and neXpert. There’s some commonality across these tools, but the differences are what’s noticeable. To get a thorough performance scrubbing, web developers have no choice but to run multiple tools and sift through the different results trying to find the most important recommendations. It would be better for developers to have a more standard way of analyzing performance across environments.
The Page Speed SDK provides a path to achieve this. When it was released, I easily stood up a web page that produces a Page Speed performance analysis from HAR files. Today, Pat has integrated the same SDK into WebPagetest.org. With wider adoption of the Page Speed SDK, we’re moving toward more consistent performance analysis regardless of which browser and development environment you work in.
