Render first. JS second.

September 30, 2010 11:10 am | 28 Comments

Let me start with the takeaway point:

The key to creating a fast user experience in today’s web sites is to render the page as quickly as possible. To achieve this JavaScript loading and execution has to be deferred.

I’m in the middle of several big projects so my blogging rate is down. But I got an email today about asynchronous JavaScript loading and execution. I started to type up my lengthy response and remembered one of those tips for being more productive: “type shorter emails – no one reads long emails anyway”. That just doesn’t resonate with me. I like typing long emails. I love going into the details. But, I agree that an email response that only a few people might read is not the best investment of time. So I’m writing up my response here.

It took me months to research and write the “Loading Scripts Without Blocking” chapter from Even Faster Web Sites. Months for a single chapter! I wasn’t the first person to do async script loading – I noticed it on MSN way before I started that chapter – but that work paid off. There has been more research on async script loading from folks like Google, Facebook and Meebo. Most JavaScript frameworks have async script loading features – two examples are YUI and LABjs. And 8 of today’s Alexa Top 10 US sites use advanced techniques to load scripts without blocking: Google, Facebook, Yahoo!, YouTube, Amazon, Twitter, Craigslist(!), and Bing. Yay!

The downside is – although web sites are doing a better job of downloading scripts without blocking, once those scripts arrive their execution still blocks the page from rendering. Getting the content in front of the user as quickly as possible is the goal. If asynchronous scripts arrive while the page is loading, the browser has to stop rendering in order to parse and execute those scripts. This is the biggest obstacle to creating a fast user experience. I don’t have scientific results that I can cite to substantiate this claim (that’s part of the big projects I’m working on). But anyone who disables JavaScript in their browser can attest that sites feel twice as fast.

My #1 goal right now is to figure out ways that web sites can defer all JavaScript execution until after the page has rendered. Achieving this goal is going to involve advances from multiple camps – changes to browsers, new web development techniques, and new pieces of infrastructure. I’ve been talking this up for a year or so. When I mention this idea these are the typical arguments I hear for why this won’t work:

JavaScript execution is too complex
People point out that: “JavaScript is a powerful language and developers use it in weird, unexpected ways. Eval and document.write create unexpected dependencies. Blanketedly delaying all JavaScript execution is going to break too many web sites.”

In response to this argument I point to Opera’s Delayed Script Execution feature. I encourage you to turn it on, surf around, and  try to find a site that breaks. Even sites like Gmail and Facebook work! I’m sure there are some sites that have problems (perhaps that’s why this feature is off by default). But if some sites do have problems, how many sites are we talking about? And what’s the severity of the problems? We definitely don’t want errors, rendering problems, or loss of ad revenue. Even though Opera has had this feature for over two years (!), I haven’t heard much discussion about it. Imagine what could happen if significant resources focused on this problem.

the page content is actually generated by JS execution
Yes, this happens too much IMO. The typical logic is: “We’re building an Ajax app so the code to create and manipulate DOM elements has to be written in JavaScript. We could also write that logic on the serverside (in C++, Java, PHP, Python, etc.), but then we’re maintaining two code bases. Instead, we download everything as JSON and create the DOM in JavaScript on the client – even for the initial page view.”
If this is the architecture for your web app, then it’s true – you’re going to have to download and execute (a boatload) of JavaScript before the user can see the page. “But it’s only for the first page view!” The first page view is the most important one. “People start our web app and then leave it running all day, so they only incur the initial page load once.” Really? Have you verified this? In my experience, users frequently close their tab or browser, or even reboot. “But then at least they have the scripts in their cache.” Even so, the scripts still have to be executed and they’re probably going to have to refetch the JSON responses that include data that could have changed.
Luckily, there’s a strong movement toward server-side JavaScript – see Doug Crockford’s Loopage talk and node.js. This would allow that JavaScript code to run on the server, render the DOM, and serve it as HTML to the browser so that it renders quickly without needing JavaScript. The scripts needed for subsequent dynamic Ajaxy behavior can be downloaded lazily after the page has rendered. I expect we’ll see more solutions to address this particular problem of serving the entire page as HTML, even parts that are rendered using JavaScript
doing this actually makes the page slower
The crux of this argument centers around how you define “slower”. The old school way of measuring performance is to look at how long it takes for the window’s load event to fire. (It’s so cool to call onload “old school”, but a whole series of blog posts on better ways to measure performance is called for.)
It is true that delaying script execution could result in larger onload times. Although delaying scripts until after onload is one solution to this problem, it’s more important to talk about performance yardsticks. These days I focus on how quickly the critical page content renders. Many of Amazon’s pages, for example, have a lot of content and consequently can have a large onload time, but they do an amazing job of getting the content above-the-fold to render quickly.
WebPagetest.org‘s video comparison capabilities help hammer this home. Take a look at this video of Techcrunch, Mashable, and a few other tech news sites loading. This forces you to look at it from the user’s perspective. It doesn’t really matter when the load event fired, what matters is when you can see the page. Reducing render time is more in tune with creating a faster user experience, and delaying JavaScript is key to reducing render time.

What are the next steps?

  • Browsers should look at Opera’s behavior and implement the SCRIPT ASYNC and DEFER attributes.
  • Developers should adopt asynchronous script loading techniques and avoid rendering the initial page view with JavaScript on the client.
  • Third party snippet providers, most notably ads, need to move away from document.write.
Rendering first and executing JavaScript second is the key to a faster Web.

28 Comments

newtwitter performance analysis

September 22, 2010 10:51 pm | 15 Comments

Among the exciting launches last week was newtwitter – the amazing revamp of the Twitter UI. I use Twitter a lot and was all over the new release, and so was stoked to see this tweet from Twitter developer Ben Cherry:

The new Twitter UI looks good, but how does it score when it comes to performance? I spent a few hours investigating. I always start with HTTP waterfall charts, typically generated by HttpWatch. I look at Firefox and IE because they’re the most popular browsers (and I use a lot of Firefox tools). Here’s the waterfall chart for Firefox 3.6.10:

I used to look at IE7 but its market share is dropping, so now I start with IE8. Here’s the waterfall chart for IE8:

From the waterfall charts I generate a summary to get an idea of the overall page size and potential for problems.

Firefox 3.6.10 IE8
main content rendered ~4 secs ~5 secs
window onload ~2 secs ~7 secs
network activity done ~5 secs ~7 secs
# requests 53 52
total bytes downloaded 428 kB 442 kB
JS bytes downloaded 181 kB 181 kB
CSS bytes downloaded 21 kB 21 kB

I study the waterfall charts looking for problems, primarily focused on places where parallel downloading stops and where there are white gaps. Then I run Page Speed and YSlow for more recommendations. Overall newtwitter does well, scoring 90 on Page Speed and 86 on YSlow. Combining all of this investigation results in this list of performance suggestions:

  1. Script Loading – The most important thing to do for performance is get JavaScript out of the way. “Out of the way” has two parts:
    • blocking downloads – Twitter is now using the new Google Analytics async snippet (yay!) so ga.js isn’t blocking. However, base.bundle.js is loaded using normal SCRIPT SRC, so it blocks. In newer browsers there will be some parallel downloads, but even IE9 will block images until the script is done downloading. In both Firefox and IE it’s clear this is causing a break in parallel downloads. phoenix.bundle.js and api.bundle.js are loaded using LABjs in conservative mode. In both Firefox and IE there’s a big block after these two scripts. It could be LABjs missing an opportunity, or it could be that the JavaScript is executing. But it’s worth investigating why all the resources lower in the page (~40 of them) don’t start downloading until after these scripts are fetched. It’d be better to get more parallelized downloads.
    • blocked rendering – The main part of the page takes 4-5 seconds to render. In some cases this is because the browser won’t render anything until all the JavaScript is downloaded. However, in this page the bulk of the content is generated by JS. (I concluded this after seeing that many of the image resources are specified in JS.) Rather than download a bunch of scripts to dynamically create the DOM, it’d be better to do this on the server side as HTML as part of the main HTML document. This can be a lot of work, but the page will never be fast for users if they have to wait for JavaScript to draw the main content in the page. After the HTML is rendered, the scripts can be downloaded in the background to attach dynamic behavior to the page.
  2. scattered inline scripts – Fewer inline scripts are better. The main reason is that a stylesheet followed by an inline script blocks subsequent downloads. (See Positioning Inline Scripts.) In this page, phoenix.bundle.css is followed by the Google Analytics inline script. This will cause the resources below that point to be blocked – in this case the images. It’d be better to move the GA snippet to the SCRIPT tag right above the stylesheet.
  3. Optimize Images - This one image alone could be optimized to save over 40K (out of 52K): http://s.twimg.com/a/1285108869/phoenix/img/tweet-dogear.png.
  4. Expires header is missing - Strangely, some images are missing both the Expires and Cache-Control headers, for example, http://a2.twimg.com/profile_images/30575362/48_normal.png. My guess is this is an origin server push problem.
  5. Cache-Control header is missing – Scripts from Amazon S3 have an Expires header, but no Cache-Control header (eg http://a2.twimg.com/a/1285097693/javascripts/base.bundle.js). This isn’t terrible, but it’d be good to include Cache-Control: max-age. The reason is that Cache-Control: max-age is relative (“# of seconds from right now”) whereas Expires is absolute (“Wed, 21 Sep 2011 20:44:22 GMT”). If the client has a skewed clock the actual cache time could be different than expected. In reality this happens infrequently.
  6. redirect - http://www.twitter.com/ redirects to http://twitter.com/. I love using twitter.com instead of www.twitter.com, but for people who use “www.” it would be better not to redirect them. You can check your logs and see how often this happens. If it’s small (< 1% of unique users per day) then it’s not a big problem.
  7. two spinners – Could one of these spinners be eliminated: http://twitter.com/images/spinner.gif and http://twitter.com/phoenix/img/loader.gif ?
  8. mini before profile – In IE the “mini” images are downloaded before the normal “profile” images. I think the profile images are more important, and I wonder why the order in IE is different than Firefox.
  9. CSS cleanup – Page Speed reports that there are > 70 very inefficient CSS selectors.
  10. Minify - Minifying the HTML document would save ~5K. Not big, but it’s three roundtrips for people on slow connections.

Overall newtwitter is beautiful and fast. Most of the Top 10 web sites are using advanced techniques for loading JavaScript. That’s the key to making today’s web apps fast.

15 Comments

Twitter vs blogging

September 15, 2010 2:21 pm | 8 Comments

My rate of blogging has dropped dramatically since I saw how @dalmaer was able to get so much news out via Twitter. I was slow to adopt Twitter, but now I love it. If you follow my blog but aren’t following me on Twitter, you should start following me: @souders. Here are some of the important tweets I’ve made in the last few days:

You get the idea. I won’t do this blog-retweeting again, but if you care about web performance I hope you’ll follow me on Twitter.

@souders

8 Comments

WebPagetest.org and Page Speed

September 7, 2010 9:36 am | 2 Comments

Pat Meenan just blogged about Page Speed results now available in Webpagetest. This is a great step toward greater consistency in the world of web performance, something that benefits developers and ultimately benefits web users.

I’ve been spreading the meme about WPO – the industry emerging around web performance optimization. I contend that in the early evolution of a new technology industry it’s better to strive for standardization and worry about differentiation later once the technology movement is well established.

One area where WPO needs more standardization is performance analysis. There are numerous performance analysis tools available, including Page Speed, YSlow, AOL Pagetest, MSFast, VRTA, and neXpert. There’s some commonality across these tools, but the differences are what’s noticeable. To get a thorough performance scrubbing, web developers have no choice but to run multiple tools and sift through the different results trying to find the most important recommendations. It would be better for developers to have a more standard way of analyzing performance across environments.

The Page Speed SDK provides a path to achieve this. When it was released, I easily stood up a web page that produces a Page Speed performance analysis from HAR files. Today, Pat has integrated the same SDK into WebPagetest.org. With wider adoption of the Page Speed SDK, we’re moving to having a more consistent performance analysis regardless of what browser and development environment you work in.

2 Comments