ControlJS part 3: overriding document.write

December 15, 2010 11:18 pm | 20 Comments

This is the third of three blog posts about ControlJS – a JavaScript module for making scripts load faster. The three blog posts describe how ControlJS is used for async loading, delayed execution, and overriding document.write.

The goal of ControlJS is to give developers more control over how JavaScript is loaded. The key insight is to recognize that “loading” has two phases: download (fetching the bytes) and execution (including parsing). In ControlJS part 1 I focused on the download phase using ControlJS to download scripts asynchronously. ControlJS part 2 showed how delaying script execution makes pages load faster, especially for Ajax web apps that download a lot of JavaScript that’s not used immediately. In this post I look at how document.write causes issues when loading scripts asynchronously.

I like ControlJS, but I’ll admit now that I was unable to solve the document.write problem to my satisfaction. Nevertheless, I describe the dent I put in the problem of document.write and point out two alternatives that do a wonderful job of making async work in spite of document.write.

Async and document.write

ControlJS loads scripts asynchronously. By default these async scripts are executed after window onload. If one of these async scripts calls document.write the page is erased. That’s because calling document.write after the document is loaded automatically calls document.open. Anything written to the open document replaces the existing document content. Obviously, this is not the desired behavior.

I wanted ControlJS to work on all scripts in the page. Scripts for ads are notorious for using document.write, but I also found scripts created by website owners that use document.write. I didn’t want those scripts to be inadvertently loaded with ControlJS and have the page get wiped out.

The problem was made slightly easier for me because ControlJS is in control when each inline and external script is being executed. My solution is to override document.write and capture the output for each script one at a time. If there’s no document.write output then proceed to the next script. If there is output then I create an element (for now a SPAN), insert it into the DOM immediately before the SCRIPT tag that’s being executed, and set the SPAN’s innerHTML to the output from document.write.

This works pretty well. I downloaded a local copy of CNN.com and converted it to ControlJS. There were two places where document.write was used (including an ad) and both worked fine. But there are issues. One issue I encountered is that wrapping the output in a SPAN can break certain behavior like hover handlers. But a bigger issue is that browsers ignore “<script…” added via innerHTML, and one of the main ways that ads use document.write is to insert scripts. I have a weak solution to this that parses out the “<script…” HTML and creates a SCRIPT DOM element with the appropriate SRC. It works, but is not very robust.

ControlJS is still valuable even without a robust answer for document.write. You don’t have to convert every script in the page to use ControlJS. If you know an ad or widget uses document.write you can load those scripts the normal way. Or you can test and see how complex the document.write is – perhaps it’s a use cases that ControlJS handles.

Real solutions for async document.write

I can’t end this series of posts without highlighting two alternatives that have solved the asynchronous document.write issue: Aptimize and GhostWriter.

Aptimize sells a web accelerator that changes HTML markup on the fly to have better performance. In September they announced the capability to load scripts asynchronously – including ads. As far as I know this is the first product that offered this feature. I had lunch with Ed and Derek from Aptimize this week and found out more about their implementation. It sounds solid.

Another solution is GhostWriter from Digital Fulcrum. Will contacted me and we had a concall and a few email exchanges about his implementation that uses an HTML parser on the document.write output. You can try the code for free.

Wrapping it up

ControlJS has many features including some that aren’t found in other script loading modules:

  • downloads scripts asynchronously
  • handles both inline scripts and external scripts
  • delays script execution until after the page has rendered
  • allows for scripts to be downloaded and not executed
  • integrates with simple changes to HTML – no code changes
  • solves some document.write async use cases
  • control.js itself is loaded asynchronously

ControlJS is open source. I hope some of these features are borrowed and added to other script loaders. There’s still more features to implement (async stylesheets, better document.write handling) and ControlJS has had minimal testing. If you’d like to add features or fix bugs please take a look at the ControlJS project on Google Code and contact the group via the ControlJS discussion list.

20 Comments

ControlJS part 2: delayed execution

December 15, 2010 11:11 pm | 15 Comments

This is the second of three blog posts about ControlJS – a JavaScript module for making scripts load faster. The three blog posts describe how ControlJS is used for async loading, delayed execution, and overriding document.write.

The goal of ControlJS is to give developers more control over how JavaScript is loaded. The key insight is to recognize that “loading” has two phases: download (fetching the bytes) and execution (including parsing). In ControlJS part 1 I focused on the download phase using ControlJS to download scripts asynchronously. In this post I focus on the benefits ControlJS brings to the execution phase of script loading.

The issue with execution

The issue with the execution phase is that while the browser is parsing and executing JavaScript it can’t do much else. From a performance perspective this means the browser UI is unresponsive, page rendering stops, and the browser won’t start any new downloads.

The execution phase can take more time than you might think. I don’t have hard numbers on how much time is spent in this phase (although it wouldn’t be that hard to collect this data with dynaTrace Ajax Edition or Speed Tracer). If you do anything computationally intensive with a lot of DOM interaction the blocking effects can be noticeable. If you have a large amount of JavaScript just parsing the code can take a significant amount of time.

If all the JavaScript was used immediately to render the page we might have to throw up our hands and endure these delays, but it turns out that a lot of the JavaScript downloaded isn’t used right away. Across the Alexa US Top 10 only 29% of the downloaded JavaScript is called before the window load event. (I surveyed this using Page Speed‘s “defer JavaScript” feature.) The other 71% of the code is parsed and executed even though it’s not used for rendering the initial page. For these pages an average of 229 kB of JavaScript is downloaded. That 229 kB is compressed – the uncompressed size is upwards of 500 kB. That’s a lot of JavaScript. Rather than parse and execute 71% of the JavaScript in the middle of page loading, it would be better to avoid that pain until after the page is done rendering. But how can a developer do that?

Loading feature code

To ease our discussion let’s call that 29% of code used to render the page immediate-code and we’ll call the other 71% feature-code. The feature-code is typically for DHTML and Ajax features such as drop down menus, popup dialogs, and friend requests where the code is only executed if the user exercises the feature. (That’s why it’s not showing up in the list of functions called before window onload.)

Let’s assume you’ve successfully split your JavaScript into immediate-code and feature-code. The immediate-code scripts are loaded as part of the initial page loading process using ControlJS’s async loading capabilities. The additional scripts that contain the feature-code could also be loaded this way during the initial page load process, but then the browser is going to lock up parsing and executing code that’s not immediately needed. So we don’t want to load the feature-code during the initial page loading process.

Another approach would be to load the feature-code after the initial page is fully rendered. The scripts could be loaded in the background as part of an onload handler. But even though the feature-code scripts are downloaded in the background, the browser still locks up when it parses and executes them. If the user tries to interact with the page the UI is unresponsive. This unnecessary delay is painful enough that the Gmail mobile team went out of their way to avoid it. To make matters worse, the user has no idea why this is happening since they didn’t do anything to cause the lock up. (We kicked it off “in the background”.)

The solution is to parse and execute this feature-code when the user needs it. For example, when the user first clicks on the drop down menu is when we parse and execute menu.js. If the user never uses the drop down menu, then we avoid the cost of parsing and executing that code altogether. But when the user clicks on that menu we don’t want her to wait for the bytes to be downloaded – that would take too long especially on mobile. The best solution is to download the script in the background but delay execution until later when the code is actually needed.

Download now, Execute later

After that lengthy setup we’re ready to look at the delayed execution capabilities of ControlJS. I created an example that has a drop down menu containing the links to the examples.

In order to illustrate the pain from lengthy script execution I added a 2 second delay to the menu code (a while loop that doesn’t relinquish until 2 seconds have passed). Menu withOUT ControlJS is the baseline case. The page takes a long time to render because it’s blocked while the scripts are downloaded and also during the 2 seconds of script execution.

On the other hand, Menu WITH ControlJS renders much more quickly. The scripts are downloaded asynchronously. Since these scripts are for an optional feature we want to avoid executing them until later. This is achieved by using the “data-cjsexec=false” attribute, in addition to the other ControlJS modifications to the SCRIPT’s TYPE and SRC attributes:

<script data-cjsexec=false type="text/cjs" data-cjssrc="jquery.min.js"></script>
<script data-cjsexec=false type="text/cjs" data-cjssrc="fg.menu.js"></script>

The “data-cjsexec=false” setting means that the scripts are downloaded and stored in the cache, but they’re not executed. The execution is triggered later if and when the user exercises the feature. In this case, clicking on the Examples button is the trigger:

examplesbtn.onclick = function() {
   CJS.execScript("jquery.min.js");
   CJS.execScript("fg.menu.js", createExamplesMenu);
};

CJS.execScript() creates a SCRIPT element with the specified SRC and inserts it into the DOM. The menu creation function, createExamplesMenu, is passed in as the onload callback function for the last script. The 2 second delay is therefore incurred the first time the user clicks on the menu, but it’s tied to a the user’s action, there’s no delay to download the script, and the execution delay only occurs once.

The ability to separate and control the download phase and the execution phase during script loading is a key differentiator of ControlJS. Many websites won’t need this option. But websites that have a lot of code that’s not used to render the initial page, such as Ajax web apps, will benefit from avoiding the pain of script parsing and execution until it’s absolutely necessary.

15 Comments

ControlJS part 1: async loading

December 15, 2010 10:45 pm | 30 Comments

This is the first of three blog posts about ControlJS – a JavaScript module for making scripts load faster. The three blog posts describe how ControlJS is used for async loading, delayed execution, and overriding document.write.

The #1 performance best practice I’ve been evangelizing over the past year is progressive enhancement: deliver the page as HTML so it renders quickly, then enhance it with JavaScript. There are too many pages that are blank while several hundred kB of JavaScript is downloaded, parsed, and executed so that the page can be created using the DOM.

It’s easy to evangelize progressive enhancement – it’s much harder to actually implement it. That several hundred kB of JavaScript has to be unraveled and reorganized. All the logic that created DOM elements in the browser via JavaScript has to be migrated to run on the server to generate HTML. Even with new server-side JavaScript capabilities this is a major redesign effort.

I keep going back to Opera’s Delayed Script Execution option. Just by enabling this configuration option JavaScript is moved aside so that page rendering can come first. And the feature works – I can’t find a single website the suffers any errors with this turned on. While I continue to encourage other browser vendors to implement this feature, I want a way for developers to get this behavior sooner rather than later.

A number of web accelerators (like Aptimize, Strangeloop, FastSoft, CloudFlare, Torbit, and more recently mod_pagespeed) have emerged over the last year or so. They modify the HTML markup to inject performance best practices into the page. Thinking about this model I considered ways that markup could be changed to more easily delay the impact of JavaScript on progressive rendering.

The result is a JavaScript module I call ControlJS.

Controlling download and execution

The goal of ControlJS is to give developers more control over how JavaScript is loaded. The key insight is to recognize that “loading” has two phases: download (fetching the bytes) and execution (including parsing). These two phases need to be separated and controlled for best results.

Controlling how scripts are downloaded is a popular practice among performance-minded developers. The issue is that when scripts are loaded the normal way (<script src=""...) they block other resource downloads (lesser so in newer browsers) and also block page rendering (in all browsers). Using asynchronous script loading techniques mitigates these issues to a large degree. I wrote about several asynchronous loading techniques in Even Faster Web Sites. LABjs and HeadJS are JavaScript modules that provide wrappers for async loading. While these async techniques address the blocking issues that occur during the script download phase, they don’t address what happens during the script execution phase.

Page rendering and new resource downloads are blocked during script execution. With the amount of JavaScript on today’s web pages ever increasing, the blocking that occurs during script execution can be significant, especially for mobile devices. In fact, the Gmail mobile team thought script execution was such a problem they implemented a new async technique that downloads JavaScript wrapped in comments. This allows them to separate the download phase from the execution phase. The JavaScript is downloaded so that it resides on the client (in the browser cache) but since it’s a comment there’s no blocking from execution. Later, when the user invokes the associated features, the JavaScript is executed by removing the comment and evaluating the code.

Stoyan Stefanov, a former teammate and awesome performance wrangler, recently blogged about preloading JavaScript without execution. His approach is to download the script as either an IMAGE or an OBJECT element (depending on the browser). Assuming the proper caching headers exist the response is cached on the client and can later be inserted as a SCRIPT element. This is the technique used in ControlJS.

ControlJS: how to use it

To use ControlJS you need to make three modifications to your page.

Modification #1: add control.js

I think it’s ironic that JavaScript modules for loading scripts asynchronously have to be loaded in a blocking manner. From the beginning I wanted to make sure that ControlJS itself could be loaded asynchronously. Here’s the snippet for doing that:

var cjsscript = document.createElement('script');
cjsscript.src = "control.js";
var cjssib = document.getElementsByTagName('script')[0];
cjssib.parentNode.insertBefore(cjsscript, cjssib);

Modification #2: change external scripts

The next step is to transform all of the old style external scripts to load the ControlJS way. The normal style looks like this:

<script type="text/javascript" src="main.js"><script>

The SCRIPT element’s TYPE attribute needs to be changed to “text/cjs” and the SRC attribute needs to be changed to DATA-CJSSRC, like this:

<script type="text/cjs" data-cjssrc="main.js"><script>

Modification #3: change inline scripts

Most pages have inline scripts in addition to external scripts. These scripts have dependencies: inline scripts might depend on external scripts for certain symbols, and vice versa. It’s important that the execution order of the inline scripts and external scripts is preserved. (This is a feature that some of the async wrappers overlook.)  Therefore, inline scripts must also be converted by changing the TYPE attribute from “text/javascript” in the normal syntax:

<script type="text/javascript">
var name = getName();
<script>

to “text/cjs”, like this:

<script type="text/cjs">
var name = getName();
<script>

That’s it! ControlJS takes care of the rest.

ControlJS: how it works

Your existing SCRIPTs no longer block the page because the TYPE attribute has been changed to something the browser doesn’t recognize. This allows ControlJS to take control and load the scripts in a more high performance way. Let’s take a high-level look at how ControlJS works. You can also view the control.js script for more details.

We want to start downloading the scripts as soon as possible. Since we’re downloading them as an IMAGE or OBJECT they won’t block the page during the download phase. And since they’re not being downloaded as a SCRIPT they won’t be executed.  ControlJS starts by finding all the SCRIPT elements that have the “text/cjs” type. If the script has a DATA-CJSSRC attribute then an IMAGE (for IE and Opera) or OBJECT (for all other browsers) is created dynamically with the appropriate URL. (See Stoyan’s post for the full details.)

By default ControlJS waits until the window load event before it begins the execution phase. (It’s also possible to have it start executing scripts immediately or once the DOM is loaded.) ControlJS iterates over its scripts a second time, doing so in the order they appear in the page. If the script is an inline script the code is extracted and evaluated. If the script is an external script and its IMAGE or OBJECT download has completed then it’s inserted in the page as a SCRIPT element so that the code is parsed and executed. If the IMAGE or OBJECT download hasn’t finished then it reenters the iteration after a short timeout and tries again.

There’s more functionality I’ll talk about in later posts regarding document.write and skipping the execution step. For now, let’s look at a simple async loading example.

Async example

To show ControlJS’s async loading capabilities I’ve created an example that contains three scripts in the HEAD:

  • main.js – takes 4 seconds to download
  • an inline script that references symbols from main.js
  • page.js – takes 2 seconds to download and references symbols from the inline script

I made page.js take less time than main.js to make sure that the scripts are executed in the correct order (even though page.js downloads more quickly). I include the inline script because this is a pattern I see frequently (e.g., Google Analytics) but many script loader helpers don’t support inline scripts as part of the execution order.

Async withOUT ControlJS is the baseline example. It loads the scripts in the normal way. The HTTP waterfall chart generated in IE8 (using HttpWatch) is shown in Figure 1. IE8 is better than IE 6&7 – it loads scripts in parallel so main.js and page.js are downloaded together. But all versions of IE block image downloads until scripts are done, so images 1-4 get pushed out. Rendering is blocked by scripts in all browsers. In this example, main.js blocks rendering for four seconds as indicated by the green vertical line.


Figure 1: Async withOUT ControlJS waterfall chart (IE8)

Async WITH ControlJS demonstrates how ControlJS solves the blocking problems caused by scripts. Unlike the baseline example, the scripts and images are all downloaded in parallel shaving 1 second off the overall download time. Also, rendering starts immediately. If you load the two examples in your browser you’ll notice how dramatically faster the ControlJS page feels. There are three more requests in Figure 2′s waterfall chart. One is the request for control.js – this is loaded asynchronously so it doesn’t slow down the page. The other two requests are because main.js and page.js are loaded twice. The first time is when they are downloaded asynchronously as IMAGEs. Later, ControlJS inserts them into the page as SCRIPTs in order to get their JavaScript executed. Because main.js and page.js are already in the cache there’s no download penalty, only the short cache read indicated by the skinny blue line.


Figure 2: Async WITH ControlJS waterfall chart (IE8)

The ControlJS project

ControlJS is open source under the Apache License. The control.js script can be found in the ControlJS Google Code project. Discussion topics can be created in the ControlJS Google Group. The examples and such are on my website at the ControlJS Home Page. I’ve written this code as a proof of concept. I’ve only tested it on the major browsers. It needs more review before being used in a production environment.

Only part 1

This is only the first of three blog posts about ControlJS. There’s more to learn. This technique might seem like overkill. If all we wanted to do was load scripts asynchronously some other techniques might suffice. But my goal is to delay all scripts until after the page has rendered. This means that we have to figure out how to delay scripts that do document.write. Another goal is to support requirements like Gmail mobile had – the desire to download JavaScript but delay the blocking penalties that come during script execution. I’ll be talking about those features in the next two blog posts.

var name = getName();

30 Comments

Velocity: Forcing Gzip Compression

July 12, 2010 6:57 pm | 25 Comments

Tony Gentilcore was my officemate when I first started at Google. I was proud of my heritage as “the YSlow guy”. After all, YSlow was well over 1M downloads. After a few days I found out that Tony was the creator of Fasterfox – topping 11M downloads. Needless to say, we hit it off and had a great year brainstorming techniques for optimizing web performance.

During that time, Tony was working with the Google Search team and discovered something interesting: ~15% of users with gzip-capable browsers were not sending an appropriate Accept-Encoding request header. As a result, they were sent uncompressed responses that were 3x bigger resulting in slower page load times. After some investigation, Tony discovered that intermediaries (proxies and anti-virus software) were stripping or munging the Accept-Encoding header. My blog post Who’s not getting gzip? summarizes the work with links to more information. Read Tony’s chapter in Even Faster Web Sites for all the details.

Tony is now working on Chrome, but the discovery he made has fueled the work of Andy Martone and others on the Google Search team to see if they could improve page load times for users who weren’t getting compressed responses. They had an idea:

For requests with missing or mangled Accept-Encoding headers, inspect the User-Agent to identify browsers that should understand gzip.
Test their ability to decompress gzip.
If successful, send them gzipped content!

This is a valid strategy given that the HTTP spec says that, in the absence of an Accept-Encoding header, the server may send a different content encoding based on additional information (such as the encodings known to be supported by the particular client).

During his presentation at Velocity, Forcing Gzip Compression, Andy describes how Google Search implemented this technique:

  • At the bottom of a page, inject JavaScript to:
    • Check for a cookie.
    • If absent, set a session cookie saying “compression NOT ok”.
    • Write out an iframe element to the page.
  • The browser then makes a request for the iframe contents.
  • The server responds with an HTML document that is always compressed.
  • If the browser understands the compressed response, it executes the inlined JavaScript and sets the session cookie to “compression ok”.
  • On subsequent requests, if the server sees the “compression ok” cookie it can send compressed responses.

The savings are significant. An average Google Search results page is 34 kB, which compresses down to 10 kB. The ability to send a compressed response cuts page load times by ~15% for these affected users.

Andy’s slides contain more details about how to only run the test once, recommended cookie lifetimes, and details on serving the iframe. Since this discovery I’ve talked to folks at other web sites that confirm these mysterious requests that are missing an Accept-Encoding header. Check it out on your web site – 15% is a significant slice of users! If you’d like to improve their page load times, take Andy’s advice and send them a compressed response that is smaller and faster.

Belorussian translation

25 Comments

Diffable: only download the deltas

July 9, 2010 9:31 am | 15 Comments

There were many new products and projects announced at Velocity, but one that I just found out about is Diffable. It’s ironic that I missed this one given that it happened at Velocity and is from Google. The announcement was made during a whiteboard talk, so it didn’t get much attention. If your web site has large JavaScript downloads you’ll want to learn more about this performance optimization technique.

The Diffable open source project has plenty of information, including the Diffable slides used by Josh Harrison and James deBoer at Velocity. As explained in the slides, Diffable uses differential compression to reduce the size of JavaScript downloads. It makes a lot of sense. Suppose your web site has a large external script. When a new release comes out, it’s often the case that a bulk of that large script is unchanged. And yet, users have to download the entire new script even if the old script is still cached.

Josh and James work on Google Maps which has a main script that is ~300K. A typical revision for this 300K script produces patches that are less than 20K. It’s wasteful to download that other 280K if the user has the old revision in their cache. That’s the inspiration for Diffable.

Diffable is implemented on the server and the client. The server component records revision deltas so it can return a patch to bring older versions up to date. The client component (written in JavaScript) detects if an older version is cached and if necessary requests the patch to the current version. The client component knows how to merge the patch with the cached version and evals the result.

The savings are significant. Using Diffable has reduced page load times in Google Maps by more than 1200 milliseconds (~25%). Note that this benefit only affects users that have an older version of the script in cache. For Google Maps that’s 20-25% of users.

In this post I’ve used scripts as the example, but Diffable works with other resources including stylesheets and HTML. The biggest benefit is with scripts because of their notorious blocking behavior. The Diffable slides contain more information including how JSON is used as the delta format, stats that show there’s no performance hit for using eval, and how Diffable also causes the page to be enabled sooner due to faster JavaScript execution. Give it a look.

15 Comments

AutoHead – my first Browserscope user test

May 12, 2010 3:17 pm | 17 Comments

In the comments from my last blog post (appendChild vs insertBefore) someone asked which browsers do and don’t automatically create a HEAD element. This is important when you’re deciding how to dynamically add scripts to your web page. I used this question as the motivation for creating my first Browserscope user test. Here’s the story behind this new feature in Browserscope and the answer to the automatically create HEAD question. (You can run the AutoHead test to skip ahead and see the results.)

Level, Meta-level, Meta-meta-level

Level1: When Chrome was launched in 2008 I started a project called UA Profiler to analyze the performance characteristics of browsers. The key concept was to crowdsource gathering the data – publish the test framework and encourage the web community to run the tests on their browser of choice. There are numerous benefits to this approach:

  • a wider variety of browsers are tested (more than I could possibly install in a test lab)
  • results for new browsers happen immediately (often before the browser is officially released)
  • tests are performed under real world conditions

Level 2: I teamed up with Lindsey Simon to take UA Profiler to the next level. The result was the launch of Browserscope. In addition to making this a functioning open source project, the framework was opened up to include multiple test categories. In addition to performance (renamed “Network“), other test categories were added: Security, Acid3, Selectors API, and Rich Text.

Level 3: A few weeks ago Lindsey took Browserscope to the next level with the addition of the User Tests feature. Now, anyone can add a test to Browserscope. In this early alpha version of the feature, users create one or more test pages on their own server, register the test with Browserscope, and embed a JavaScript snippet at the end of their test to send the results back to Browserscope for storing. The benefit for the test creator is that Browserscope stores all the data, parses the User-Agent strings for proper categorization, and provides a widget for viewing the results.

Even though Lindsey is careful to call this an alpha, it went very smoothly for me. Once I had my test page, it took less than 15 minutes to integrate with Browserscope and start gathering results. So let’s take a look at my test…

the test – AutoHead

In my appendChild vs insertBefore blog post I talk about why this code generates bugs:

document.getElementsByTagName('head')[0].appendChild(...)

The context was using this pattern in 3rd party code snippets – where you don’t have any control of the main page. It turns out that some web pages out in the wild wild web don’t use the HEAD tag. Luckily, most browsers automatically create a HEAD element if one isn’t specified in the page. Unfortunately, not all browsers do this.

In the comments on that blog post Andy asked, “What browsers are we talking about here?”

How can I possibly attempt to answer that question? It would require running a test on many different versions of many different browsers, including mobile devices. I’m not equipped with a setup to do that.

Then the light bulb lit up. light bulb I can do this with a Browserscope User Test!

Creating the test was easy. My HTML page doesn’t have a HEAD tag. I put a script at the bottom that checks if the page contains a head element:

bHead = document.getElementsByTagName('head').length;

I have to store the result in a specific variable that Browserscope looks for:

var _bTestResults = {
'autohead': bHead
};

This data structure is slurped up by Browserscope via this snippet (as shown on the User Tests Howto page):

(function() {
var _bTestKey = '<YOUR-TEST-ID-GOES-HERE>';
var _bScript = document.createElement('script');
_bScript.src = 'http://www.browserscope.org/user/beacon/'
               + _bTestKey;
_bScript.setAttribute('async', 'true');
var scripts = document.getElementsByTagName('script');
var lastScript = scripts[scripts.length - 1];
lastScript.parentNode.insertBefore(_bScript, lastScript);
})();

Voila! You’re done. Well, almost done. You still have to promote your test.

Promoting your test

With this small amount of work I’m now ready to ask the web community to help me gather results. For me personally, I accomplish this by writing this blog post asking for help:

Help me out by running the AutoHead test. Thanks!

I’ve embedded the Browserscope results snippet in the iframe below, so you can see results as they come in. So far iPhone and Safari 3.2 are the only browsers that don’t automatically create the HEAD element.

If you want to avoid bugs when dynamically adding scripts, you might want to use one of the more solid patterns mentioned in my appendChild vs insertBefore blog post. If you want to gather data on some browser test that interests you, read the Browserscope User Test Howto and go for it. If you have problems, contact the Browserscope mailing list. If you have success, contact me and I’ll tweet your test to drive more traffic to it. This is still in alpha, but I’m very excited about the possibilities. I can’t wait to see the kinds of tests you come up with.

Update: After just one day, thanks to all of you who ran the test, I’ve collected 400 measurements on 20 different browsers and 60 unique versions. The results show that the following browsers do NOT automatically create a HEAD element: Android 1.6, Chrome 5.0.307 (strange), iPhone 3.1.3, Nokia 90, Opera 8.50, Opera 9.27, and Safari 3.2.1. This adds up to over 1% of your users, so it’s important to keep this in mind when adding scripts dynamically.

I also had some comments I wanted to pass on about my Browserscope user test. In hindsight, I wish I had chosen a better “test_key” name for the _bTestResults object. I didn’t realize this would appear as the column header in my results table. Rather than “autohead” I would have done “Automatically Creates HEAD”. Also, rather than return 0 or 1, I wish I had returned “no” and “yes”. Finally, I wish there was a way to embed the results table widget besides using an iframe. I’ll file bugs requesting better documentation for these items.

 

17 Comments

WebPagetest.org – top tool

March 5, 2010 8:04 am | 12 Comments

I’m loving WebPagetest.org. In Even Faster Web Sites I said, “[WebPagetest] hasn’t gotten the wide adoption it deserves.” It got a boost after Matt Cutts mentioned WebPageTest.org in his interview with WebProNews. But I still meet people who aren’t aware that this great performance tool is out there, so let me bang their drum some more.

Pat Meenan and Eric Goldsmith are the team behind AOL Pagetest and WebPagetest.org. AOL Pagetest is the Windows tool that works in IE. Pat took that and put it behind a web server running in his basement and called it WebPagetest. That was two years ago. He took the Open Source route and now there are instances of WebPagetest running in Virginia, California, UK, China, and New Zealand hosted by AOL as well as Strangeloop Networks, Aptimize, and Daemon Solutions.

The power of WebPagetest.org is that it’s web-based – you don’t have to do any installs and you can run it on any OS and browser. On the backend, WebPagetest runs the page in either IE7 or IE8 and displays the results. This might be a limitation for some folks – if you want to test a page on Mac OS X using Safari you’ll have to do that with some other tool. But the fact that IE 7&8 are the dominate browsers means you can see the most typical experience regardless of what platform you’re currently working on. Since many developers work on Mac using Safari or Firefox, and more are moving to Chrome, it’s important that they can easily see how their web pages load for a majority of their users.

New Test

Here’s how it works: Go to the New Test tab. Enter the URL you want to test and click submit. Pretty easy! 90% of the time that’s what I do, but there are other options you can tweak:

  • pick a geo location – VA, CA, UK, CN, or NZ
  • choose IE7 or IE8
  • choose a connection speed – Dial, DSL, FIOS (Dial and DSL are done via throttling)
  • test just the first (empty cache) page load or first and repeat
  • repeat the test up to 10 times to get a bigger sample size
  • opt to keep your test results private if desired

Results

The results page shows summary stats (page load times, bytes downloaded, # of HTTP requests) and a mini performance analysis (compression, image optimization, concatenating scripts and stylesheets). But the piece I love is the waterfall chart.

If you click on the mini waterfall, it takes you to a larger view where you can do additional customizations. I’ve been relying on this for my current series of blog posts on P3PC (performance of 3rd party content). And Pat even added a few options I requested. You can set the size of the image, remove certain requests (for example, I sometimes remove favicon.ico), and whether to show the extra bits on CPU and bandwidth utilization. I end up with clean waterfall charts like this:

And there’s more – Video!

When you create a new test, you can opt to record a video of the page loading (go to the Video tab under “Step 4 – Test Options” in the figure above). WebPagetest generates a filmstrip of images as well as a video. The image filmstrip shows what the page looks like as it loads. You can choose different time increments (0.1, 0.5, 1 and 5 seconds). Wikipedia is pretty straightforward as it loads. Here’s the filmstrip for FOX Sports. Content arrives in the 0-5 second range, more images (like the logo) are filled in from 5-10 seconds, and flash arrives by the 15 second mark. You can also view the video.

Pat does a lot of this work on his own time and all the video features are in Alpha, so be tolerant. I use WebPagetest.org daily and it has become one of my favorite performance tools. Definitely give it a try.

12 Comments

new Browserscope security tests

February 19, 2010 10:25 am | 3 Comments

Browserscope is an open source project based on my earlier UA Profiler project. The goal is to help make browsers faster, safer, and more consistent. This is accomplished by having categories of tests that measure how browsers behave in different areas. Browserscope currently has these test categories: Network, Acid3, Selectors API, Rich Text, and Security.

The project is led by Lindsey Simon. Today he blogged about updates to the Browserscope security tests. The security test category was created by Collin Jackson (CMU) and Adam Barth (UC Berkeley). They worked with David Lin-Shung Huang and Mustafa Acer (both from CMU) on today’s release of tests for HTTP Origin Header, Strict Transport Security, Sandbox Attribute, X-Frame-Options, and X-Content-Type-Options. Check out their blog post for more details on what these tests actually do.

There are other new features in today’s release. We’ve updated the list of “top” browsers (notice we dropped IE 6). Lindsey added a dropdown menu to each test category for easier navigation. I run the Network test category. In that area I broke the overloaded parallel script loading test into four more specific tests that measure whether external scripts load in parallel with images, stylesheets, iframes, and other scripts. Brian Kuhn (Google) contributed a test for measuring whether the SCRIPT ASYNC attribute is supported.

One of the key aspects of Browserscope is that all the data is crowdsourced. This is critical. It allows the project to run without requiring a dedicated test lab. And the data is gathered under real world conditions. But to be successful, we need people in the web community to participate. When you’re done reading this post, point your browser to the Browserscope test page and click “Run All Tests”. It’ll only take a few minutes and you can sit back while it walks through all the tests automatically. We’re all in this together. Join us in making the web experience faster, safer, and more consistent.

How Does Your Browser Compare?

3 Comments

Page Speed 1.6 Beta – new rules, native library

February 1, 2010 9:48 pm | 8 Comments

Page Speed 1.6 Beta was released today. There are a few big changes, but the most important fix is compatibility with Firefox 3.6. If you’re running the latest version of Firefox visit the download page to get Page Speed 1.6. Phew!

I wanted to highlight some of the new features mentioned in the 1.6 release notes: new rules and native library.

Three new rules were added as part of Page Speed 1.6:

  • Specify a character set early – If you don’t specify a character set for your web pages or specify it too low in the page, the browser could parse it incorrectly. You can specify a character set using the META tag or in the Content-Type response header. Returning charset in the Content-Type header will ensure the browser sees it early. (See this Zoompf post for more information.)
  • Minify HTML – Top performing web sites are already on top of this, right? Analyzing the Alexa U.S. top 10 shows an average savings of 8% if they minified their HTML. You can easily check your site with this new rule, and even save the optimized version.
  • Minimize Request Size – Okay, this is cool and shows how Google tries to squeeze out every last drop of performance. This rule sees if the total size of the request headers exceed one packet (~1500 bytes). Requiring a roundtrip just to submit the request hurts performance, especially for users with high latency.

The other big feature I wanted to highlight first came out in Page Speed 1.5 but didn’t get much attention – the Page Speed C++ Native Library. It probably didn’t get much attention because it’s one of those changes that, if done correctly, no one notices. The work behind the native library involves porting the rules from JavaScript to C++. Why bother? Here’s what the release notes say:

This should speed up scoring, as well as allow rules to be run in programs other than just the Page Speed Firefox extension.

Making Page Speed run faster is great, but the idea of implementing the performance logic in a C++ library so the rules can be run in other programs is very cool. And where have we seen this recently? In the Site Performance section recently added to Webmaster Tools. Now we have a server-side tool that produces the same recommendations found from running the Page Speed add-on. Here are the rules that have been ported to the native library:

added in 1.5:

  • Combine external JavaScript
  • Combine external CSS
  • Enable gzip compression
  • Optimize images
  • Minimize redirects
  • Minimize DNS lookups
  • Avoid bad requests
  • Serve resources from a consistent URL
added in 1.6:

  • specify charset early
  • Minify HTML
  • Minimize request size
  • Put CSS in the document head
  • Minify CSS
  • Optimize the order of styles and scripts
  • serve scaled images
  • specify image dimensions

Webmaster Tools Site Performance today shows recommendations based on the rules in native library 1.5. Now that more rules have been added to native library 1.6, webmasters can expect to see those recommendations in the near future. But this integration shouldn’t stop with Webmaster Tools. I’d love to see other tools and services integrate native library. If you’re interested in using native library, check out the page-speed project on Google Code and contact the page-speed-discuss Google Group.

8 Comments

Speed Tracer – visibility into the browser

December 10, 2009 7:46 am | 9 Comments

Is it just me, or does anyone else think Google’s on fire lately, lighting up the world of web performance? Quick review of news from the past two weeks:

Speed Tracer was my highlight from last night’s Google Campfire One. The event celebrated the release of GWT 2.0. Performance and “faster” were emphasized again and again throughout the evening’s presentations (I love that). GWT’s new code splitting capabilities are great for performance, but Speed Tracer easily wowed the audience – including me. In this post, I’ll describe what I like about Speed Tracer, what I hope to see added next, and then I’ll step back and talk about the state of performance profilers.

Getting started with Speed Tracer

Some quick notes about Speed Tracer:

  • It’s a Chrome extension, so it only runs in Chrome. (Chrome extensions is yet another announcement this week.)
  • It’s written in GWT 2.0.
  • It works on all web sites, even sites that don’t use GWT.

The Speed Tracer getting started page provides the details for installation. You have to be on the Chrome dev channel. Installing Speed Tracer adds a green stopwatch to the toolbar. Clicking on the icon starts Speed Tracer in a separate Chrome window. As you surf sites in the original window, the performance information is shown in the Speed Tracer window.

Beautiful visibility

When it comes to optimizing performance, developers have long been working in the dark. Without the ability to measure JavaScript execution, page layout, reflows, and HTML parsing, it’s not possible to optimize the pain points of today’s web apps. Speed Tracer gives developers visibility into these parts of page loading via the Sluggishness view, as shown here. (Click on the figure to see a full screen view.) Not only is this kind of visibility great, but the display is just, well, beautiful. Good UI and dev tools don’t often intersect, but when they do it makes development that much easier and more enjoyable.

Speed Tracer also has a Network view, with the requisite waterfall chart of HTTP requests. Performance hints are built into the tool flagging issues such as bad cache headers, exceedingly long responses, Mozilla cache hash collision, too many reflows, and uncompressed responses. Speed Tracer also supports saving and reloading the profiled information. This is extremely useful when working on bugs or analyzing performance with other team members.

Feature requests

I’m definitely going to be using Speed Tracer. For a first version, it’s extremely feature rich and robust. There are a few enhancements that will make it even stronger:

  • overall pie chart – The “breakdown by time” for phases like script evaluation and layout are available for segments within a page load. As a starting point, I’d like to see the breakdown for the entire page. When drilling down on a specific load segment, this detail is great. But having overall stats will give developers a clue where they should focus most of their attention.
  • network timing – Similar to the issues I discovered in Firebug Net Panel, long-executing JavaScript in the main page blocks the network monitor from accurately measuring the duration of HTTP requests. This will likely require changes to WebKit to record event times in the events themselves, as was done in the fix for Firefox.
  • .HAR support – Being able to save Speed Tracer’s data to file and share it is great. Recently, Firebug, HttpWatch, and DebugBar have all launched support for the HTTP Archive file format I helped create. The format is extensible, so I hope to see Speed Tracer support the .HAR file format soon. Being able to share performance information across tools and browsers is a necessary next step. That’s a good segue…

Developers need more

Three years ago, there was only one tool for profiling web pages: Firebug. Developers love working in Firefox, but sometimes you just have to profile in Internet Explorer. Luckily, over the last year we’ve seen some good profilers come out for IE including MSFast , AOL Pagetest, WebPagetest.org, and dynaTrace Ajax Edition. DynaTrace’s tool is the most recent addition, and has great visibility similar to Speed Tracer, as well as JavaScript debugging capabilities. There have been great enhancements to Web Inspector, and the Chrome team has built on top of that adding timeline and memory profiling to Chrome. And now Speed Tracer is out and bubbling to the top of the heap.

The obvious question is:

Which tool should a developer choose?

But the more important question is:

Why should a developer have to choose?

There are eight performance profilers listed here. None of them work in more than a single browser. I realize web developers are exceedingly intelligent and hardworking, but no one enjoys having to use two different tools for the same task. But that’s exactly what developers are being asked to do. To be a good developer, you have to be profiling your web site in multiple browsers. By definition, that means you have to install, learn, and update multiple tools. In addition, there are numerous quirks to keep in mind when going from one tool to another. And the features offered are not consistent across tools. It’s a real challenge to verify that your web app performs well across the major browsers. When pressed, rock star web developers I ask admit they only use one or two profilers – it’s just too hard to stay on top of a separate tool for each browser.

This week at Add-on-Con, Doug Crockford’s closing keynote is about the Future of the Web Browser. He’s assembled a panel of representatives from Chrome, Opera, Firefox, and IE. (Safari declined to attend.) My hope is they’ll discuss the need for a cross-browser extension model. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. My hope for 2010 is that we see cross-browser convergence on standards for extensions and remote debugging, so that developers will have a slightly easier path for ensuring their apps are high performance on all browsers.

9 Comments