2010 State of Performance

December 24, 2010 3:24 pm | Leave a comment

I wrote today’s post on the Performance Calendar titled “2010 State of Performance”. Here’s the concluding paragraph:

The highlights of 2010 for me were the emergence of WPO as an industry, establishment of the W3C Web Performance Working Group, strength of open source tools, adoption of the HAR format, and increased awareness of the impact of third party content. In 2011 I’m looking forward to better browser benchmarks and instrumentation, mobile tools and best practices, and faster ads. But the list is much longer than this blog post – I didn’t even mention separation of script downloading and execution, HTML5 pros and cons, improvements to browser caching, and TCP and SSL optimizations. What did you think was important in 2010 and where will the big gains come from in 2011? I think we’ll agree on one thing – the only direction to go in is faster.

Go read the full post and leave your thoughts about web performance in 2010 and 2011. We’ve got another exciting year ahead of us.


appendChild vs insertBefore

May 11, 2010 12:15 am | 41 Comments

I’ve looked at a bunch of third party JavaScript snippets as part of my P3PC series. As I analyzed each of these snippets, I looked to see if scripts were being loaded dynamically. After all, this is a key ingredient for making third party content fast. It turns out nobody does dynamic loading the same way. I’d like to walk through some of the variations I found. It’s a story that touches on some of the most elegant and awful code out there, and is a commentary on the complexities of dealing with the DOM.

In early 2008 I started gathering techniques for loading scripts without blocking. I called the most popular technique the Script DOM Element approach. It’s pretty straightforward:

var domscript = document.createElement('script');
domscript.src = 'main.js';
document.getElementsByTagName('head')[0].appendChild(domscript);
Souders, May 2008

I worked with the Google Analytics team on their async snippet. The first version that came out in December 2009 also used appendChild, but instead of trying to find the HEAD element, they used a different technique for finding the parent. It turns out that not all web pages have a HEAD tag, and not all browsers will create one when it’s missing.

var ga = document.createElement('script');
ga.src = ('https:' == document.location.protocol ?
    'https://ssl' : 'http://www') +
    '.google-analytics.com/ga.js';
ga.setAttribute('async', 'true');
document.documentElement.firstChild.appendChild(ga);
Google Analytics, Dec 2009

Google Analytics is used on an incredibly diverse set of web pages, so there was lots of feedback that identified issues with using documentElement.firstChild. In February 2010 they updated the snippet with this pattern:

var ga = document.createElement('script');
ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ?
    'https://ssl' : 'http://www') +
    '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(ga, s);
Google Analytics, Feb 2010

I think this is elegant. If we’re dynamically loading scripts, we’re doing that with JavaScript, so there must be at least one SCRIPT element in the page. The Google Analytics async snippet has just come out of beta, so this pattern must be pretty rock solid.
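The February 2010 pattern is easy to wrap in a small helper. What follows is a minimal sketch, not part of the Google Analytics snippet: the function name loadScriptAsync and the doc parameter (passed in so the helper can be exercised against something other than the live document) are my own assumptions.

```javascript
// Hypothetical helper wrapping the insertBefore pattern.
// `doc` is normally the global `document`.
function loadScriptAsync(doc, src) {
  var script = doc.createElement('script');
  script.type = 'text/javascript';
  script.async = true;
  script.src = src;
  // This code runs from a SCRIPT element, so at least one exists.
  // Insert the new script just before the first one found.
  var first = doc.getElementsByTagName('script')[0];
  first.parentNode.insertBefore(script, first);
  return script;
}
```

In a page this would be called as loadScriptAsync(document, 'main.js'). It never needs to find HEAD or BODY, which is the whole point of the pattern.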

I wanted to see how other folks were loading dynamic scripts, so I took a look at YUI Loader. It has an insertBefore variable that is used for stylesheets, so for scripts it does appendChild to the HEAD element:

if (q.insertBefore) {
  var s = _get(q.insertBefore, id);
  if (s) {
    s.parentNode.insertBefore(n, s);
  }
} else {
  h.appendChild(n);
}
YUI Loader 2.6.0, 2008

jQuery supports dynamic resource loading. Their code is very clean and elegant, and informative, too. In two pithy comments are pointers to bugs #2709 and #4378 which explain the issues with IE6 and appendChild.

head = document.getElementsByTagName ("head")[0] ||
    document.documentElement;
// Use insertBefore instead of appendChild to circumvent an IE6 bug.
// This arises when a base node is used (#2709 and #4378).
head.insertBefore(script, head.firstChild);
jQuery

All of these implementations come from leading development teams, but what’s happening in other parts of the Web? Here’s a code snippet I came across while doing my P3PC Collective Media blog post:

var f=document.getElementsByTagName("script");
var b=f[f.length-1];
if(b==null){ return; }
var i=document.createElement("script");
i.language="javascript";
i.setAttribute("type","text/javascript");
var j="";
j+="document.write('');";
var g=document.createTextNode(j);
b.parentNode.insertBefore(i,b);
appendChild(i,j);

function appendChild(a,b){
  if(null==a.canHaveChildren||a.canHaveChildren){
    a.appendChild(document.createTextNode(b));
  }
  else{ a.text=b;}
}
Collective Media, Apr 2010

Collective Media starts out in a similar way by creating a SCRIPT element. Similar to Google Analytics, it gets a list of SCRIPT elements already in the page, and chooses the last one in the list. Then insertBefore is used to insert the new dynamic SCRIPT element into the document.

Normally, this is when the script would start downloading (asynchronously), but in this case the src hasn’t been set. Instead, the script’s URL has been put inside a string of JavaScript code that does a document.write of a SCRIPT HTML tag. (If you weren’t nervous before, you should be now.) (And there’s more.) Collective Media creates a global function called, of all things, appendChild. The dynamic SCRIPT element and string of document.write code are passed to this custom version of appendChild, which injects the string of code into the SCRIPT element, causing it to be executed. The end result, after all this work, is an external script that gets downloaded in a way that blocks the page. It’s not even asynchronous!

I’d love to see Collective Media clean up their code. They’re so close to making it asynchronous and improving the page load time of anyone who includes their ads. But really, doesn’t this entire blog post seem surreal? To be discussing this level of detail and optimization for something as simple as adding a script element dynamically is a testimony to the complexity and idiosyncrasies of the DOM.

In threads and discussions about adding simpler behavior to the browser, a common response I hear from browser developers is, “But site developers can do that now. We don’t have to add a new way of doing it.” Here we can see what happens without that simpler behavior. Hundreds, maybe even thousands of person hours are spent reinventing the wheel for some common task. And some dev teams end up down a bad path. That’s why I’ve proposed some clarifications to the ASYNC and DEFER attributes for scripts, and a new POSTONLOAD attribute.

I’m hopeful that HTML5 will include some simplifications for working with the DOM, especially when it comes to improving performance. Until then, if you’re loading scripts dynamically, I recommend using the latest Google Analytics pattern or the jQuery pattern. They’re the most bulletproof. And with the kinds of third party content I’ve seen out there, we need all the bulletproofing we can get.


P3PC: Glam Media

April 13, 2010 10:45 am | 3 Comments

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at Glam Media. Here are the summary stats.

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
big | 89 | 83 | y | 11 | 68 kB | 63 kB | 7 | na**
* Stats for ads only include the ad framework and not any ad content.
** It’s not possible to gather timing stats for snippets with live ads.

I don’t have an account with Glam Media, so my friends over at Zimbio let me use their ad codes during my testing. Since these are live (paying) ads I have to mask the ad codes in the snippet shown here. This means it’s not possible to crowdsource time measurements for these ads.

Snippet Code

Let’s look at the actual snippet code:

<script type="text/javascript" language="javascript" src="http://www2.glam.com/app/site/affiliate/viewChannelModule.act?mName=viewAdJs&affiliateId=123456789&adSize=300x250&zone=Marketplace">
</script>
snippet code as of April 12, 2010

The Glam Media ad is kicked off from a single script: viewChannelModule.act. This script is loaded using normal SCRIPT SRC tags, which causes blocking in IE7 and earlier.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. In my analysis of ad snippets I focus only on the ad framework, not on the actual ads. The Glam Media ad framework alone constitutes 9 HTTP requests.

Let’s step through each request.

  • item 1: compare.php – The HTML document.
  • item 2: viewChannelModule.act – The main Glam Media script.
  • item 3: ad.doubleclick.net – The actual ad (not included in my analysis).
  • item 4: glamadapt_jsrv.act – Script loaded by viewChannelModule.act using document.write.
  • item 5: quant.js – Quantcast script loaded by viewChannelModule.act using document.write.
  • item 6: beacon.js – ScorecardResearch script loaded by viewChannelModule.act using document.write.
  • item 7: glam_comscore.js – Script loaded by viewChannelModule.act using document.write.
  • item 8: pixel – Beacon sent by quant.js.
  • item 9: b.scorecardresearch.com/b – Beacon sent by glam_comscore.js. This returns a redirect to /b2 (item 12).
  • item 10: glam-media-waterfall.png – The image representing the main page’s content.
  • item 11: altfarm.mediaplex.com/ad/js/ – The actual ad (not included in my analysis).
  • item 12: b.scorecardresearch.com/b2 – Another beacon sent as a result of the redirect from /b (item 9).

Keep in mind that glam-media-waterfall.png represents the actual content on the main page. Notice how that image is pushed back to item 10 in the waterfall chart. In this one page load, this main content is blocked for 617 + 808 = 1425 milliseconds. Here are some of the performance issues with this snippet.

1. Too many HTTP requests.

9 HTTP requests for an ad framework (not counting the ad itself) is a lot. The fact that these come from a variety of different services exacerbates the problem because more DNS lookups are required. These 9 HTTP requests are served from 6 different domains.

2. The scripts block the main content of the page from loading.

It would be better to load the script without blocking, similar to what BuySellAds.com does.

3. The ad is inserted using document.write.

Scripts that use document.write slow down the page because they can’t be loaded asynchronously. Inserting ads into a page without using document.write can be tricky. BuySellAds.com solves this problem by creating a DIV with the desired width and height to hold the ad, and then setting the DIV’s innerHTML.
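A rough sketch of that placeholder approach looks like this. It is not BuySellAds.com’s actual code: reserveAdSlot, fillAdSlot, their parameters, and the ad-slot id are hypothetical names, and it assumes the page already contains an empty DIV for the ad.

```javascript
// Hypothetical sketch: fill a placeholder DIV instead of calling document.write.
// `doc` is normally the global `document`.

// Call this right away so the space is reserved before the ad arrives.
function reserveAdSlot(doc, slotId, width, height) {
  var slot = doc.getElementById(slotId);
  if (!slot) { return false; }
  slot.style.width = width + 'px';
  slot.style.height = height + 'px';
  return true;
}

// Call this later, from the asynchronously loaded ad script.
function fillAdSlot(doc, slotId, adHtml) {
  var slot = doc.getElementById(slotId);
  if (!slot) { return false; }   // placeholder missing: do nothing
  slot.innerHTML = adHtml;
  return true;
}
```

One caveat: SCRIPT tags inside a string assigned to innerHTML are not executed, so this works as-is only for ads that are plain markup.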

4. The redirects cause sequential downloads.

A redirect is almost as bad as a script when it comes to blocking. The redirect from b.scorecardresearch.com/b to /b2 forces those two requests to happen sequentially. It would be better to avoid the redirect if possible.

5. Some resources aren’t cacheable.

glam_comscore.js has no caching headers, and yet its Last-Modified date is Nov 19, 2009 (almost 5 months ago). quant.js is only cacheable for 1 day.


Much of the content in this snippet is served with good performance characteristics. The scripts are compressed and minified. One of the beacons returns a 204 No Content response, which is a nice performance optimization. But the sheer number of HTTP requests, use of document.write, and scripts loaded in a blocking fashion cause the page to load more slowly.


P3PC: ValueClick

April 12, 2010 11:44 am | 1 Comment

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at ValueClick. Here are the summary stats.

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
med | 89 | 98 | y | 3 | 2 kB | 1 kB | 1 | na**
* Stats for ads only include the ad framework and not any ad content.
** It’s not possible to gather timing stats for snippets with live ads.

I don’t have an account with ValueClick, so my friends over at Zimbio let me use their ad codes during my testing. Since these are live (paying) ads I have to mask the ad codes in the snippet shown here. This means it’s not possible to crowdsource time measurements for these ads.

Snippet Code

Let’s look at the actual snippet code:

1: <script language="javascript" src="http://media.fastclick.net/w/get.media?sid=12345&m=6&tp=8&d=j&t=s"></script>
2: <noscript><a href="http://media.fastclick.net/w/click.here?sid=12345&m=6&c=1" target="_top">
3: <img src="http://media.fastclick.net/w/get.media?sid=12345&m=6&tp=8&d=s&c=1" width=300 height=250 border=1></a></noscript>
snippet code as of April 7, 2010

A quick walk through the snippet code:

  • line 1 – Download the get.media script.
  • lines 2-3 – NOSCRIPT block in case JavaScript is not available.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. It shows why this snippet has a significant impact on page load time even with just a few HTTP requests and a very small download size.

Keep in mind that valueclick-waterfall.png represents the actual content on the main page. Notice how that image is pushed back to item 6 in the waterfall chart. That’s because the get.media script (item 2) is downloaded using normal SCRIPT SRC tags. This blocks all subsequent HTTP requests in older browsers including IE 6&7. (Here we’re using IE7.)

In addition, the get.media script is served through a redirect in IE (but not in Firefox). For IE a total of three sequential HTTP requests must be completed before the ad is returned. The ad is inserted using document.write, which can further block the main content on the page. In this one page load, the main content (valueclick-waterfall.png) is blocked for 338 + 345 + 163 = 846 milliseconds.

In my analysis of ad snippets I focus only on the ad framework, not on the actual ads. The ValueClick ad framework is very light – just two redirects and one small script that does document.write. Therefore, there are only a few problem areas in which to look for performance improvements, but they’re big:

1. The redirects block the page.

In IE there are two redirects in front of the get.media script. This is two round trips from the user’s browser to the ValueClick servers and back again. The fact that these redirects don’t occur in Firefox leads me to believe that there’s a workaround for IE. Given that over 50% of Internet traffic uses IE, removing these redirects would have a positive impact on a significant number of users.

2. The get.media script blocks the main content of the page from loading.

It would be better to load the script without blocking, similar to what BuySellAds.com does.

3. The ad is inserted using document.write.

Scripts that use document.write slow down the page because they can’t be loaded asynchronously. Inserting ads into a page without using document.write can be tricky. BuySellAds.com solves this problem by creating a DIV with the desired width and height to hold the ad, and then setting the DIV’s innerHTML.


Ad networks are an amazing piece of technology. Having so many different companies share such a variety of content across millions of web sites is a real accomplishment. Techniques like document.write and scripts that block have made this possible. But the Web has evolved since these techniques were considered acceptable.

It’s critical that ad providers adopt new web development patterns so they can hit that win-win-win of a fast user experience, publisher content that renders immediately, and ads that appear quickly to drive impressions and click throughs.


P3PC: Google AdSense

March 29, 2010 1:44 pm | 6 Comments

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at Google AdSense. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
big | 87 | 84 | y | 8 | 41 kB | 76 kB | 9 | 222 ms

After signing up for Google AdSense, you can set up different types of ads. I chose “AdSense for Content” (listed first). Here’s what an example ad looks like. (This is a static image. Go to the Compare page to see the snippet live.)

Snippet Code

Let’s look at the actual snippet code:

1: <script type="text/javascript"><!--
2: google_ad_client = "pub-0478442537074871";
3: /* 300x250, created 3/6/10 */
4: google_ad_slot = "4427977761";
5: google_ad_width = 300;
6: google_ad_height = 250;
7: //-->
8: </script>
9: <script type="text/javascript"
10: src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
11: </script>
snippet code as of March 10, 2010

A quick walk through the snippet code:

  • lines 2-6 – Define global variables that are used by the show_ads.js script.
  • lines 9-11 – Load the show_ads.js script.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. It reveals a lot of idiosyncrasies with scripts, browsers, and HTTP. Let’s step through each request.

  • item 1: compare.php – The HTML document.
  • item 2: show_ads.js – The main script. All other downloads are blocked because this is loaded using the SCRIPT SRC HTML tag.
  • items 3-5 – Other scripts that are loaded dynamically by show_ads.js. This dynamic loading is done by using document.write. Using document.write causes the scripts themselves to be downloaded in parallel in IE (as evidenced by the waterfall chart). However, subsequent resources are still blocked (item 7).
  • item 6: ads – The actual ad content from doubleclick.net. Google AdSense creates an iframe to hold the ad, so this is the HTML document contained in that iframe.
  • item 7: *-waterfall.png – The waterfall image in this page. This is the main content of the page. Notice how it’s blocked by the previous four scripts.
  • items 8-9: abg-en-100c-000000.png – The “Ads by Google” image. This is loaded twice in IE because it’s referenced twice in the ad: as an IMG and as part of AlphaImageLoader.
  • item 10: sma8.js – A script loaded by the ad. Because the ad is in an iframe, this script won’t block any resources in the main page.

Now that we have a handle on the HTTP requests involved, let’s look at the most important performance issues along with recommended solutions.

1. The scripts block the main content of the page from loading.

It would be better to load the scripts without blocking. This isn’t possible with the current implementation because the ad is inserted using document.write. Calling document.write from scripts loaded asynchronously may lead to ads being inserted in the wrong location or potentially the entire page being blank. The ideal solution would be to create a DIV with the desired width and height to hold the ad and load the scripts asynchronously, similar to what BuySellAds.com does.

2. Most of the resources are only cacheable for a day.

It’s understandable that show_ads.js is only cacheable for a day. If this script changed (bug fix, new feature), there would be no way to rev the filename (since the snippet is embedded in the publishers’ pages). A short expiration date ensures users will get the updated version sooner (within a day). However, expansion_embed.js, abg-en-100c-000000.png, and sma8.js are also only cacheable for a day. These should have a far future expiration date (a year or more). If there was a change to expansion_embed.js (for example), the new version could be pushed with a modified filename (expansion_embed.1.1.js) and the code in show_ads.js could be modified to reference this new filename.
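The two caching policies described here could be expressed roughly as follows. This is an illustrative sketch only: the function cacheHeaders and its parameters are my own invention, and it says nothing about how Google’s servers are actually configured.

```javascript
// Hypothetical sketch: choose caching headers based on whether the
// filename can be revved when the file changes.
var ONE_DAY = 24 * 60 * 60;        // seconds
var ONE_YEAR = 365 * ONE_DAY;      // seconds

function cacheHeaders(isVersionedFile, now) {
  // A file with a fixed URL (like the script in the publisher's snippet)
  // gets a short lifetime; a revved file can be cached far into the future.
  var maxAge = isVersionedFile ? ONE_YEAR : ONE_DAY;
  return {
    'Cache-Control': 'public, max-age=' + maxAge,
    'Expires': new Date(now.getTime() + maxAge * 1000).toUTCString()
  };
}
```

Under this scheme show_ads.js would get cacheHeaders(false, new Date()), while a revved file such as expansion_embed.1.1.js would get cacheHeaders(true, new Date()).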

3. abg-en-100c-000000.png is downloaded twice.

This is happening because an AlphaImageLoader filter is used to achieve alpha transparency in IE 6. The HTML looks like this:

1: <span style="display:inline-block;height:16px;width:78px;
2:     filter:progid:DXImageTransform.Microsoft.AlphaImageLoader(
3:     src='http://…/abg-en-100c-000000.png');">
4: <img src=http://…/abg-en-100c-000000.png
5:     alt="Ads by Google" border=0 height=16 width=78
6:     style=filter:progid:DXImageTransform.Microsoft.Alpha(opacity=0)>
7: </span>

Notice that abg-en-100c-000000.png is used in the src in both lines 3 and 4. I’m not sure why the IMG is being used with an opacity of 0 (preserve space?), but I bet there’s a workaround. Also, this should only be necessary in IE 6, so the AlphaImageLoader should be skipped for IE 7&8.

4. Five scripts are downloaded.

Some of the scripts could be combined to reduce the number of HTTP requests.



P3PC: Quantcast

March 23, 2010 5:57 pm | Leave a comment

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at Quantcast. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
small | 93 | 98 | n | 2 | 3 kB | 3 kB | 2 | 53 ms

Quantcast does web analytics. They go beyond the typical traffic stats, providing information about user demographics for advertisers. It was easy to sign up for Quantcast and embed their snippet. Go to the Compare page to see the snippet in action.

Snippet Code

Let’s look at the actual snippet code:

1: <!-- Start Quantcast tag -->
2: <script type="text/javascript">
3: _qoptions={
4: qacct:"p-d0TvozDaU_91o"
5: };
6: </script>
7: <script type="text/javascript" src="http://edge.quantserve.com/quant.js"></script>
8: <noscript>
9: <img src="http://pixel.quantserve.com/pixel/p-d0TvozDaU_91o.gif" style="display: none;" border="0" height="1" width="1" alt="Quantcast"/>
10: </noscript>
11: <!-- End Quantcast tag -->
snippet code as of March 22, 2010

A quick walk through the snippet code:

  • lines 3-4 – Define some variables used by the script.
  • line 7 – Load the quant.js script.
  • lines 8-10 – Provide a NOSCRIPT block that loads an image beacon using HTML.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. Let’s step through each request.

  • item 1: compare.php – The HTML document.
  • item 2: quant.js – The main Quantcast script. This is loaded using normal SCRIPT SRC HTML tags, so it blocks subsequent resources in IE7 and older browsers, but not in newer browsers.
  • item 3: pixel.quantserve.com/pixel.gif – A beacon back to Quantcast.
  • item 4: *-waterfall.png – The waterfall image in this page. This is the main content of the page. Notice how it’s blocked by quant.js in IE7.

The Quantcast snippet is fairly performant. The few things worth noting are:

1. The quant.js script blocks resources and rendering.

The quant.js script is loaded using normal SCRIPT SRC HTML tags. Newer browsers (IE 8, Firefox 3.6, Safari 4, Chrome 2+) download this in parallel with subsequent resources. But in IE 6&7 and Opera all subsequent resources are blocked until quant.js is done downloading, as shown in the waterfall chart. In all browsers, all DOM elements below the SCRIPT tag are blocked from rendering and all JavaScript is blocked from executing. Since nothing else in the page depends on quant.js, it would be better to load it asynchronously, as is done with Google Analytics’ async snippet.

2. quant.js is only cacheable for one day.

This is the script that publishers add to their pages. As such, it has to have a short expiration time so that the end users will update their cache somewhat frequently to get bug fixes and other updates. However, one day might be too aggressive. As a point of comparison, Google Analytics’ ga.js is cacheable for one week.

3. The beacon returns a 200 HTTP status code.

I recommend returning a 204 (No Content) status code for beacons. A 204 response has no body and browsers will never cache it, which is exactly what we want from a beacon. In this case, the image body is less than 100 bytes, and the beacon’s HTTP headers prevent it from being cached. Although the savings are minimal, using a 204 response for beacons is a good best practice. Quantcast’s NOSCRIPT beacon, on the other hand, should return a 200 status code to avoid the browser thinking there’s an error.
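On the server side, a script-initiated beacon that answers with a 204 could look roughly like this. It’s a sketch written against Node.js’s built-in http module, not anything Quantcast runs; the handler name handleBeacon and the port number are made up for illustration.

```javascript
// Hypothetical Node.js handler for a script-initiated beacon.
function handleBeacon(req, res) {
  // ...record the hit from req.url here...
  res.writeHead(204, { 'Cache-Control': 'no-cache, no-store' });
  res.end();   // a 204 response carries no body
}

// To serve it (assumed port):
//   require('http').createServer(handleBeacon).listen(8080);
```

The NOSCRIPT image beacon would need a separate handler that returns a 200 with a tiny image, for the reason given above.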

Overall, Quantcast has a small impact on page performance. The most important improvement would be to load quant.js asynchronously.


P3PC: BuySellAds.com

March 16, 2010 7:09 am | 4 Comments

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at BuySellAds.com. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
small | 81 | 92 | n | 3 | 7 kB | 14 kB | 9 | 28 ms

After signing up for BuySellAds.com, you can set up different types of ads. I chose an image-only 125×125 ad. Since this is a test page, I can’t get real ads. Check out Webdesigner Depot or All Things Cupcake to see some real ads. The folks at BuySellAds.com set me up with a test ad for demo purposes. Here’s what the test ad looks like. (This is a static image. Go to the Compare page to see the snippet live.)

Snippet Code

Let’s look at the actual snippet code:

1: <!-- BuySellAds.com Ad Code -->
2: <script type="text/javascript">
3:     (function(){
4:         var bsa = document.createElement('script');
5:         bsa.type = 'text/javascript';
6:         bsa.async = true;
7:         bsa.src = '//s3.buysellads.com/ac/bsa.js';
8:         (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(bsa);
9:     })();
10: </script>
11: <!-- END BuySellAds.com Ad Code -->
12: <!-- BuySellAds.com Zone Code -->
13: <div id="bsap_1245700" class="bsarocks bsap_84a5f8f4c8e4c1bb2c57948fba2d9cc4"></div>
14: <!-- END BuySellAds.com Zone Code -->
snippet code as of March 14, 2010

A quick walk through the snippet code:

  • lines 3-9 – Dynamically load the bsa.js script.
  • line 13 – Create a DIV to hold the ad.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. Let’s step through each request.

  • item 1: compare.php – The HTML document.
  • item 2: bsa.js – The main script. This is loaded dynamically, so it doesn’t block other downloads.
  • item 3: *-waterfall.png – The waterfall image in this page. This is the main content of the page. Notice how it loads in parallel with bsa.js.
  • item 4: s_84a5f8f4c8e4c1bb2c57948fba2d9cc4.js – A JSON response containing the ad content. This resource was added dynamically by bsa.js.
  • item 6: 18446-1268342919.png – The image contained in the test ad.
  • item 7: imp.gif – A beacon.

The amazing thing about the BuySellAds.com snippet is that it loads ads asynchronously. Most web developers are familiar with the performance delays inflicted by ads with scripts that block the main content in the page or, even worse, scripts that use document.write so any hope of parallelization is dashed. BuySellAds.com is the only ad snippet that I’ve seen that avoids these blocking issues. (If you know of others, please add a comment mentioning them.)

Asynchronous loading is achieved as a result of two things:

  1. dynamically loading bsa.js (as opposed to using normal SCRIPT SRC HTML tags)
  2. creating a DIV placeholder for the ad content (as opposed to using document.write)

How is the ad actually loaded into the DIV? The bsa.js script dynamically adds a script (s_84a5f8f4c8e4c1bb2c57948fba2d9cc4.js) containing the ad as a JSON response. That JSON response calls a function from bsa.js (interpret_json) that extracts the DIV’s id from the JSON object and sets its innerHTML. I like how the DIV’s id and classname are used, as opposed to doing this through JavaScript variables set in the snippet.
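The callback flow just described can be sketched like this. Only the name interpret_json comes from bsa.js; the factory function makeInterpretJson and the payload fields id and html are my assumptions, since I’m not reproducing the actual response format here.

```javascript
// Hypothetical sketch of the callback flow. Only the name interpret_json
// comes from bsa.js; the payload fields `id` and `html` are assumed.
// `doc` is normally the global `document`.
function makeInterpretJson(doc) {
  return function interpret_json(ad) {
    var slot = doc.getElementById(ad.id);   // the DIV from the snippet
    if (slot) {
      slot.innerHTML = ad.html;             // no document.write needed
    }
  };
}

// In the page: var interpret_json = makeInterpretJson(document);
// The dynamically added script's response would then be a single call,
// something like: interpret_json({ "id": "bsap_1245700", "html": "..." });
```

Because the response is just a function call, it can arrive at any time after the page has rendered and still find its DIV.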

Loading ads asynchronously is a big advantage of BuySellAds.com. But there are still a few more performance improvements that could be made.

1. The size of the DIV changes causing the page to re-layout.

I used WebPagetest.org to create a filmstrip of images showing the page loading. Notice how the waterfall chart appears at 1.5 seconds. At 2.0 seconds the ad is loaded causing the waterfall chart to shift downward. It would be better if the snippet set the DIV’s width and height to the appropriate values for the selected ad size.

2. bsa.js isn’t cached.

This is the script that publishers add to their pages. As such, it has to have a short expiration time so that the file cached by users is updated frequently. However, no expiration date causes browsers to check for updates too frequently. A 1 day or 1 week expiration date would strike a better balance between performance and update frequency.

3. The beacon returns a 200 HTTP status code.

I recommend returning a 204 (No Content) status code. A 204 response has no body and browsers will never cache it, which is exactly what we want from a beacon. In this case, the image body is less than 100 bytes, and the beacon’s HTTP headers prevent it from being cached. Although the savings are minimal, using a 204 response for beacons is a good best practice.

Hats off to the folks at BuySellAds.com for showing that asynchronous ads are possible. I’ll examine a few more ad snippets in the coming weeks. We’ll see how they stack up when it comes to performance.


P3PC: Google Analytics

March 3, 2010 7:25 pm | 16 Comments

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at Google Analytics. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
small | 91 | 99 | n | 2 | 18 kB | 24 kB | 2 | 19 ms

Google Analytics recently had a nice performance upgrade. The old snippet used document.write and thus blocked other resources and rendering, making pages feel slower. On Dec 1, 2009, Google announced the launch of their new asynchronous snippet. The old snippet still works, but sites that want a significant speedup should use the new async snippet.

My analysis focuses on the new async snippet for Google Analytics.

Snippet Code

Let’s look at the actual snippet code:

1: <script type=”text/javascript”>
2: var _gaq = _gaq || [];
3: _gaq.push(['_setAccount', 'UA-15026169-1']);
4: _gaq.push(['_trackPageview']);
5:
6: (function() {
7: var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
8: ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
9: (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(ga);
10: })();
11: </script>
snippet code as of March 3, 2010

A quick walk through the snippet code:

  • lines 2-4 – Push commands onto a queue to be executed once the async script finishes loading.
  • lines 6-10 – Create a script DOM element and set the src to fetch ga.js.
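The queue in lines 2-4 works because _gaq starts out as a plain array, so push is available before ga.js arrives. The sketch below illustrates the general pattern only; it is not the actual ga.js code. The names makeTracker, replayQueue, and log are hypothetical, and the account ID is a placeholder.

```javascript
// Simplified sketch of the command-queue pattern; NOT the real ga.js code.
// makeTracker, replayQueue, and log are hypothetical names for illustration.

// Before the async script loads: commands are queued in a plain array.
var _gaq = [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']); // placeholder account ID
_gaq.push(['_trackPageview']);

// Hypothetical tracker that records each command instead of sending a beacon.
function makeTracker(log) {
  return {
    _setAccount: function (id) { log.push('account=' + id); },
    _trackPageview: function () { log.push('pageview'); }
  };
}

// What a loaded script might do: run the queued commands in order, then
// return an object whose push() executes new commands immediately.
function replayQueue(queue, tracker) {
  function run(cmd) {
    tracker[cmd[0]].apply(tracker, cmd.slice(1));
  }
  for (var i = 0; i < queue.length; i++) {
    run(queue[i]);
  }
  return { push: run };
}

var log = [];
_gaq = replayQueue(_gaq, makeTracker(log));

// After the swap, the same push() call runs right away.
_gaq.push(['_trackPageview']);
```

The page code never needs to know whether the script has loaded: it calls _gaq.push either way, and only the meaning of push changes.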

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. Notice how the image in the main page (google-analytics-waterfall.png) is not blocked by ga.js – they both download in parallel.

Here are the most important performance issues along with recommended solutions.

1. 24 kB seems like a lot of JavaScript for sending a tracking beacon.

My example is very straightforward and doesn’t use much Google Analytics functionality. The three sites I run that use Google Analytics are all simple. I’d like to see a “lite” version of ga.js for sites like mine and whittle that 24 kB down to 5 kB or so.

2. The beacon returns a 200 HTTP status code.

I recommend returning a 204 (No Content) status code. A 204 response has no body and browsers will never cache it, which is exactly what we want from a beacon. In this case, omitting the body saves only 35 bytes, and the beacon’s HTTP headers already prevent it from being cached. Although the savings are minimal, using a 204 response for beacons is a good practice.
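As a rough illustration of the server side, here is a minimal sketch of a beacon handler that answers with a 204. It assumes a Node.js-style response object (writeHead and end); handleBeacon is a hypothetical name, and the cache headers are my assumption rather than anything Google Analytics is known to send.

```javascript
// Minimal sketch of a beacon endpoint that answers 204 No Content.
// handleBeacon is a hypothetical name; res is assumed to behave like
// Node's http.ServerResponse (writeHead and end).
function handleBeacon(req, res) {
  // A real handler would record the data carried in req.url here.
  res.writeHead(204, {
    // Assumed headers asking caches not to store the response.
    'Cache-Control': 'no-cache, no-store, must-revalidate',
    'Pragma': 'no-cache'
  });
  res.end(); // 204 means there is no body to send
}
```

With Node's built-in http module this could be wired up as require('http').createServer(handleBeacon).listen(8080), where the port is arbitrary.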

What you can do now: The new async version of the Google Analytics snippet is pretty slim. Even YSlow and Page Speed can’t find much wrong with it. My main advice would be to switch over to this async version if you’re still using the old document.write snippet. Cruising through the Alexa U.S. top 50 web sites I found two web sites that use the new async snippet: Huffington Post and Answers.com. But these web sites haven’t moved over yet: Twitter, FOXNews.com, Reference.com, Photobucket, Hulu, DoubleClick.com, and (ahem) Blogger. To be fair, the new snippet was launched as Beta just a few months ago, but all of these sites should switch over to the async snippet and speed things up for their users.



P3PC: Facebook Share

March 1, 2010 8:35 pm | 1 Comment

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at Facebook Share. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
small | 90 | 92 | n | 5 | 8 kB | 7 kB | 15 | 104 ms

Facebook Share is a way to share a URL with the Facebook community. Here’s what it looks like. (This is a static image. Go to the Compare page to see the live widget.)

Snippet Code

Let’s look at the actual snippet code:

1: <a name="fb_share" type="button_count" share_url="http://stevesouders.com/" href="http://www.facebook.com/sharer.php">Share</a>
2: <script src="http://static.ak.fbcdn.net/connect.php/js/FB.Share" type="text/javascript"></script>
snippet code as of March 1, 2010

A quick walk through the snippet code:

  • line 1 – The anchor that will be filled in later.
  • line 2 – The FB.Share script is downloaded. The actual code is minified, but I’ve expanded some of it here for easier readability. The _onFirst function is called (line 1 below). _onFirst inserts the share-button-css stylesheet (lines 2-6) and then calls renderPass. renderPass calls fetchData which inserts the restserver.php script (lines 12-14). The insert function appends the DOM element (stylesheet or script) to the head or body of the document (line 19).
    1: _onFirst: function() {
    2: var a=document.createElement('link');
    3: a.rel='stylesheet';
    4: a.type='text/css';
    5: a.href='http://static.ak.fbcdn.net/connect.php/css/share-button-css';
    6: this.insert(a);
    7: renderPass();
    8: [...]
    9: },
    10:
    11: fetchData: function() {
    12: var a=document.createElement('script');
    13: a.src=this.addQS('http://api.ak.facebook.com/restserver.php', {v:'1.0',method:[...]});
    14: this.insert(a);
    15: [...]
    16: },
    17:
    18: insert: function(a) {
    19: (document.getElementsByTagName('HEAD')[0]||document.body).appendChild(a);
    20: },

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. Item 5 (facebook-sharer-waterfall.png) is the first resource that’s part of the main page. Notice how it’s blocked by the first Facebook Share script, but then loads in parallel with the widget’s stylesheet and second script.

Here are the most important performance issues along with recommended solutions.

1. The FB.Share script blocks resources and rendering.

It’s great that the stylesheet (share-button-css) and second script (restserver.php) are loading dynamically and don’t block the main page’s resources (facebook-sharer-waterfall.png). But the first script (FB.Share) is loaded with the typical SCRIPT SRC tags and so does block in older browsers, and even in newer browsers will have some blocking effects. Because there are no code dependencies and no use of document.write (yay!), the FB.Share script can also be loaded asynchronously. Here’s what the async snippet might look like. (Warning: I haven’t tested this on all browsers.)

<a name="fb_share" type="button_count" share_url="http://stevesouders.com/" href="http://www.facebook.com/sharer.php">Share</a>
<script type="text/javascript">
(function() {
  var domscript = document.createElement("script");
  domscript.type = "text/javascript";
  domscript.src = "http://static.ak.fbcdn.net/connect.php/js/FB.Share";
  (document.getElementsByTagName("head")[0] || document.getElementsByTagName("body")[0]).appendChild(domscript);
}());
</script>

This async snippet is used in the Facebook Sharer Improved example. The HTTP waterfall chart for this async snippet is shown below. The most important thing about loading the FB.Share script asynchronously is that the main page’s content can load more quickly. Notice how the image in the main page (facebook-sharer-waterfall.png) moves from item 5 to item 3, and its load time moves from ~1100 ms to ~780 ms. Another benefit is that the overall page load time is faster, dropping from ~1100 ms to ~900 ms.

2. share-button-css is only cached for a few minutes

share-button-css should be given a far future expiration date. If it changes, the filename referenced in FB.Share could be modified, guaranteeing that everyone gets the updated file.
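To make the idea concrete, here is a small sketch of far future caching combined with a version in the filename. The helpers versionedUrl and farFutureHeaders are hypothetical, the "-v2" naming scheme is an assumption (not how Facebook names its files), and one year is just a common choice for "far future".

```javascript
// Sketch of far-future caching with a version in the filename.
// versionedUrl and farFutureHeaders are hypothetical helpers; the "-v"
// suffix scheme is an assumption made for illustration.
var ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Changing the version yields a new URL, so browsers fetch the new file
// even though the old one is still fresh in their caches.
function versionedUrl(baseName, version) {
  return baseName + '-v' + version;
}

// Response headers that let browsers cache the file for a year.
function farFutureHeaders(now) {
  var expires = new Date(now.getTime() + ONE_YEAR_SECONDS * 1000);
  return {
    'Cache-Control': 'public, max-age=' + ONE_YEAR_SECONDS,
    'Expires': expires.toUTCString()
  };
}

var url = versionedUrl('share-button-css', 2);
var headers = farFutureHeaders(new Date(Date.UTC(2010, 2, 1)));
```

The trade-off is that the script that references the stylesheet must be updated whenever the stylesheet changes, which is why that script itself cannot be cached as aggressively.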

3. The CSS could be reduced.

Page Speed reports that 50% (2.9 kB) of the CSS isn’t used. It’s possible the CSS is used in other manifestations of the widget but not in this default view.

What you can do now: Facebook Share is a lightweight widget as widgets go. In addition to a small transfer size and small amount of JavaScript, its images and CSS selectors are also optimized. But if you wanted to reduce the impact even further, you could try loading FB.Share asynchronously.



P3PC: Digg Widget

February 28, 2010 1:33 pm | 10 Comments

Update: In March 2010, Digg revamped their snippet to have much better performance. See their blog post Speeding Along.

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. This blog post looks at the Digg Widget. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
big | 90 | 84 | y | 9 | 52 kB | 107 kB | 84 | 667 ms

The Add a Digg Widget page describes how to insert this widget into a page. Here’s what it looks like:

Snippet Code

Let’s look at the actual snippet code:

1: <script type="text/javascript">
2: digg_id = 'digg-widget-container'; //make this id unique for each widget you put on a single page.
3: digg_title = 'Top 10 list from Technology';
4: </script>
5: <script type="text/javascript" src="http://digg.com/tools/widgetjs"></script>
6: <script type="text/javascript" src="http://digg.com/tools/services?type=javascript&amp;callback=diggwb&amp;endPoint=%2Fstories%2Fcontainer%2Ftechnology%2Ftop&amp;count=10"></script>
snippet code as of Feb 23, 2010

A quick walk through the snippet code:

  • lines 1-4 – Variables used later in the loaded scripts.
  • line 5 – The “widgetjs” script is where most of the action takes place.
    This script uses document.write to load a stylesheet and two more scripts:

    document.write('<link rel="stylesheet" type="text/css" media="all" href="http://digg.com/css/widget.css" />');
    document.write('<script type="text/javascript" src="http://cotnet.diggstatic.com/js/loader/380/JS_Libraries,jquery|JS_Libraries,jquery-noconflict|jquery-dom"></script>');
    document.write('<script type="text/javascript" src="http://digg.com/tools/widgetjsvars"></script>');

    The “widgetjsvars” script does several document.writes including loading another script:

    document.write('[...]<script src="http://cotnet.diggstatic.com/js/loader/395/omnidiggthis" type="text/javascript"></script>');
  • line 6 – The “/tools/services” script contains the data (stories) for the widget.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. Item 10 (digg-waterfall.png) is the first resource that’s part of the main page. Notice how it’s blocked by the Digg widget’s scripts.

Here are the most important performance issues along with recommended solutions.

  1. 9 HTTP requests, 52 kB transferred over the wire, and 107 kB of JavaScript (uncompressed) is a lot of content for a single widget.
    Recommendations:

    • Concatenate these three scripts: JS_Libraries, widgetjsvars, and omnidiggthis. (eliminates 2 HTTP requests)
    • Run Page Speed’s “Defer loading JavaScript” feature and see how much of the JavaScript is not used. If it’s sizable, delete it. (This feature is currently broken in the latest version of Page Speed, but a fix is imminent.) (eliminates ?? kB)
    • Optimize the images – widget-logo.png and get-widget.png can both be reduced by ~3 kB. (eliminates ~6 kB)
    • Sprite widget-logo.png and shade-com.png. (eliminates 1 HTTP request)
  2. The widget’s scripts block the main page’s content from downloading. Looking at the waterfall chart, the main page includes the image “digg-waterfall.png” (row 10). Notice how this image doesn’t start downloading until after all the scripts for the Digg widget are received.
    Recommendations:

    • Instead of loading the scripts using document.write, load them without blocking other downloads. The scripts are already suffering from race condition behavior, as evidenced by this comment from widgetjsvars:
      if (!digg || !digg.$) setTimeout(function() { diggwb(obj); }, 200); //hack for IE not loading scripts that are included via document.write until it decides too

    So it probably isn’t too much work to avoid race conditions when making all the scripts load asynchronously.

  3. The widget’s stylesheet blocks the main page from rendering in IE.
    Recommendations:

    • Instead of loading the stylesheet using document.write, load it via JavaScript as described in 5d dynamic stylesheets.
  4. Four of the resources aren’t cached long enough.
    Recommendations:

    • Two scripts aren’t cacheable because they have an expiration date in the past. widgetjs is part of the snippet, so it can’t have a long expiration date, but something like an hour or a day would be better than a date in the past. widgetjsvars could have a far future expiration date since its URL is specified in widgetjs.
    • The three images are only cacheable for a day. They should have a far future Expires header since the image filename can be changed if the image is modified.
  5. There are approximately 30 inefficient CSS selectors. Because this stylesheet is part of the main page, the selectors will cause the overall page to render more slowly when these selectors are applied to the elements in the main page.
    Recommendations:

  6. Four of the resources have ETags which reduces their cacheability.
    Recommendations:

    • Configure the ETags for widget.css, widget-logo.png, get-widget.png, and shade-com.png.
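Recommendations 2 and 3 above both come down to inserting DOM elements instead of calling document.write. The sketch below shows one way that might look. The helpers loadStylesheet, loadScript, and insertNode are hypothetical and are not Digg's code; the document is passed in as doc only so the helpers can be tried outside a browser.

```javascript
// Sketch of loading a stylesheet and a script without document.write.
// insertNode, loadStylesheet, and loadScript are hypothetical helpers.
// In a browser, pass the global document as `doc`.

function insertNode(doc, node) {
  (doc.getElementsByTagName('head')[0] || doc.body).appendChild(node);
  return node;
}

function loadStylesheet(doc, href) {
  var link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.type = 'text/css';
  link.href = href;
  return insertNode(doc, link);
}

// Calls onReady once after the script has loaded, instead of polling
// with setTimeout until a global appears.
function loadScript(doc, src, onReady) {
  var script = doc.createElement('script');
  var done = false;
  script.type = 'text/javascript';
  script.src = src;
  // onload fires in most browsers; older IE reports via onreadystatechange.
  script.onload = script.onreadystatechange = function () {
    var state = script.readyState;
    if (done || (state && state !== 'loaded' && state !== 'complete')) {
      return;
    }
    done = true;
    script.onload = script.onreadystatechange = null;
    if (onReady) { onReady(); }
  };
  return insertNode(doc, script);
}
```

In a page this might be called as loadStylesheet(document, 'http://digg.com/css/widget.css') followed by loadScript(document, scriptUrl, callback), with the code that depends on the script moved into the callback so the race condition goes away.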

What you can do now: Because the Digg Widget uses document.write, the best thing you can do to reduce the impact it has on your page is to put it in an iframe. This will remove the blocking effect it has on your page.

