P3PC: BuySellAds.com

March 16, 2010 7:09 am | 4 Comments

P3PC is a project to review the performance of 3rd party content such as ads, widgets, and analytics. You can see all the reviews and stats on the P3PC home page. This blog post looks at BuySellAds.com. Here are the summary stats:

impact on page | Page Speed | YSlow | doc.write | total reqs | total xfer size | JS ungzip | DOM elems | median Δ load time
small | 81 | 92 | n | 3 | 7 kB | 14 kB | 9 | 28 ms
Click here to see how your browser performs compared to the median load time shown above.

After signing up for BuySellAds.com, you can set up different types of ads. I chose an image-only 125×125 ad. Since this is a test page, I can’t get real ads. Check out Webdesigner Depot or All Things Cupcake to see some real ads. The folks at BuySellAds.com set me up with a test ad for demo purposes. Here’s what the test ad looks like. (This is a static image. Go to the Compare page to see the snippet live.)

Snippet Code

Let’s look at the actual snippet code:

1: <!-- BuySellAds.com Ad Code -->
2: <script type="text/javascript">
3:     (function(){
4:         var bsa = document.createElement('script');
5:         bsa.type = 'text/javascript';
6:         bsa.async = true;
7:         bsa.src = '//s3.buysellads.com/ac/bsa.js';
8:         (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(bsa);
9:     })();
10: </script>
11: <!-- END BuySellAds.com Ad Code -->
12: <!-- BuySellAds.com Zone Code -->
13: <div id="bsap_1245700" class="bsarocks bsap_84a5f8f4c8e4c1bb2c57948fba2d9cc4"></div>
14: <!-- END BuySellAds.com Zone Code -->
snippet code as of March 14, 2010

A quick walk through the snippet code:

  • lines 3-9 – Dynamically load the bsa.js script.
  • line 13 – Create a DIV to hold the ad.

Performance Analysis

This HTTP waterfall chart was generated by WebPagetest.org using IE 7 with a 1.5Mbps connection from Dulles, VA. Let’s step through each request.

  • item 1: compare.php – The HTML document.
  • item 2: bsa.js – The main script. This is loaded dynamically, so it doesn’t block other downloads.
  • item 3: *-waterfall.png – The waterfall image in this page. This is the main content of the page. Notice how it loads in parallel with bsa.js.
  • item 4: s_84a5f8f4c8e4c1bb2c57948fba2d9cc4.js – A JSON response containing the ad content. This resource was added dynamically by bsa.js.
  • item 6: 18446-1268342919.png – The image contained in the test ad.
  • item 7: imp.gif – A beacon.

The amazing thing about the BuySellAds.com snippet is that it loads ads asynchronously. Most web developers are familiar with the performance delays inflicted by ads with scripts that block the main content in the page or, even worse, scripts that use document.write, which dashes any hope of parallelization. BuySellAds.com is the only ad snippet that I’ve seen that avoids these blocking issues. (If you know of others, please add a comment mentioning them.)

Asynchronous loading is achieved as a result of two things:

  1. dynamically loading bsa.js (as opposed to using normal SCRIPT SRC HTML tags)
  2. creating a DIV placeholder for the ad content (as opposed to using document.write)

How is the ad actually loaded into the DIV? The bsa.js script dynamically adds a script (s_84a5f8f4c8e4c1bb2c57948fba2d9cc4.js) containing the ad as a JSON response. That JSON response calls a function from bsa.js (interpret_json) that extracts the DIV’s id from the JSON object and sets its innerHTML. I like how the DIV’s id and classname are used, as opposed to doing this through JavaScript variables set in the snippet.
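
The flow described above can be sketched roughly as follows. This is an illustration of the pattern, not the actual bsa.js code: the function name interpretAdResponse and the payload shape (zoneId, html) are assumptions made for the example, and the optional doc parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical callback standing in for the role interpret_json plays in bsa.js.
// Assumed payload shape: { zoneId: "bsap_1245700", html: "<img src='ad.png'>" }.
function interpretAdResponse(payload, doc) {
  doc = doc || document;
  var zone = doc.getElementById(payload.zoneId);
  if (!zone) {
    return false; // placeholder DIV is not in the page, so there is nothing to fill
  }
  zone.innerHTML = payload.html; // drop the ad markup into the placeholder
  return true;
}

// The dynamically added script is ordinary JavaScript that wraps the JSON in a call:
//   interpretAdResponse({ zoneId: "bsap_1245700", html: "<img src='ad.png'>" });
```

Because the response is itself a script, it can be fetched from another domain with a script element, and nothing in it needs document.write.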

Loading ads asynchronously is a big advantage of BuySellAds.com. But there are still a few more performance improvements that could be made.

1. The size of the DIV changes causing the page to re-layout.

I used WebPagetest.org to create a filmstrip of images showing the page loading. Notice how the waterfall chart appears at 1.5 seconds. At 2.0 seconds the ad is loaded, causing the waterfall chart to shift downward. It would be better if the snippet set the DIV’s width and height to the appropriate values for the selected ad size.
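
As a rough illustration of that suggestion, a publisher (or the snippet itself) could reserve the ad's footprint before the ad arrives. The helper name reserveAdSpace is made up for this sketch, and the 125×125 default simply matches the ad size used in this test.

```javascript
// Reserve the ad's footprint up front so content below it does not shift later.
// Defaults assume the 125x125 ad chosen for this test page.
function reserveAdSpace(zone, width, height) {
  zone.style.width = (width || 125) + "px";
  zone.style.height = (height || 125) + "px";
  return zone;
}

// In a page:
//   reserveAdSpace(document.getElementById("bsap_1245700"));
```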

2. bsa.js isn’t cached.

This is the script that publishers add to their pages. As such, it has to have a short expiration time so that the file cached by users is updated frequently. However, no expiration date causes browsers to check for updates too frequently. A 1 day or 1 week expiration date would strike a better balance between performance and update frequency.
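
To make the one-day option concrete, here is a small sketch that builds the two standard HTTP caching headers for a given lifetime. The helper name cachingHeaders is an assumption for illustration; how BuySellAds.com actually configures its servers is not known from this test.

```javascript
// Build caching headers for a script that changes often but not constantly.
// maxAgeSeconds is the cache lifetime; now is injectable so the result is predictable.
function cachingHeaders(maxAgeSeconds, now) {
  now = now || new Date();
  return {
    "Cache-Control": "max-age=" + maxAgeSeconds,
    "Expires": new Date(now.getTime() + maxAgeSeconds * 1000).toUTCString()
  };
}

var ONE_DAY = 24 * 60 * 60; // 86400 seconds
// cachingHeaders(ONE_DAY) yields a Cache-Control value of "max-age=86400".
```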

3. The beacon returns a 200 HTTP status code.

I recommend returning a 204 (No Content) status code. A 204 response has no body, and browsers will never cache it, which is exactly what we want from a beacon. In this case, the image body is less than 100 bytes, and the beacon’s HTTP headers prevent it from being cached. Although the savings are minimal, using a 204 response for beacons is a good best practice.
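
For illustration, a beacon endpoint that answers with a 204 might look like the sketch below. It assumes a Node.js server, which is purely a choice for the example; the handler name handleBeacon and the in-memory log are likewise hypothetical.

```javascript
// Request handler for a beacon endpoint: record the hit, then reply with no body.
function handleBeacon(req, res, log) {
  log.push(req.url);  // stand-in for real impression logging
  res.writeHead(204); // 204 No Content: nothing to download
  res.end();
}

// Assumed wiring with Node's built-in http module:
//   var hits = [];
//   require("http").createServer(function (req, res) {
//     handleBeacon(req, res, hits);
//   }).listen(8080);
```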

Hats off to the folks at BuySellAds.com for showing that asynchronous ads are possible. I’ll examine a few more ad snippets in the coming weeks. We’ll see how they stack up when it comes to performance.

Other posts in the P3PC series:



Tethering is exhilarating

February 23, 2010 10:25 am | 40 Comments

I love my iPhone. That sentiment doubled the day I followed Nat Torkington’s pointer to Ten Second iPhone Tethering. Later that month I flew to Boston and basked in the freedom of having an Internet connection at the airport and hotel without paying wifi fees.

Then it all came crashing down. iPhone 3.1 came out. I had to choose between visual voicemail and tethering or consider jailbreaking my iPhone. Tech support in my household is limited (me) so I said goodbye to tethering. I’m back to paying hotels $10 per day to use their wifi, or signing up for a day of T-Mobile Hotspot usage at Starbucks.

Then I got my Nexus One. I really like it. It’s a huge improvement over the G1 I got last year. iPhone is still my dominant phone, but I carry the Nexus One and Palm Pre with me and am spending more time on the Nexus One.

I’m gearing up for some travel, so I revisited the topic of tethering. I was stunned when I spoke to AT&T tech support two days ago and they told me they support tethering. How did I miss this?! Then the guy said I had to jailbreak my iPhone. It seems weird to have tech support recommend jailbreaking. I guess that’s a result of the AT&T/Apple love/hate relationship. Same story with Palm Pre – gotta jailbreak it.

My hopes rose when I found articles saying you could tether with Nexus One. I installed PdaNet. That went smoothly. It works on Mac and Windows. I’m Mac at home but when I travel I take my Windows laptop, so that’s the critical platform for tethering. I’m always wary of new installations bogging down Windows, but PdaNetPC.exe uses only 17 MB of memory and 0% of CPU when not in use, so I’m fine with that running in the background.

I tested it last night at home, but the real test was this morning. I stopped for coffee at Peets, booted up Windows, tethered my Nexus One, opened an ssh session, and drove to work. At every stoplight I verified my ssh session was still active. I was reading email, surfing the Web. It was exhilarating. I know that’s incredibly geeky to say, but I revel in the freedom it gives me. +1 for tethering without jailbreaking. All smartphones should do this.


Stanford videos available

May 20, 2009 11:46 pm | 9 Comments

Last fall I taught CS193H: High Performance Web Sites at Stanford. My class was videotaped so people enrolled through the Stanford Center for Professional Development could watch during off-hours. In an earlier blog post I mentioned that SCPD was working to make the videos available. I’m pleased to announce that you can now watch these lectures on SCPD as part of the XCS193H videos. Yep, 25 hours of me talking about web performance! These videos include lectures on all the rules from my first book, High Performance Web Sites, as well as the new material from my next book, Even Faster Web Sites, due out in June.

The videos aren’t free – tuition is $600. If this is too pricey, you can watch the first three videos for free. These videos are the most thorough explanation of my performance best practices. I hope you’ll check them out.


Loading Scripts Without Blocking

April 27, 2009 10:49 pm | 46 Comments

This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.

As more and more sites evolve into “Web 2.0” apps, the amount of JavaScript increases. This is a performance concern because scripts have a negative impact on page performance. Mainstream browsers (i.e., IE 6 and 7) block in two ways:

  • Resources in the page are blocked from downloading if they are below the script.
  • Elements are blocked from rendering if they are below the script.

The Scripts Block Downloads example demonstrates this. It contains two external scripts followed by an image, a stylesheet, and an iframe. The HTTP waterfall chart from loading this example in IE7 shows that the first script blocks all downloads, then the second script blocks all downloads, and finally the image, stylesheet, and iframe all download in parallel. Watching the page render, you’ll notice that the paragraph of text above the script renders immediately. However, the rest of the text in the HTML document is blocked from rendering until all the scripts are done loading.

Scripts block downloads in IE6&7, Firefox 2&3.0, Safari 3, Chrome 1, and Opera

Browsers are single-threaded, so it’s understandable that while a script is executing the browser is unable to start other downloads. But there’s no reason that while the script is downloading the browser can’t start downloading other resources. And that’s exactly what newer browsers, including Internet Explorer 8, Safari 4, and Chrome 2, have done. The HTTP waterfall chart for the Scripts Block Downloads example in IE8 shows the scripts do indeed download in parallel, and the stylesheet is included in that parallel download. But the image and iframe are still blocked. Safari 4 and Chrome 2 behave in a similar way. Parallel downloading improves, but still falls short of what it could be.

Scripts still block, even in IE8, Safari 4, and Chrome 2

Fortunately, there are ways to get scripts to download without blocking any other resources in the page, even in older browsers. Unfortunately, it’s up to the web developer to do the heavy lifting.

There are six main techniques for downloading scripts without blocking:

  • XHR Eval – Download the script via XHR and eval() the responseText.
  • XHR Injection – Download the script via XHR and inject it into the page by creating a script element and setting its text property to the responseText.
  • Script in Iframe – Wrap your script in an HTML page and download it as an iframe.
  • Script DOM Element – Create a script element and set its src property to the script’s URL.
  • Script Defer – Add the script tag’s defer attribute. This used to only work in IE, but is now in Firefox 3.1.
  • document.write Script Tag – Write the <script src=""> HTML into the page using document.write. This only loads script without blocking in IE.
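
To make two of these techniques concrete, here is a minimal sketch of Script DOM Element and XHR Injection. The function names are made up for this example, and the optional doc and XhrCtor parameters exist only so the functions can be exercised outside a browser. The XHR version assumes a native XMLHttpRequest; IE 6 would need an ActiveXObject fallback, which is omitted here.

```javascript
// Script DOM Element: create a script element and point its src at the URL.
function loadScriptDomElement(url, doc) {
  doc = doc || document;
  var script = doc.createElement("script");
  script.src = url;
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}

// XHR Injection: fetch the source via XHR, then inject it as an inline script.
// XHR is subject to the same-origin restriction, so the script must be on the page's domain.
function loadScriptXhrInjection(url, doc, XhrCtor) {
  doc = doc || document;
  var xhr = new (XhrCtor || XMLHttpRequest)();
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      var script = doc.createElement("script");
      script.text = xhr.responseText; // executes when appended
      doc.getElementsByTagName("head")[0].appendChild(script);
    }
  };
  xhr.open("GET", url, true); // true = asynchronous
  xhr.send(null);
}
```

XHR Eval differs from XHR Injection only in the last step: it passes responseText to eval() instead of creating a script element.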

You can see an example of each technique using Cuzillion. It turns out that these techniques have several important differences, as shown in the following table. Most of them provide parallel downloads, although Script Defer and document.write Script Tag are mixed. Some of the techniques can’t be used on cross-site scripts, and some require slight modifications to your existing scripts to get them to work. An area of differentiation that’s not widely discussed is whether the technique triggers the browser’s busy indicators (status bar, progress bar, tab icon, and cursor). If you’re loading multiple scripts that depend on each other, you’ll need a technique that preserves execution order.

Technique | Parallel Downloads | Domains can Differ | Existing Scripts | Busy Indicators | Ensures Order | Size (bytes)
XHR Eval | IE, FF, Saf, Chr, Op | no | no | Saf, Chr | - | ~500
XHR Injection | IE, FF, Saf, Chr, Op | no | yes | Saf, Chr | - | ~500
Script in Iframe | IE, FF, Saf, Chr, Op | no | no | IE, FF, Saf, Chr | - | ~50
Script DOM Element | IE, FF, Saf, Chr, Op | yes | yes | FF, Saf, Chr | FF, Op | ~200
Script Defer | IE, Saf4, Chr2, FF3.1 | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~50
document.write Script Tag | IE, Saf4, Chr2, Op | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~100

The question is: Which is the best technique? The optimal technique depends on your situation. This decision tree should be used as a guide. It’s not as complex as it looks. Only three variables determine the outcome: is the script on the same domain as the main page, is it necessary to preserve execution order, and should the busy indicators be triggered.

Decision tree for optimal async script loading technique

Ideally, the logic in this decision tree would be encapsulated in popular HTML templating languages (PHP, Python, Perl, etc.) so that the web developer could just call a function and be assured that their script gets loaded using the optimal technique.
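
As a rough idea of what such a function could look like, the sketch below filters techniques by the same three variables. It is not a transcription of the decision tree: it simply encodes the rows of the table above for the four techniques that download in parallel in all five browsers, and the names TECHNIQUES and candidateTechniques are invented for this example.

```javascript
// Rows taken from the comparison table; only techniques with parallel downloads everywhere.
var TECHNIQUES = [
  { name: "XHR Eval",           crossDomain: false, busy: ["Saf", "Chr"],             order: [] },
  { name: "XHR Injection",      crossDomain: false, busy: ["Saf", "Chr"],             order: [] },
  { name: "Script in Iframe",   crossDomain: false, busy: ["IE", "FF", "Saf", "Chr"], order: [] },
  { name: "Script DOM Element", crossDomain: true,  busy: ["FF", "Saf", "Chr"],       order: ["FF", "Op"] }
];

// needs.browser: "IE", "FF", "Saf", "Chr", or "Op"
// needs.crossDomain / needs.preserveOrder: true if required
// needs.busyIndicators: true or false if it matters, undefined if it does not
function candidateTechniques(needs) {
  return TECHNIQUES.filter(function (t) {
    if (needs.crossDomain && !t.crossDomain) return false;
    if (needs.preserveOrder && t.order.indexOf(needs.browser) === -1) return false;
    if (needs.busyIndicators !== undefined &&
        (t.busy.indexOf(needs.browser) !== -1) !== needs.busyIndicators) return false;
    return true;
  }).map(function (t) { return t.name; });
}
```

An empty result means none of these four techniques meets the requirements in that browser, for example when execution order must be preserved in IE.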

In many situations, the Script DOM Element is a good choice. It works in all browsers, doesn’t have any cross-site scripting restrictions, is fairly simple to implement, and is well understood. The one catch is that it doesn’t preserve execution order across all browsers. If you have multiple scripts that depend on each other, you’ll need to concatenate them or use a different technique. If you have an inline script that depends on the external script, you’ll need to synchronize them. I call this “coupling” and present several ways to do this in Coupling Asynchronous Scripts.
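
One way to keep using Script DOM Element with dependent scripts is to chain the loads so each script is requested only after the previous one has finished. The sketch below shows that common pattern; it is offered as an illustration rather than one of the coupling techniques from the book, and the name loadScriptsInOrder and the optional doc parameter are assumptions.

```javascript
// Load scripts one at a time so each one executes before the next is requested.
function loadScriptsInOrder(urls, doc, done) {
  doc = doc || document;
  var head = doc.getElementsByTagName("head")[0];

  function next(i) {
    if (i >= urls.length) {
      if (done) done();
      return;
    }
    var script = doc.createElement("script");
    var fired = false;
    // onload covers most browsers; onreadystatechange covers older IE.
    script.onload = script.onreadystatechange = function () {
      var state = script.readyState; // only set by older IE
      if (fired || (state && state !== "loaded" && state !== "complete")) return;
      fired = true;
      script.onload = script.onreadystatechange = null;
      next(i + 1);
    };
    script.src = urls[i];
    head.appendChild(script);
  }

  next(0);
}
```

The trade-off is that the dependent scripts no longer download in parallel with each other, although they still do not block the rest of the page.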


Even Faster Web Sites

April 23, 2009 12:23 am | 12 Comments

This post introduces Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.

Last April, I blogged about starting a follow-up to my first book, High Performance Web Sites. Last week I sent in the first round of final edits. Although there will likely be one or two more rounds of edits, they should be small. So, I’m feeling pretty much done. It’s a huge weight off my shoulders. I’ve been working on this book for more than a year. The performance best practices I present required more research than HPWS. I also expanded my testing from just IE and Firefox (as I did in HPWS) to IE, Firefox, Safari, Chrome, and Opera (including multiple versions of each).

The title of this new book is Even Faster Web Sites. It will be published in June, and is available for pre-order now on Amazon and O’Reilly. The cover of HPWS was a greyhound. EFWS’ cover is the Blackbuck Antelope – it can hit 50 mph which is in the top 5 for land animals. (Fastest is cheetah, but that’s taken by Programming the Perl DBI.)

The most exciting thing about EFWS is that it includes six chapters from contributing authors. This came about because I wanted to have best practices for JavaScript performance. I’m a pretty good JavaScript programmer, but not nearly as good as the JavaScript luminaries out there who are writing books and teaching workshops. I also wanted a chapter on image optimization, where Stoyan Stefanov and Nicole Sullivan are the experts. I reached out to folks in these and other areas to contribute performance best practices that they had accumulated. The resulting chapters are listed below. I’ve indicated the contributing authors where appropriate; otherwise, the chapter is written by me.

  1. Understanding Ajax Performance – Doug Crockford
  2. Creating Responsive Web Applications – Ben Galbraith and Dion Almaer
  3. Splitting the Initial Payload
  4. Loading Scripts Without Blocking
  5. Coupling Asynchronous Scripts
  6. Positioning Inline Scripts
  7. Writing Efficient JavaScript – Nicholas C. Zakas
  8. Scaling with Comet – Dylan Schiemann
  9. Going Beyond Gzipping – Tony Gentilcore
  10. Optimizing Images – Stoyan Stefanov and Nicole Sullivan
  11. Sharding Dominant Domains
  12. Flushing the Document Early
  13. Using Iframes Sparingly
  14. Simplifying CSS Selectors
  15. Performance Tools

Between now and when the book comes out, I’ll write a blog post about each of my chapters. I wrote the first one of these, Splitting the Initial Payload, back in May. Now that I have more time on my hands, I’ll catch up and finish the rest.

If you’re just beginning the process of improving your web site’s performance, you should start with High Performance Web Sites. But as Web 2.0 gains wider adoption and the amount of content on web pages continues to grow, the best practices in Even Faster Web Sites are key to making today’s web sites fast(er).


O’Reilly Master Class

March 3, 2009 12:11 pm | 2 Comments

O’Reilly, my publisher, has launched a new initiative to bring a deeper level of information and engagement around their titles and technology focus areas. I’m excited to be a part of this by leading a one day workshop on Creating Higher Performance Web Sites. The workshop (or “Master Class” as they call it) is March 30, 9am-5pm, at the Mission Bay Conference Center in SF. The cost is $600 ($550 if you register before March 15).

This is going to be an engaging and fact-filled day for developers who care about web performance. I’m going to go over the best practices from my first book, High Performance Web Sites, which are also captured in YSlow. But I’m also going to touch on the chapters from my next book including loading scripts without blocking, flushing the document early, and using iframes sparingly. I’m just wrapping up these chapters now, so these are new insights “hot off the presses”. Since it’s a workshop, O’Reilly wants a fair amount of audience involvement, so I’m working up a few exercises to give attendees experience analyzing web pages to find performance bottlenecks.

There are also workshops from Doug Crockford (JavaScript: The Good Parts), Scott Berkun (Leading and Managing Breakthrough Projects), and Jonathan Zdziarski (iPhone Forensics: Recovering Evidence, Personal Data, and Corporate Assets). These workshops come right before Web 2.0 Expo in SF, so it’s a great doubleheader. I hope to see you there.


User Agents in the morning

January 18, 2009 5:25 pm | 17 Comments

Every working day, a script runs at 7am that opens ~20 websites in my browser. I open them at 7am so that they’re ready for me when I sit down with my coffee. I’m the performance guy – I can’t stand waiting for a page to load. Among the sites that I read every day are blogs (Ajaxian, O’Reilly Radar, Google Reader for the rest), news sites (MarketWatch, CNET Tech News, InternetNews, TheStreet.com), and stuff for fun and life (Dilbert, Woot, The Big Picture, Netflix).

The last site is a page related to UA Profiler. It lists all the new user agents that have been tested in the last day. These are unique user agents – they’ve never been seen by UA Profiler before. When I first launched UA Profiler, there were about 50 each day. Now, it’s down to about 20 per day. But I’ve skipped over the main point.

Why do I review these new user agents every morning?

When I started UA Profiler, I assumed I would be able to find a library to accurately parse the HTTP User-Agent string into its components. I need this in order to categorize the test results. Was the test done with Safari or iPhone? Internet Explorer or Maxthon? NetNewsWire or OmniWeb? My search produced some candidates, but none of them had the level of accuracy I wanted. They were unable to properly classify edge case browsers, mobile devices, and new browsers (like Chrome and Android).

So, I rolled my own.

I find that it’s very accurate – more accurate than anything else I could find. Another good site out there is UserAgentString.com, but even they misclassify some well known browsers such as iPhone, Shiretoko, and Lunascape. When I do my daily checks I find that every 200-400 new user agents requires me to tweak my code. And I’ve written some good admin tools to do this check – it only takes 5 minutes to complete. And the code tweaks, when necessary, take less than 15 minutes.
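
To show why this classification is fiddly, here is a toy sketch of rule-based parsing. It is not UA Profiler's code: the rule list, the names UA_RULES and classifyUserAgent, and the result shape are all invented for illustration. The point is that many User-Agent strings contain the tokens of other browsers (Chrome and iPhone strings both mention Safari, and Maxthon strings mention MSIE), so the order of the rules matters.

```javascript
// Ordered rules: the first match wins, so specific tokens come before generic ones.
var UA_RULES = [
  { family: "Maxthon", pattern: /Maxthon/i },                      // before IE: UA also contains "MSIE"
  { family: "iPhone",  pattern: /iPhone/ },                        // before Safari: UA also contains "Safari"
  { family: "Chrome",  pattern: /Chrome\/(\d+)\.(\d+)/ },          // before Safari: UA also contains "Safari"
  { family: "Safari",  pattern: /Version\/(\d+)\.(\d+).*Safari/ },
  { family: "Firefox", pattern: /Firefox\/(\d+)\.(\d+)/ },
  { family: "IE",      pattern: /MSIE (\d+)\.(\d+)/ }
];

function classifyUserAgent(ua) {
  for (var i = 0; i < UA_RULES.length; i++) {
    var m = UA_RULES[i].pattern.exec(ua);
    if (m) {
      // Rules without capture groups leave the version fields as null.
      return { family: UA_RULES[i].family, major: m[1] || null, minor: m[2] || null };
    }
  }
  return { family: "Other", major: null, minor: null };
}
```

Every new browser that borrows another browser's tokens needs a new rule placed ahead of the generic one, which is the kind of tweak the daily review turns up.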

It’s great that this helps UA Profiler, but I’d really like to share this with the web community. The first step was adding a new Parse User-Agent page to UA Profiler. You can paste any User-Agent string and see how my code classifies it. I also show the results from UserAgentString.com for comparison. The next steps, if there’s interest and I can find the time, would be to make this available as a web service and make the code available, too. What do people think?

  • Do other people share this need for better User Agent parsing?
  • Do you know of something good that’s out there that I missed?
  • Do you see gaps or mistakes in UA Profiler’s parsing?

For now, I’ll keep classifying user agents as I finish the last drops of my (first) coffee in the morning.


UA Profiler improvements

December 19, 2008 11:28 pm | 1 Comment

UA Profiler is the tool I released 3 months ago that tracks the performance traits of various browsers. It’s a community-driven project – as more people use it, the data has more coverage and accuracy. So far, 7000 people have run 10,000 tests across 150 different browser versions (2500 unique User Agents). Over the past week (since my Stanford class ended), I’ve been adding some requested improvements.

drilldown

Previously, I had one label for a browser. For example, Firefox 3.0 and 3.1 results were all lumped under “Firefox 3”. This week I added the ability to drill down to see more detailed data. The results can be viewed in five ways:

  1. Top Browsers – The most popular browsers as well as major new versions on the horizon.
  2. Browser Families – The full list of unique browser names: Android, Avant, Camino, Chrome, etc.
  3. Major Versions – Grouped by first version number: Firefox 2, Firefox 3, IE 6, IE 7, etc.
  4. Minor Versions – Grouped by first and second version numbers: Firefox 3.0, Firefox 3.1, Chrome 0.2, Chrome 0.3, etc.
  5. All Versions – At most I save three levels of version numbers. Here you can see Firefox 3.0.1, Firefox 3.0.2, Firefox 3.0.3, etc.
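
The version-based groupings can be illustrated with a small helper that derives each label from a browser family and version string. The function name versionLabels and the returned field names are assumptions for this sketch, not UA Profiler's actual code.

```javascript
// Derive the grouping labels for one test result, e.g. ("Firefox", "3.0.1").
function versionLabels(family, version) {
  var parts = String(version).split(".").slice(0, 3); // keep at most three levels
  return {
    family: family,                                    // Browser Families
    major: family + " " + parts[0],                    // Major Versions
    minor: family + " " + parts.slice(0, 2).join("."), // Minor Versions
    all: family + " " + parts.join(".")                // All Versions
  };
}
```

For example, versionLabels("Firefox", "3.0.1") gives the labels "Firefox 3", "Firefox 3.0", and "Firefox 3.0.1" for the three version-based views.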

hiding sparse data

The result tables grew lengthy due to unusual User Agent strings with atypical version numbers. These might be the result of nightly builds or manual tweaking of the User Agent. Now, I only show browsers tested by at least two different people a total of four or more times. If you want to see all browsers, regardless of the amount of testing, check “show sparse results” at the top.
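
The threshold described above amounts to a simple filter. In this sketch the record shape (browser, userId) and the name browsersToShow are assumptions; how UA Profiler actually tells testers apart is not covered here.

```javascript
// tests: array of records like { browser: "Firefox 3.0", userId: "u1" } (assumed shape).
// Returns the browsers to list: at least two different people and four or more tests,
// unless showSparse is set.
function browsersToShow(tests, showSparse) {
  var stats = {};
  tests.forEach(function (t) {
    var s = stats[t.browser] || (stats[t.browser] = { count: 0, users: {} });
    s.count += 1;
    s.users[t.userId] = true;
  });
  return Object.keys(stats).filter(function (browser) {
    var s = stats[browser];
    return showSparse || (Object.keys(s.users).length >= 2 && s.count >= 4);
  });
}
```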

individual tests

Several people asked to see the individual test results, that is, each test that was run for a certain browser. There were several motivations: Was there much variation for test X? What were the exact User Agent strings that were matched to this browser? When were the tests done (because that problem was fixed on such-and-such a date)? When looking at a results table, clicking on the Browser name will open a new table that shows the results for each test under that browser.

sort

Once I sat down to do it, it took me ~5 minutes to make the results table sortable using Stuart Langridge’s sorttable. Now you can sort to your heart’s content. (This weekend I’ll write a post about how I made his code work when loaded asynchronously using a variation of John Resig’s Degrading Script Tags pattern.)

UA Profiler has been quite successful, gathering a consistent amount of testing each day. I especially enjoy seeing people running nightly builds against it. It’s fun to look at the individual tests and get some visibility into how a browser’s code base evolves. For example, looking at the details for Chrome 1.0, we see that Chrome 1.0.154 failed the “Parallel Scripts” test, but Chrome 1.0.155 passed. Looking at the User Agent strings we see that Chrome 1.0.154 was built using WebKit 525, whereas Chrome 1.0.155 upgraded to WebKit 528. The upgrade in WebKit version was the key to attaining this important performance trait.

I also know of at least one case of a browser regression that the development team fixed because it was flagged by UA Profiler. This is an amazing side effect of tracking these performance traits – actually helping browser teams improve the performance of their browsers, the browsers that you and I use every day. My next task is to improve the tests in UA Profiler. I’ll work on that. Your job is to keep running your favorite browser (and mobile web device!) through the UA Profiler test suite to highlight what’s being done right, and what more is needed to make our web experience fast by default.


CS193H: final exam

December 16, 2008 10:35 pm | 7 Comments

This past quarter I’ve been teaching CS193H: High Performance Web Sites at Stanford. Last week was the final exam and tonight I finished submitting the grades. (The average grade was 88.) This was a great experience. Stanford is an inspirational place to be. The students are very smart – one of the undergraduates in my class is already working with a VC up on Sandhill Road. As I’ve found before, I learn a lot when I teach. This was especially true given the questions from these students. I’ve never taught an entire quarter before. Teaching three classes per week while developing a new curriculum took a lot of time. I’m thankful to Google for giving me time to do this.

The material from the class is posted on the class web site. Slides from all of my lectures, including material from my next book, can be found there. There are also slides from my amazing guest lecturers:

I’ve posted the midterm and final exams, along with the answers. The slides and tests provide a thorough coverage of web performance. The average grade on the final was 94. Give it a whirl and let me know how you do.


CACM article: High Performance Web Sites

December 15, 2008 11:28 pm | 2 Comments

Last summer I attended the ACM Awards Banquet. (I talked about this in my blog post about how Women are Geeks (too!).) Out of that came a request for me to write an article on web performance. The article is called “High Performance Web Sites”. [gasp!] It’s a review of the rules from my first book, plus a preview of the first three rules from my next book. The article came out last week in two of the ACM’s magazines: Communications of the ACM and Queue.

Communications of the ACM (CACM) is their flagship publication. It’s been around for more than 50 years with an audience of 85,000 readers. It’s a hardcopy magazine, so the link above is a preview copy of my article. There are other versions including HTML and PDF.

Queue is ACM’s online magazine focusing on software development. The issue of Queue containing my article is about Scalable Web Services. Other articles include:

(I didn’t realize I was going to be in with such heavy company.) I’m happy with this article – it’s short, but provides a good overview of my performance best practices and has a bit of evangelism at the close. Give it a read and recommend it to any colleagues who are entering the performance arena.
