OSCON and Page Responsiveness videos

August 15, 2009 5:01 pm | 1 Comment

I had a great time at OSCON a few weeks back. It was in San Jose this year. (Pro: I don’t have to travel and my wife can go to the parties. Con: I miss Portland.) Just as I wrote about last year, Gregg Pollack was there asking speakers to summarize their talks in 30 seconds. He published the results in the video series 5 Days of OSCON. I’m in the video for Day 3.

Gregg also pointed me to his Page Responsiveness webcast/video, where he talks about YSlow and the Google Ajax Libraries API. I really like this video. It’s informative, engaging, and short. It reminds me of Aza Raskin’s webcasts on Ubiquity and Jetpack. These two guys are very talented at conveying complex information in a hands-on way. I encourage you to take a look.


OmniTI and performance koolaid

July 28, 2009 11:17 pm | Comments Off on OmniTI and performance koolaid

In YSlow! to YFast! in 45 minutes, Theo Schlossnagle (CEO of OmniTI) delivers a play-by-play about how he made his corporate web site 35% faster. The amazing revelation in his commentary is that he was able to complete all of these improvements while sitting in my workshop at Velocity (ahem).

OmniTI is a full service web house, specializing in web performance and scalability. The irony that their corporate web site received a YSlow “F” wasn’t lost on Theo: it’s the cobbler’s children syndrome. (Same is true on my web site – I’ve got to optimize WordPress!)

Theo walks through his changes one by one: adding a far future Expires header, removing ETags, compressing text responses (especially scripts and stylesheets), and moving resources to a CDN without cookies. With less than 45 minutes of work, his site went from a load time of 486 milliseconds down to 315 milliseconds.
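
As a quick check on the arithmetic, the 35% figure follows directly from the two load times quoted above. This is just a minimal sketch; the function name is mine, not something from Theo's write-up.

```python
def percent_faster(before_ms: float, after_ms: float) -> float:
    """Return the reduction in load time as a percentage of the original time."""
    return (before_ms - after_ms) / before_ms * 100


# Load times quoted in the post: 486 ms before, 315 ms after.
print(f"{percent_faster(486, 315):.1f}% faster")  # prints "35.2% faster"
```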

There’s more low hanging fruit – consolidating scripts, consolidating stylesheets, and CSS sprites. But it’s great to get this early case study on specific improvements and their corresponding impact on performance. I hope he’ll share the results from the next wave of optimizations.


Hammerhead: moving performance testing upstream

September 30, 2008 10:07 pm | 54 Comments

Today at The Ajax Experience, I released Hammerhead, a Firebug extension for measuring page load times.

Improving performance starts with metrics. How long does it take for the page to load? Seems like a simple question to answer, but gathering accurate measurements can be a challenge. In my experience, performance metrics exist at four stages along the development process.

  • real user data – I love real user metrics. JavaScript frameworks like Jiffy measure page load times from real traffic. When your site is used by a large, diverse audience, data from real page views is ground-truth.
  • bucket testing – When you’re getting ready to push a new release, if you’re lucky you can do bucket testing to gather performance metrics. You release the new code to a subset of users while maintaining another set of users on the old code (the “control”). If you sample your user population correctly and gather enough data, comparing the before and after timing information gives you a preview of the latency impact of your next release.
  • synthetic or simulated testing – In some situations, it’s not possible to gather real user data. You might not have the infrastructure to do bucket testing and real user instrumentation. Your new build isn’t ready for release, but you still want to gauge where you are with regard to performance. Or perhaps you’re measuring your competitors’ performance. In these situations, the typical solution is to do scripted testing on some machine in your computer lab, or perhaps through a service like Keynote or Gomez.
  • dev box – The first place performance testing happens (or should happen) is on the developer’s box. As she finishes her code changes, the developer can see if she made things better or worse. What was the impact of that JavaScript rewrite? What happens if I add another stylesheet, or split my images across two domains?

Performance metrics get less precise as you move from real user data to dev box testing, as shown in Figure 1. That’s because, as you move away from real user data, biases are introduced. For bucket testing, the challenge is selecting users in an unbiased way. For synthetic testing, you need to choose scenarios and test accounts that are representative of real users. Other variables of your real user audience are difficult or impossible to simulate: bandwidth speed, CPU power, OS, browser, geographic location, etc. Attempting to simulate real users in your synthetic testing is a slippery, and costly, slope. Finally, testing on the dev box usually involves one scenario on a CPU that is more powerful than the typical user, and an Internet connection that is 10-100 times faster.

Figure 1 - Precision and ability to iterate along the development process

Given this loss of precision, why would we bother with anything other than real user data? The reason is speed of development. Dev box data can be gathered within hours of a code change, whereas it can take days to gather synthetic data, weeks to do bucket testing, and a month or more to release the code and have real user data. If you wait for real user data to see the impact of your changes, it can take a year to iterate on a short list of performance improvements. To quickly identify the most important performance improvements and their optimal implementation, it’s important to improve our ability to gather performance metrics earlier in the development process: on the dev box.

As a developer, it can be painful to measure the impact of a code change on your dev box. Getting an accurate time measurement is the easy part; you can use YSlow, Fasterfox, or an alert dialog. But then you have to load the page multiple times. The most painful part is transcribing the load times into Excel. Were all the measurements done with an empty cache or a primed cache, or was that even considered?

Hammerhead makes it easier to gather performance metrics on your dev box. Figure 2 shows the results of hammering a few news web sites with Hammerhead. By virtue of being a Firebug extension, Hammerhead is available in a platform that web developers are familiar with. To set up a Hammerhead test, you add one or more URLs to the list and specify the “# of loads”. Once started, Hammerhead loads each URL the specified number of times.

Figure 2 - Hammerhead results for several news sites


The next two things aren’t rocket science, but they make a big difference. First, there are two columns of results, one for empty cache and one for primed cache. Hammerhead automatically clears the disk and memory cache, or just the memory cache, in between each page load to gather metrics for both of these scenarios. Second, Hammerhead displays the median and average time measurement. Additionally, you can export the data in CSV format.
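
To see why it helps to report both the median and the average, here is a rough sketch in Python. It is not Hammerhead's code: the sample load times are made up, and the CSV column names are my own assumption rather than Hammerhead's actual export format.

```python
import csv
import io
import statistics


def summarize(load_times_ms):
    """Return (median, average) for a list of load times in milliseconds."""
    return statistics.median(load_times_ms), statistics.mean(load_times_ms)


# Made-up sample data: the empty-cache runs include one slow outlier.
samples = {
    "empty cache": [1200, 1250, 1180, 3900, 1220],
    "primed cache": [400, 420, 390, 410, 405],
}

# Write one CSV row per scenario (hypothetical column names).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["scenario", "median_ms", "average_ms"])
for scenario, times in samples.items():
    median, average = summarize(times)
    writer.writerow([scenario, median, average])

print(out.getvalue())
```

In the empty-cache sample, the single 3900 ms outlier pulls the average up to 1750 ms while the median stays at 1220 ms, which is closer to a typical run.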

Even if you’re not hammering a site, other features make Hammerhead a useful add-on. The Cache & Time panel, shown in Figure 3, shows the current URL’s load time. It also contains buttons to clear the disk and memory cache, or just the memory cache. It has another feature that I haven’t seen anywhere else. You can choose to have Hammerhead clear these caches after every page view. This is a nice feature for me when I’m loading the same page again and again to see its performance in an empty or a primed cache state. If you forget to switch this back, it gets reset automatically next time you restart Firefox.

Figure 3 - Cache & Time panel in Hammerhead


If you don’t have Hammerhead open, you can still see the load time in the status bar. Right clicking the Hammerhead icon gives you access for clearing the cache. The ability to clear just the memory cache is another valuable feature I haven’t seen elsewhere. I feel this is the best way to simulate the primed cache scenario, where the user has been to your site recently, but not during the current browser session.

Hammerhead makes it easier to gather performance metrics early in the development process, resulting in a faster development cycle. The biggest bias is that most developers have a much faster bandwidth connection than the typical user. Easy-to-install bandwidth throttlers are a solution. Steve Lamm blogged on Google Code about how Hammerhead can be used with bandwidth throttlers on the dev box, bringing together both ease and greater precision of performance measurements. Give it a spin and let me know what you think.


Revving Filenames: don’t use querystring

August 23, 2008 10:51 am | 13 Comments

It’s important to make resources (images, scripts, stylesheets, etc.) cacheable so your pages load faster when users come back. A recent study from Website Optimization shows that the top 1000 home pages average over 50 resources per page! Being able to skip 50 HTTP requests and just read those resources from the browser’s cache dramatically speeds up pages.

This is covered in my book (High Performance Web Sites) and YSlow by Rule 3: Add an Expires Header. It’s easy to make your resources cacheable – just add an Expires HTTP response header with a date in the future. You can do this in your Apache configuration like this:

<FilesMatch "\.(gif|jpg|js|css)$">
  ExpiresActive On
  ExpiresDefault "access plus 10 years"
</FilesMatch>

That part is easy. The hard part is revving your resource filenames when you make a change. If you make mylogo.gif cacheable for 10 years and then publish a modified version of this file to your servers, users with the old version in their cache won’t get the update. The solution is to rev the name, perhaps by including the file’s timestamp or version number in the URL. But which is better: mylogo.1.2.gif or mylogo.gif?v=1.2? To gain the benefit of caching by popular proxies, avoid revving with a querystring and instead rev the filename itself. (more…)
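
Here is a minimal sketch of revving the filename itself rather than the querystring. The helper name and the choice to put the version just before the extension are my own assumptions for illustration. It only produces the new name, so your build or deploy process still has to publish the file under that name.

```python
from pathlib import PurePosixPath


def rev_filename(path: str, version: str) -> str:
    """Insert a version before the extension, e.g. mylogo.gif -> mylogo.1.2.gif."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{version}{p.suffix}"))


print(rev_filename("mylogo.gif", "1.2"))          # mylogo.1.2.gif
print(rev_filename("/images/mylogo.gif", "1.2"))  # /images/mylogo.1.2.gif
```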


OSCON: 34 hours in 37 minutes

July 29, 2008 1:14 pm | Comments Off on OSCON: 34 hours in 37 minutes

I was in Portland for OSCON last week. There were many talks that attracted my attention – so many that I couldn’t get to them all. If you missed some talks, or didn’t make it to OSCON, check out this great effort capturing Oscon in 37 minutes. Gregg Pollack asked 45 speakers to summarize their talk in 30 seconds. Most people took longer (37 * 60 / 45 = 49.3 seconds), but still, to get a taste of 45 sessions in a 37 minute video is pretty awesome. If you had attended each session it would’ve taken over 34 hours! You can jump straight to the segment for any speaker (here’s mine), and links to each speaker’s slides are displayed.


YUI’s Combo Handler CDN Service

July 17, 2008 10:33 am | 3 Comments

Eric Miraglia wrote a post yesterday called Combo Handler Service Available for Yahoo-hosted JS. One of the advantages of YUI over other JavaScript frameworks is its à la carte capabilities. Developers can choose just the parts they want, rather than being saddled with the whole kit and caboodle. It’s great to download fewer bytes, but choosing a subset of modules results in downloading multiple external scripts, something that’s bad for performance and costs YSlow points. That’s where Eric’s post comes in.

The Combo Handler Service lets developers choose a customized subset of modules and have them served as a single HTTP request from Yahoo!’s worldwide CDN for faster delivery. Each module (file) is listed in the querystring. As an example, Eric shows that loading the YUI Rich Text Editor the old way would require downloading six separate scripts:

<script src="http://yui.yahooapis.com/2.5.2/build/yahoo-dom-event/yahoo-dom-event.js">
</script>
<script src="http://yui.yahooapis.com/2.5.2/build/container/container_core-min.js">
</script>
<script src="http://yui.yahooapis.com/2.5.2/build/menu/menu-min.js">
</script>
<script src="http://yui.yahooapis.com/2.5.2/build/element/element-beta-min.js">
</script>
<script src="http://yui.yahooapis.com/2.5.2/build/button/button-min.js">
</script>
<script src="http://yui.yahooapis.com/2.5.2/build/editor/editor-beta-min.js">
</script>

This is reduced to a single HTTP request by using the Combo Handler:

<script src="http://yui.yahooapis.com/combo?2.5.2/build/yahoo-dom-event/yahoo-dom-event.js&
2.5.2/build/container/container_core-min.js&2.5.2/build/menu/menu-min.js&
2.5.2/build/element/element-beta-min.js&2.5.2/build/button/button-min.js&
2.5.2/build/editor/editor-beta-min.js">
</script>
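
If you generate your pages from a list of modules, the combined URL can be assembled with a few lines of code. This is only a sketch: the function and variable names are hypothetical, and it simply joins the module paths with "&" in the order given, exactly as in Eric's example. It does no dependency checking.

```python
COMBO_BASE = "http://yui.yahooapis.com/combo"


def combo_url(modules):
    """Join module paths, in the order given, into a single Combo Handler URL."""
    return COMBO_BASE + "?" + "&".join(modules)


# The six Rich Text Editor scripts from the example, prerequisites first.
rich_text_editor = [
    "2.5.2/build/yahoo-dom-event/yahoo-dom-event.js",
    "2.5.2/build/container/container_core-min.js",
    "2.5.2/build/menu/menu-min.js",
    "2.5.2/build/element/element-beta-min.js",
    "2.5.2/build/button/button-min.js",
    "2.5.2/build/editor/editor-beta-min.js",
]

print(combo_url(rich_text_editor))
```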

Developers using Combo Handler need to pay particular attention to getting the dependency order correct. I created the YUI Combo Handler: preserving order test page to show what happens if prerequisites are listed incorrectly. In this page, instead of putting editor-beta-min.js as the last querystring parameter, I make it the first parameter. Not surprisingly, the page produces JavaScript errors. It would be great if Combo Handler ensured that the response had all the necessary prerequisites in the right order. With this enhancement the single request would be much simpler:

<script src="http://yui.yahooapis.com/combo?2.5.2/build/editor/editor-beta-min.js">
</script>

This is a great announcement. Web developers using YUI can now get a customized rollup hosted on Yahoo!’s CDN! This is the best of all worlds – reduced download size, fewer HTTP requests, and CDN hosting. I encourage other JavaScript frameworks to adopt YUI’s à la carte flexibility. Perhaps Google Ajax Libraries could then support features to serve up customized builds similar to what Combo Handler does. Well done.


How green is your web page?

March 6, 2008 10:03 pm | 29 Comments

Writing faster web pages is great for your users, which in turn is great for you and your company. But it’s better for everyone else on the planet, too.

Intrigued by an article on Radar about co2stats.com, I looked at my web performance best practices from the perspective of power consumption and CO2 emissions. YSlow grades web pages according to how well they follow these best practices. What if it could convert those grades into kilowatt-hours and pounds of CO2?

(more…)
