HAR to Page Speed

May 1, 2010 9:48 pm | 11 Comments

Here’s the story behind this nifty tool I cranked out this weekend: HAR to Page Speed

HTTP Archive Specification

About a year ago I was on the weekly Firebug Working Group call when Jan (“Honza”) Odvarko said he was going to work on an export feature for Net Panel. I love HttpWatch and had used its export feature many times, but always wished there was an industry standard for saving HTTP waterfall chart information. In the hope of achieving this goal, I introduced Honza and Simon Perkins (creator of HttpWatch) and suggested that if they developed an open format it would likely evolve into an industry standard.

A few months later they published the HTTP Archive specification and integrated it into their products. My contribution? In addition to planting the idea with Honza and Simon, I chose the three character file extension: .HAR. Support for HAR is growing. In addition to being part of Firebug (via Honza’s NetExport add-on) and HttpWatch, it’s also in ShowSlow, DebugBar, Http Archive Rule Runner, and a few other tools and sites out there. (I hear it’s coming to Fiddler soon.)

The importance of an industry standard HTTP archive format is huge. Adoption of HAR allows companies and data gathering institutions (such as the Internet Archive) to record the web page experience and pull it up later for further review. It provides a way to exchange information across tools. And it provides an open standard for sharing web loading information between individuals – developer to developer as well as customer to customer support.
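To give a flavor of the format: a HAR file is just JSON, which is what makes it so easy to archive and exchange. Here’s a minimal sketch of its shape in Python (the field names come from the HAR spec; the example URL, sizes, and timings are made up, and real entries carry much more request/response detail):

```python
import json

# A minimal HAR document: a "log" object holding a version, the tool
# that created the capture, and a list of request/response entries.
har = {
    "log": {
        "version": "1.2",
        "creator": {"name": "ExampleExporter", "version": "0.1"},
        "entries": [
            {
                "startedDateTime": "2010-05-01T12:00:00.000-08:00",
                "time": 120,  # total elapsed time, in milliseconds
                "request": {"method": "GET", "url": "http://example.com/"},
                "response": {"status": 200, "content": {"size": 5120}},
                "timings": {"dns": 10, "connect": 20, "wait": 60, "receive": 30},
            }
        ],
    }
}

# Because it's plain JSON, any tool can save a capture now and
# re-load it later for analysis.
text = json.dumps(har)
entries = json.loads(text)["log"]["entries"]
for e in entries:
    print(e["request"]["url"], e["time"], "ms")
```

That round trip — serialize in one tool, parse in another — is the whole point of a common format.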

Page Speed SDK

In their last few releases the Page Speed team has mentioned porting their performance analysis logic from JavaScript to C++. The resulting library is called “native library” – not too jazzy. But last week they released the Page Speed SDK. The documentation is slim, but I noticed a commandline tool called har_to_pagespeed.

Hmmm, that sounds interesting.

I downloaded the SDK. It built fine on my Dreamhost shared server. Then I wrapped it with a file upload PHP page and created HAR to Page Speed.
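Conceptually the wrapper is tiny: accept an uploaded HAR file, hand it to the SDK’s commandline tool, and return the output. A rough Python sketch of that idea — the binary name comes from the SDK, but the single-argument invocation is my assumption, so check the SDK docs for the actual interface:

```python
import shutil
import subprocess

def analyze_har(har_path, binary="har_to_pagespeed"):
    """Run the Page Speed SDK's commandline tool on a saved HAR file.

    Assumption: the tool takes the HAR path as its only argument and
    writes its analysis to stdout; adjust to match the real interface.
    """
    cmd = [binary, har_path]
    if shutil.which(binary) is None:
        # SDK not installed here; return the command we would have run.
        return " ".join(cmd)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout

# Usage: print(analyze_har("example.har"))
```

Everything else — the upload form, the HTML around the results — is ordinary web plumbing.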

You start by uploading a HAR file. If you don’t have any or simply want a quick test drive, you can use one of the examples. But it’s easy to create your own HAR files using Firebug and NetExport. The latter adds the “Export” item to Firebug’s Net Panel.

Now comes the fun part. After uploading a HAR file you get the output from Page Speed. (Note that this is a subset of rules. Some rules still need to be ported.)

I also threw in a rendering of the waterfall chart based on Honza’s HarViewer:
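The waterfall itself boils down to simple arithmetic over the HAR entries: take the earliest startedDateTime as the chart origin, then offset each bar by its start time and stretch it by its elapsed time. A rough sketch (the entry data below is invented; a real chart reads it from log.entries):

```python
from datetime import datetime

# Invented (startedDateTime, elapsed-ms) pairs standing in for HAR entries.
entries = [
    ("2010-05-01T12:00:00.000", 120),  # the HTML document
    ("2010-05-01T12:00:00.150", 80),   # a stylesheet
    ("2010-05-01T12:00:00.200", 60),   # an image
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")

t0 = min(parse(ts) for ts, _ in entries)  # chart origin: earliest start

# Each bar: (offset from t0 in ms, duration in ms).
bars = [
    (round((parse(ts) - t0).total_seconds() * 1000), dur)
    for ts, dur in entries
]
for start, dur in bars:
    print(f"{start:4d} ms  {'=' * (dur // 20)}  ({dur} ms)")
```

HarViewer does this properly — sub-request timing segments, colors, and all — but the core layout is just these offsets.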

Compellingness

My HAR to Page Speed page is handy. If you’re generating HAR files in something other than Firefox, you now have a way to get a Page Speed analysis. If you’ve got an archive of HAR files, you can analyze them with Page Speed at any time in the future.

But the big excitement I get from this page is to see these pieces coming together, especially in the area of performance analysis. Another industry initiative I’ve been advocating is a common performance analysis standard. Right now we have multiple performance analysis tools: Page Speed, YSlow, AOL Pagetest, MSFast, VRTA, and neXpert to name a few. There’s some commonality across these tools, but the differences are what’s noticeable. Web developers really need to run multiple tools if they want their web site to be evaluated against the most important performance best practices.

With the adoption of HAR and the Page Speed SDK, we’re moving toward having a record of the page load experience that can be saved and shared, and toward performance analysis that is consistent regardless of what browser and development environment you work in. We’re not quite there. We need more tools to adopt HAR import/export. And we need more rules to be added to the Page Speed SDK. But I can see the handwriting on the wall – and it’s spelling F-A-S-T.

I’ll be talking about these and other movements in the performance industry this Wednesday at Web 2.0 Expo SF.

11 Responses to HAR to Page Speed

  1. Thanks for sharing this tool. From an analysis perspective it is certainly advantageous to be able to compare results across browsers against one set of performance criteria.

    That said, I’m curious about the part of this post where you advocate for standardization in performance analysis itself. The tools cited here (Page Speed, YSlow, etc.) offer visibility into raw data, but they also synthesize that information into subjective “grades.” I find this layer of abstraction a little superfluous. It can provide a shortcut to performing real analysis.

    If you couple that with the idea that these systems could develop into standards, don’t you risk advocating for less transparency in the field of performance analysis itself? Wouldn’t it be more flexible and transparent to allow for a diverse set of tools used to collect data and encourage analysis to always speak to raw, quantifiable metrics? I would hate to see performance analysis reduced to Olympic style comparison of “scores” regardless of which tool generates them. Thoughts?

  2. @Jennifer: Awesome, thoughtful comment. We want both: As an industry, we need a common yardstick so we get apples-to-apples comparison. But we also want innovation. My thought is there would be a common formula for a canonical performance “score”, but tools could go beyond that to dig into other areas. To your other point about raw data – HAR facilitates this by decoupling the gathering of data from the analysis. With HAR, analysis doesn’t have to occur live in the browser – it can be done later (even retroactively) using the HAR file.

  3. Thanks for this article. First time I’ve seen some of those plugins mentioned. Added a few more to my testing suite, thank you

  4. Are there any commandline tools (Linux compatible preferably) to generate HAR files? It would be nice to be able to run a cronjob and check a site every day and generate charts from that, similar to Pingdom and other services.

  5. @Jacob: For realistic results, you need to run a headless browser to generate a HAR file. This is possible, but goes beyond a simple commandline tool. Sergey most recently talked about how he does this with xvfb: http://www.sergeychernyshev.com/blog/automating-yslow-and-pagespeed-using-xvfb/

  6. FYI, I just pushed the code to support exporting HAR files from WebPagetest as well. It’s available as a link in the top-right corner of the summary results as well as on the waterfall page.

    Sadly WebPagetest doesn’t keep the full responses around (just the headers) so PageSpeed can’t do very much with it.

    Since I’ll be working on pagetest anyway to integrate with the PageSpeed libraries, I’ll see if I can add HAR export capabilities there as well, which would allow full responses to be included.

    Where things could get really interesting is when I can get time to add HAR import capabilities to WebPagetest.

  7. Excellent remark by Jennifer Showe. And thanks for that link to Sergey Chernyshev’s blog post, Steve.

    I will definitely try to use this in my master thesis!

  8. Very cool – I wonder if anyone has implemented hardiff yet?
    It would be useful even if the output were simply text/json, but I can dream about a waterfall chart that compares two runs – maybe like the ghost driver some racing games show overlaid on your current lap.

  9. @Jennifer when I’m looking at scores, I see a way of quantifying best practices and providing tools that are independent of the location or CPU power of the computer doing the measurements – this is very useful and actually not that easy to achieve. That being said, I believe that understanding “precise” measurements is also important, and it’d be great if there was some data about acceptable DNS speeds or time to draw/load events and so on.

    @Steve it’s just amazing how the SDK could be wrapped up into a useful tool this quickly. It means the tools are here – great job, industry!

    @Pat that was quick too! ;)

  10. Very happy to see more applications adopting the HAR format. One thing I would like to see is the ability to capture this information via tcpdump or Wireshark.

  11. My thoughts ran on a bit, so I put them here:
    http://www.mnot.net/blog/2010/05/05/har

    Cheers,