Resource Timing practical tips

August 21, 2014 12:45 am | 9 Comments

The W3C Web Performance Working Group brought us Navigation Timing in 2012 and it’s now available in nearly every major browser. Navigation Timing defines a JavaScript API for measuring the performance of the main page. For example:

// Navigation Timing
var t = performance.timing,
    pageloadtime = t.loadEventStart - t.navigationStart,
    dns = t.domainLookupEnd - t.domainLookupStart,
    tcp = t.connectEnd - t.connectStart,
    ttfb = t.responseStart - t.navigationStart;

Having timing metrics for the main page is great, but to diagnose real world performance issues it’s often necessary to drill down into individual resources. Thus, we have the more recent Resource Timing spec. This JavaScript API provides similar timing information as Navigation Timing but it’s provided for each individual resource. An example is:

// Resource Timing
var r0 = performance.getEntriesByType("resource")[0],
    loadtime = r0.duration,
    dns = r0.domainLookupEnd - r0.domainLookupStart,
    tcp = r0.connectEnd - r0.connectStart,
    ttfb = r0.responseStart - r0.startTime;

As of today, Resource Timing is supported in Chrome, Chrome for Android, Opera, IE10, and IE11. This is likely more than 50% of your traffic so should provide enough data to uncover the slow performing resources in your website.

Using Resource Timing seems straightforward, but when I wrote my first production-quality Resource Timing code I ran into several issues that I want to share. Here are my practical tips for tracking Resource Timing metrics in the real world.

1. Use getEntriesByType(“resource”) instead of getEntries().

You begin using Resource Timing by getting the set of resource timing performance objects for the current page. Many Resource Timing examples use performance.getEntries() to do this, which implies that only resource timing objects are returned by this call. But getEntries() can potentially return four types of timing objects: “resource”, “navigation”, “mark”, and “measure”.

This hasn’t caused problem for developers so far because “resource” is the only entry type in most pages. The “navigation” entry type is part of Navigation Timing 2, which isn’t implemented in any browser AFAIK. The “mark” and “measure” entry types are from the User Timing specification which is available in some browsers but not widely used.

In other words, getEntriesByType("resource") and getEntries() will likely return the same results today. But it’s possible that getEntries()  will soon return a mix of performance object types. It’s best to use performance.getEntriesByType("resource") so you can be sure to only retrieve resource timing objects. (Thanks to Andy Davies for explaining this to me.)

2. Use Nav Timing to measure the main page’s request.

When fetching a web page there is typically a request for the main HTML document. However, that resource is not returned by performance.getEntriesByType("resource"). To get timing information about the main HTML document you need to use the Navigation Timing object (performance.timing).

Although unlikely, this could cause bugs when there are no entries. For example, my earlier Resource Timing example used this code:

performance.getEntriesByType("resource")[0]

If the only resource for a page is the main HTML document, then getEntriesByType("resource") returns an empty array and referencing element [0] results in a JavaScript error. If you don’t have a page with zero subresources then you can test this on http://fast.stevesouders.com/.

3. Beware of issues with secureConnectionStart.

The secureConnectionStart property allows us to measure how long it takes for SSL negotiation. This is important – I often see SSL negotiation times of 500ms or more. There are three possible values for secureConnectionStart:

  • If this attribute is not available then it must be set to undefined.
  • If HTTPS is not used then it must be set to zero.
  • If this attribute is available and HTTPS is used then it must be set to a timestamp.

There are three things to know about secureConnectionStart. First, in Internet Explorer the value of secureConnectionStart is always “undefined” because it’s not available (the value is buried down inside WinINet).

Second, there’s a bug in Chrome that causes secureConnectionStart to be incorrectly set to zero. If a resource is fetched using a pre-existing HTTPS connection then secureConnectionStart is set to zero when it should really be a timestamp. (See bug 404501 for the full details.) To avoid skewed data make sure to check that secureConnectionStart is neither undefined nor zero before measuring the time for SSL negotiation:

var r0 = performance.getEntriesByType("resource")[0];
if ( r0.secureConnectionStart ) {
    var ssl = r0.connectEnd - r0.secureConnectionStart;
}

Third, the spec is misleading with regard to this line: “…if the scheme of the current page is HTTPS, this attribute must return the time immediately before the user agent starts the handshake process…” (emphasis mine). It’s possible for the current page to be HTTP and still contain HTTPS resources for which we want to measure the SSL negotiation. The spec should be changed to “…if the scheme of the resource is HTTPS, this attribute must return the time immediately before the user agent starts the handshake process…”. Fortunately, browsers behave according to the corrected language, in other words, secureConnectionStart is available for HTTPS resources even on HTTP pages.

4. Add “Timing-Allow-Origin” to cross-domain resources.

For privacy reasons there are cross-domain restrictions on getting Resource Timing details. By default, a resource from a domain that differs from the main page’s domain has these properties set to zero:

  • redirectStart
  • redirectEnd
  • domainLookupStart
  • domainLookupEnd
  • connectStart
  • connectEnd
  • secureConnectionStart
  • requestStart
  • responseStart

There are some cases when it’s desirable to measure the performance of cross-domain resources, for example, when a website uses a different domain for its CDN (such as “youtube.com” using “s.ytimg.com”) and for certain 3rd party content (such as “ajax.googleapis.com”). Access to cross-domain timing details is granted if the resource returns the Timing-Allow-Origin response header. This header specifies the list of (main page) origins that are allowed to see the timing details, although in most cases a wildcard (“*”) is used to grant access to all origins. As an example, the Timing-Allow-Origin response header for http://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js is:

Timing-Allow-Origin: *

It’s great when 3rd parties add this response header; it allows website owners to measure and understand the performance of that 3rd party content on their pages. As reported by (and thanks to) Ilya Grigorik, several 3rd parties have added this response header. Here are some example resources that specify “Timing-Allow-Origin: *”:

When measuring Resource Timing it’s important to determine if you have access to the restricted timing properties. You can do that by testing whether any of the restricted properties (listed above) (except for secureConnectionStart) is zero. I always use requestStart. Here’s a better version of the earlier code snippet that checks that the restricted properties are available before calculating the more detailed performance metrics:

// Resource Timing
var r0 = performance.getEntriesByType("resource")[0],
    loadtime = r0.duration;
if ( r0.requestStart ) {
    var dns = r0.domainLookupEnd - r0.domainLookupStart,
        tcp = r0.connectEnd - r0.connectStart,
        ttfb = r0.responseStart - r0.startTime;
}
if ( r0.secureConnectionStart ) {
    var ssl = r0.connectEnd - r0.secureConnectionStart;
}

It’s really important that you do these checks. Otherwise, if it’s assumed you have access to these restricted properties you won’t actually get any errors; you’ll just get bogus data. Since the values are set to zero when access is restricted, things like “domainLookupEnd – domainLookupStart” translate to “0 – 0” which returns a plausible result (“0”) but is not necessarily the true value of the (in this case) DNS lookup. This results in your metrics having an overabundance of “0” values which incorrectly skews the aggregate stats to look rosier than they actually are.

5. Understand what zero means.

As mentioned in #4, some Resource Timing properties are set to zero for cross-domain resources that have restricted access. And again, it’s important to check for that state before looking at the detailed properties. But even when the restricted properties are accessible it’s possible that the metrics you calculate will result in a value of zero, and it’s important to understand what that means.

For example, (assuming there are no access restrictions) the values for domainLookupStart and domainLookupEnd are timestamps. The difference of the two values is the time spent doing a DNS resolution for this resource. Typically, there will only be one resource in the page that has a non-zero value for the DNS resolution of a given hostname. That’s because the DNS resolution is cached by the browser; all subsequent requests use that cached DNS resolution. And since DNS resolutions are cached across web pages, it’s possible to have all DNS resolution calculations be zero for an entire page. Bottomline: a zero value for DNS resolution means it was read from cache.

Similarly, the time to establish a TCP connection (“connectEnd – connectStart”) for a given hostname will be zero if it’s a pre-existing TCP connection that is being re-used. Each hostname should have ~6 unique TCP connections that should show up as 6 non-zero TCP connection time measurements, but all subsequent requests on that hostname will use a pre-existing connection and thus have a TCP connection time that is zero. Bottomline: a zero value for TCP connection means a pre-existing TCP connection was re-used.

The same applies to calculating the time for SSL negotiation (“connectEnd – secureConnectionStart”). This might be non-zero for up to 6 resources, but all subsequent resources from the same hostname will likely have a SSL negotiation time of zero because they use a pre-existing HTTPS connection.

Finally, if the duration property is zero this likely means the resource was read from cache.

6. Determine if 304s are being measured.

There’s another bug in Chrome stable (version 36) that has been fixed in version 37. This issue is going away, but since most users are on Chrome stable your current metrics are likely different than they are in reality. Here’s the bug: a cross-origin resource that does have Timing-Allow-Origin on a 200 response will not have that taken into consideration on a 304 response. Thus, 304 responses will have zeroes for all the restricted properties as shown by this test page.

This shouldn’t happen because the Timing-Allow-Origin header from the cached 200 response should be applied (by the browser) to the 304 response. This is what happens in Internet Explorer. (Try the test page in IE 10 or 11 to confirm.) (Thanks to Eric Lawrence for pointing this out.)

This impacts your Chrome Resource Timing results as follows:

  • If you’re checking that the restricted fields are zero (as described in #4), then you’ll skip measuring these 304 responses. That means you’re only measuring 200 responses. But 200 responses are slower than 304 responses, so your Resource Timing aggregate measurements are going to be larger than they are in reality.
  • If you’re not checking that the restricted fields are zero, then you’ll record many zero values which are faster than the 304 response was in reality, and your Resource Timing stats will be too optimistic.

There’s no easy way to avoid these biases, so it’s good news that the bug has been fixed. One thing you might try is to send a Timing-Allow-Origin on your 304 responses. Unfortunately, the popular Apache web server is unable to send this header in 304 responses (see bug 51223). Further evidence of the lack of Timing-Allow-Origin in 304 responses can be found by looking at the 3rd party resources listed in #4. As stated, it’s great that these 3rd parties return that header on 200 responses, but 4 out of 5 of them do not return Timing-Allow-Origin on 304 responses. Until Chrome 37 becomes stable it’s likely that Resource Timing metrics are skewed either too high or too low because of the missing details for the restricted properties. Fortunately, the value for duration is accurate regardless.

7. Look at Boomerang.

If you’re thinking of writing your own Resource Timing code you should first look at the Resource Timing plugin for Boomerang. (The code is on GitHub.) Boomerang is the popular open source RUM package maintained by Philip Tellis. He originally open sourced it when he was at Yahoo! but is now providing ongoing maintenance and enhancements as part of his work at SOASTA for the commercial version (mPulse). The code is clear, pithy, and robust and addresses many of the issues mentioned above.

In conclusion, Navigation Timing and Resource Timing are outstanding new specifications that give website owners much more visibility into the performance of their web pages. Resource Timing is the newer of the two specs, and as such there are still some wrinkles to be ironed out. These tips will help you get the most from your Resource Timing metrics. I encourage you to start tracking these metrics today to understand how your website is performing for the people who matter the most: real users.

Update: Here are some other 3rd party resources that include the Timing-Allow-Origin response header thus allowing website owners to measure their performance of this 3rd party content:

Update: Note that getEntriesByType(“resource”) and getEntries() do not include resources inside iframes. If the iframe is from the same origin, then the parent window can access these by using the iframe’s contentWindow.performance object.

9 Responses to Resource Timing practical tips

  1. Great tips! I just want to add that any font assets served by Typekit also have the “Timing-Allow-Origin: *” header set.

  2. Great post! This is a terrific start to a much needed discussion around resource timing. I hope we can all chat about this at WebPerfDays NY!

  3. @Bram: It’s great that Typekit includes this header! I added an update mentioning additional 3rd parties who specify Timing-Allow-Origin.

  4. As far as I know the other types that could be used in getEntriesByType are ‘mark’, ‘measure’, and ‘navigation’

    ‘mark’ and ‘measure’ are from User Timing (http://www.w3.org/TR/user-timing/) and ‘navigation’ is from the Navigation Timing 2 spec (https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming2/Overview.html) – though I don’t know of any browsers that implement the draft spec yet.

    If you want to see ‘mark’ and ‘measure’ in action you can fire up Dev Tools on one of WebPagetest’s waterfall pages and type window.performance.getEntriesByType(‘mark’); in the console.

    window.performance.measure(“elapsed”, “aft.Site Header”, “aft.Waterfall”); can be used to generate a measure and then window.performance.getEntriesByType(‘measure’); to retrieve the list of measures.

    I agree with the recommendation to use performance.getEntriesByType(“resource”) to get the resource timing entries as it makes the code more resilient in the future, other option would be to check the array elements to ensure they have an entryType of ‘resource’ but that needs to be done everywhere!

  5. Great that Boomerang got a Ressource Timing plugin. Boomerang does HTTP/GET only, so how do we did around the 2K character limit in browsers? POSTing it feels wrong.

    I haven’t cross referenced support for Ressource Timing with max URL length, it could be that this is simply not a problem?

  6. Brian: Some beaconing packages track the querystring length and break it across multiple beacons when necessary. Doing a POST is also possible. I agree – sending back Resource Timing data for every resource in the page is likely to exceed URL length. FYI – Eric Lawrence just blogged about this in URL Length Limits.

  7. Steve: Any beaconing packages you can recommend? I don’t think Boomerang splits up into multiple beacons.

    An idea for a more robust data could be to capture aggregated data, like the ones that HTTPArchive tracks. response sizes is not possible, but you could easily aggregate: # JS/CSS/GIF/PNG/JPG/HTML/Fonts/Flash/Other requests, # HTTPS requests, # iframes, and probaly a few more.
    The more I think about it, the more I like the idea. Correlations makes it all worth it, even though SPDY may change the picture quite a lot.

  8. Firefox supports the Resource Timing API since version 31 behind the `dom.enable_resource_timing` preference flag which is `false` by default.

  9. Inspired by this article I’ve setup a bookmarklet to read visuallize the Resource and Navigation API as well as User Timing: https://github.com/nurun/performance-bookmarklet