Request Timeout

November 14, 2014 3:15 am | 2 Comments

With the increase in 3rd party content on websites, I’ve evangelized heavily about how Frontend SPOF (single point of failure) blocks the page from rendering. This is timely given the recent Doubleclick outage. Although I’ve been warning about Frontend SPOF for years, I’ve never measured how long a hung response blocks rendering. I used to think this depended on the browser, but Pat Meenan recently mentioned he thought it depended more on the operating system. So I decided to test it.

My test page contains a request for a script that will never return, served by Pat’s blackhole server. Eventually the request times out and the page finishes loading, so the time until window.onload fires captures the timeout value. I tweeted asking people to run the test and collected the results in a Browserscope user test.
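
A similar measurement can be approximated outside a browser. The sketch below is a hypothetical Python helper, not the code behind the test page: it times how long a TCP connection attempt takes to fail when no explicit timeout is set, so the operating system's own limit applies. The name `seconds_until_connect_fails` and the injectable `connect` parameter are assumptions made for illustration.

```python
import socket
import time


def seconds_until_connect_fails(host, port=80, connect=socket.create_connection):
    """Return how long a connection attempt took to fail, or None if it succeeded.

    No timeout is passed to `connect`, so with the default
    socket.create_connection the call blocks until the OS gives up.
    """
    start = time.monotonic()
    try:
        conn = connect((host, port))
    except OSError:
        # Timeouts, refusals, and DNS failures are all OSError subclasses.
        return time.monotonic() - start
    conn.close()
    return None


# Example call (blocks until the OS timeout fires):
#   seconds_until_connect_fails("blackhole.webpagetest.org")
```

This only covers the case where the connection is never established. It says nothing about a server that accepts the connection and then hangs, and it assumes the blackhole host still drops connection attempts.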

The aggregated results show the median timeout value (in seconds) for each type of browser. Unfortunately, that view doesn’t break the results down by operating system, so I exported the raw results and did some UA parsing to extract an approximation for OS. The final outcome can be found in this Google Spreadsheet of Blackhole Request Timeout values.
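
The UA parsing and grouping step might look roughly like the following. This is a minimal sketch under assumptions, not the script actually used: the function names are hypothetical, the substring rules are a crude approximation, and the input is assumed to be (user agent, timeout in seconds) pairs.

```python
from collections import defaultdict
from statistics import median


def approximate_os(user_agent):
    """Guess the OS from a User-Agent string using simple substring checks.

    Order matters: Android UAs also contain "Linux", and iOS UAs contain
    "like Mac OS X", so the more specific tokens are tested first.
    """
    ua = user_agent.lower()
    if "android" in ua:
        return "Android"
    if "iphone" in ua or "ipad" in ua or "ipod" in ua:
        return "iOS"
    if "mac os x" in ua or "macintosh" in ua:
        return "Mac OS"
    if "windows" in ua:
        return "Windows"
    return "Other"


def median_timeout_by_os(rows):
    """Group (user_agent, seconds) pairs by approximate OS and take medians."""
    grouped = defaultdict(list)
    for user_agent, seconds in rows:
        grouped[approximate_os(user_agent)].append(seconds)
    return {name: median(values) for name, values in grouped.items()}
```

Using the median rather than the mean keeps a handful of extreme outliers from dominating each OS bucket.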

Sorting the results by OS, we see that Pat was generally right. Here are the median timeout values by OS:

  • Android: ~60 seconds
  • iOS: ~75 seconds
  • Mac OS: ~75 seconds
  • Windows: ~20 seconds

The timeout values above are independent of browser. For example, on Mac OS the timeout value is ~75 seconds for Chrome, Firefox, Opera, and Safari.

However, there are many outliers. Ilya Grigorik points out that many variables affect when the request times out: in addition to browser and OS, server and proxy settings may factor into the results. I also tested with my mobile devices and got different results when switching between carrier network and wifi.

The results of this test show that there are more questions to be answered. It would take someone like Ilya, with extensive knowledge of browser networking, to nail down all the factors involved. As a general guideline, Frontend SPOF from a hung response lasts 20 to 75 seconds depending on browser and OS.

2 Responses to Request Timeout

  1. It’s worth highlighting that you’ll get different behaviors based on the state of the TCP connection. AFAIK, Pat’s blackhole server doesn’t even respond to the SYN packet… and on OSX there is, indeed, a 75s timeout for that. However, if handshake completes, that’s a whole different story. Quick example on OSX:

    $> sudo sysctl -w net.inet.tcp.keepinit=90000
    net.inet.tcp.keepinit: 75000 -> 90000
    $> date && time curl -vv blackhole.webpagetest.org
    real 1m30.535s (aka, 90s)

    However, once the TCP handshake has completed (say we’re stuck waiting on the HTTP server), the timeout is controlled by net.inet.tcp.keepidle:

    $> sysctl net.inet.tcp.keepidle
    net.inet.tcp.keepidle: 7200000

    Needless to say, 7200000 is a large number (2 hours, since the value is in milliseconds). So, OS timeouts are not the limiting factor in this case. And, I think this case is fairly common: edge router / proxy terminates TLS and then hangs waiting for a response from a dead server.

  2. Hi, Ilya. I agree that it’s fairly common to have intermediaries that can make it more complicated, but I think the most common case is connecting directly to the main (hung) server. We can look at Mehdi’s post about yesterday’s DFP outage as an example. They (Catchpoint) do their testing from Windows. Consequently, his outage charts show a ~20 second increase.

    There are a lot of factors but it’s good to have an idea of the impact for the general case.

    It’d be good to get more real user metrics on these outages. Patrick Hamann tweeted some good stats. I’m hoping he can slice it by OS to shed more light.
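
For reference, the sysctl values quoted in the first comment appear to be in milliseconds: the 75000 default for net.inet.tcp.keepinit lines up with the ~75 second timeout measured on Mac OS. Assuming that unit, a quick conversion (the helper name is hypothetical) puts the two values side by side:

```python
from datetime import timedelta


def sysctl_ms_to_duration(milliseconds):
    """Convert a millisecond sysctl value to a timedelta (unit is an assumption)."""
    return timedelta(milliseconds=milliseconds)


keepinit = sysctl_ms_to_duration(75000)     # net.inet.tcp.keepinit default
keepidle = sysctl_ms_to_duration(7200000)   # net.inet.tcp.keepidle default

print(keepinit.total_seconds())         # 75.0 (seconds)
print(keepidle.total_seconds() / 3600)  # 2.0 (hours)
```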