Making a mobile connection
I just returned from Breaking Development Conference, an amazing gathering of many of the brightest minds in mobile web development. On the flight home I watched the video ($$) and slides from Rajiv Vijayakumar’s talk on Understanding Mobile Web Browser Performance at Velocity 2011. Rajiv works at Qualcomm where his team has done extensive performance analysis of the Android browser. Some of their findings include:
- Android 2.2 has a max of only 4 HTTP connections which limits parallel downloads. (This was increased to 8 in Android 2.3 and 35 in Android 3.1 according to Browserscope.)
- It supports pipelining for reduced HTTP overhead.
- Android’s cache eviction is based on expiration date. This is a motivation for setting expiration dates 10+ years in the future.
- Android closes TCP sockets after 6 seconds of inactivity.
This last bullet leads to an interesting discussion about the tradeoffs between power consumption and web performance.
Radio link power consumption
3G devices surfing the Web (do people still say “surfing”?) establish a radio link to the carrier’s cell tower. Establishing and maintaining the radio link consumes battery power. The following graph from Rajiv’s slides shows power consumption for an Android phone while loading a web page. It rises from a baseline of 200 mA to ~400 mA as the radio link is initialized. After the page is loaded the phone drops to 300 mA while the network is inactive. After 10 seconds of inactivity, the radio link reaches an idle state and power consumption returns to the 200 mA baseline level.
The takeaway from this graph is that closing the radio link sooner consumes less battery power. The radio link keeps drawing extra power until 10 seconds of inactivity have passed, and that 10 second timer begins once the web page has loaded. But there’s also a 6 second countdown after which Android closes the TCP connection by sending a FIN packet. Sending the FIN packet resets the radio link timer, so the link continues to consume battery power for another 10 seconds, resulting in a total of 16 seconds of higher battery consumption.
One of the optimizations Rajiv’s team made for the Android browser running on Qualcomm chipsets is to close the TCP connections after the page is done loading. By sending the FIN packet immediately, the radio link is closed after 10 seconds (instead of 16 seconds) resulting in longer battery life. Yay for battery life! But how does this affect the speed of web pages?
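To put rough numbers on the 16 second versus 10 second tail, here is a small back-of-the-envelope sketch. The current values come from the graph described above; treating the draw as a flat 300 mA for the whole tail, and ignoring the cost of the FIN packet itself, are simplifying assumptions of mine, and the function names are made up for illustration.

```javascript
// Values read from the graph described in the text.
const BASELINE_MA = 200;     // idle baseline
const TAIL_MA = 300;         // radio link still up, network inactive (assumed flat)
const RADIO_TIMER_SEC = 10;  // inactivity needed before the radio link goes idle

// finDelaySec: seconds after page load at which the FIN packet is sent.
// The FIN restarts the radio timer, so the link stays up for
// finDelaySec + RADIO_TIMER_SEC seconds after the page has loaded.
function radioTailSeconds(finDelaySec) {
  return finDelaySec + RADIO_TIMER_SEC;
}

// Extra charge drawn above the baseline during the tail, in milliamp-seconds.
function extraChargeMaSec(finDelaySec) {
  return (TAIL_MA - BASELINE_MA) * radioTailSeconds(finDelaySec);
}

console.log(radioTailSeconds(6), extraChargeMaSec(6)); // 16 1600
console.log(radioTailSeconds(0), extraChargeMaSec(0)); // 10 1000
```

Under these assumptions, sending the FIN immediately trims the extra charge per page load from 1600 to 1000 mA·s, a bit more than a third.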
Radio link promotion & demotion
The problem with aggressively closing the phone’s radio link is that it takes 1-2 seconds to reconnect to the cell tower. The way the radio link ramps up and then drops back down is shown in the following figure from an AT&T Labs Research paper. When a web page is actively loading, the radio link is at max power consumption and bandwidth. After the radio link is idle for 5 seconds, it drops to a state of half power consumption and significantly lower bandwidth. After another 12 seconds of inactivity it drops to the idle state. From the idle state it takes ~2 seconds to reach full power and bandwidth.
These inactivity timer values (5 seconds & 12 seconds in this example) are sent to the device by the cell tower and thus vary from carrier to carrier. The “state machine” for promoting and demoting the radio link, however, is defined by the Radio Resource Control protocol with the timer values left to the carrier to determine. (The protocol dubs these timer values “T1”, “T2”, and “T3”. I just find that funny.) If the radio link is idle when you request a web page, you have to wait ~2 seconds before that HTTP request can be sent. Clearly, the inactivity timer values chosen by the carrier can have a dramatic impact on mobile web performance.
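The promotion and demotion behavior can be sketched as a tiny lookup. This is only an illustration using the example timers from the AT&T Labs figure (5 seconds, then another 12 seconds, with ~2 seconds to come back from idle). The state names and function names are my own, not the protocol’s, and treating the intermediate state as having no promotion delay is an assumption, since no figure for it is given here.

```javascript
// Example timer values from the figure; real values are chosen by the carrier.
const EXAMPLE_TIMERS = {
  toLowPowerSec: 5,        // idle time before dropping to the half-power state
  toIdleSec: 12,           // further idle time before dropping to idle
  promotionFromIdleSec: 2, // approximate time to get from idle back to full power
};

// Returns a descriptive (hypothetical) state name for a link that has seen
// no traffic for idleSec seconds.
function radioState(idleSec, timers = EXAMPLE_TIMERS) {
  if (idleSec < timers.toLowPowerSec) return 'full';
  if (idleSec < timers.toLowPowerSec + timers.toIdleSec) return 'low';
  return 'idle';
}

// Delay before an HTTP request can be sent. Only the idle state is modeled;
// the delay from the 'low' state is assumed to be 0 for simplicity.
function requestDelaySec(idleSec, timers = EXAMPLE_TIMERS) {
  return radioState(idleSec, timers) === 'idle' ? timers.promotionFromIdleSec : 0;
}

console.log(radioState(3), radioState(10), radioState(20)); // full low idle
console.log(requestDelaySec(20)); // 2
```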
What’s your carrier’s state machine?
There’s an obvious balance, sort of a yin and yang, between power consumption and web performance for 3G mobile devices. If a carrier’s inactivity timer values are set too short, users have better battery life but are more likely to encounter a ~2 second delay when requesting a web page. If the carrier’s inactivity timer values are set too long, users might have a faster web experience but shorter battery life.
This made me wonder what inactivity timer values popular carriers used. To measure this I created the Mobile State Machine Test Page. It loads a 1 kB image repeatedly with increasing intervals between requests: 2, 4, 6, 11, 13, 16, and 20 seconds. The image’s onload event is used to measure the load time of the image. For each interval the image is requested three times, and the median load time is the one chosen. The flow is as follows:
1. choose the next interval i (e.g., “2” seconds)
2. wait i seconds
3. measure t_start
4. request the image
5. measure t_end using the image’s onload
6. record t_end - t_start as the image load time
7. repeat steps 2-6 two more times and choose the median as the image load time for interval i
8. goto step 1 until all intervals have been tested
The image should take about the same time to load on every request for a given phone and carrier. Increasing the interval between requests is intended to see if the inactivity timer changes the state of the radio link. By watching for a 1-2 second increase in image load time we can reverse engineer the inactivity timer values for a given carrier.
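The flow above could be sketched roughly as follows. This is not the test page’s actual source, just a minimal version under some assumptions: the function names are made up, the image URL is whatever you pass in, and the cache-busting query string is my addition so that every request really goes over the network. The timing helpers are injectable so the loop can be exercised outside a browser.

```javascript
// Median of a non-empty list of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Browser-only helper: resolves when the image's onload fires.
function loadImageInBrowser(url) {
  return new Promise((resolve, reject) => {
    const img = new Image();
    img.onload = () => resolve();
    img.onerror = () => reject(new Error('image failed to load'));
    // Assumption: a unique query string keeps the image out of the cache.
    img.src = url + '?t=' + Date.now() + '-' + Math.random();
  });
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs the measurement for each interval and returns
// [{ intervalSec, medianMs }, ...]. deps lets callers swap in fakes.
async function runStateMachineTest(imageUrl, intervalsSec, deps = {}) {
  const load = deps.load || loadImageInBrowser;
  const wait = deps.wait || sleep;
  const now = deps.now || Date.now;
  const results = [];
  for (const intervalSec of intervalsSec) {   // choose the next interval
    const samples = [];
    for (let run = 0; run < 3; run++) {       // three requests per interval
      await wait(intervalSec * 1000);         // let the radio link sit idle
      const tStart = now();
      await load(imageUrl);                   // resolves on the image's onload
      const tEnd = now();
      samples.push(tEnd - tStart);            // image load time in ms
    }
    results.push({ intervalSec, medianMs: median(samples) });
  }
  return results;
}

// In a browser, with a placeholder URL:
// runStateMachineTest('https://example.com/1k.png', [2, 4, 6, 11, 13, 16, 20])
//   .then((results) => console.log(results));
```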
I tweeted the test URL about 10 days ago. Since then people have run the test 460+ times across 71 carriers. I wrote some code that maps IP addresses to known carrier hostnames so am confident about 26 of the carriers; the others are self-reported. (Max Firtman recommended werwar for better IP-to-carrier mapping.) I’d love to keep gathering data so:
I encourage you to run the test!
The tabular results show that there is a step in image load times as the interval increases. (The load time value shown in the table is the median collected across all tests for that carrier. The number of data points is shown in the rightmost column.) I generated the chart below from a snapshot of the data from Sept 12.
The arrows indicate a stepped increase in image load time that could be associated with the inactivity timer for that carrier. The most pronounced one is for AT&T (blue) and it occurs at the 5 second mark. T-Mobile (yellow) appears to have an inactivity timer around 3 seconds. Vodafone is much larger at 15 seconds. Sprint and Verizon have similar profiles but the step is less pronounced.
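Spotting the step can also be done mechanically. Here is a hedged sketch that scans per-interval medians for the first jump of at least one second. The function name and the 1000 ms threshold are my own choices, and the sample numbers are made up for illustration; they are not data from the test.

```javascript
// results: [{ intervalSec, medianMs }, ...] sorted by increasing interval.
// Returns the first pair of neighboring intervals between which the median
// load time rises by at least minJumpMs, or null if there is no such step.
function findStep(results, minJumpMs = 1000) {
  for (let i = 1; i < results.length; i++) {
    const jumpMs = results[i].medianMs - results[i - 1].medianMs;
    if (jumpMs >= minJumpMs) {
      return {
        afterSec: results[i - 1].intervalSec, // last interval that was still fast
        bySec: results[i].intervalSec,        // first interval that was slow
        jumpMs,
      };
    }
  }
  return null;
}

// Made-up numbers for illustration only.
const sampleMedians = [
  { intervalSec: 2, medianMs: 300 },
  { intervalSec: 4, medianMs: 320 },
  { intervalSec: 6, medianMs: 1900 },
  { intervalSec: 11, medianMs: 1950 },
];

console.log(findStep(sampleMedians)); // { afterSec: 4, bySec: 6, jumpMs: 1580 }
```

A result like this would suggest an inactivity timer somewhere between 4 and 6 seconds. The test can only bracket the timer between two of the tested intervals, not pin it down exactly.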
There are many caveats about this study:
- This is a small sample size.
- The inactivity timer could be affected by other apps on the phone doing network activity in the background. I asked people to close all apps, but there’s no way to verify they did that.
- A given carrier might have different kinds of networks (3G, 4G, etc.). Similarly, they might have different inactivity timer values in different regions. All of those different conditions would be lumped together under the single carrier name.
What’s the point?
Hats off to Rajiv’s team at Qualcomm for digging into Android browser performance. They don’t even own the browser but have invested heavily in improving the browser user experience. In addition to closing TCP connections once the page is loaded, they increased the maximum number of HTTP connections, improved browser caching, and more.
I want to encourage this holistic approach to mobile performance and will write about that in more depth soon. This post is pretty technical, but it’s important that mobile web developers have greater insight into the parts of the mobile experience that go beyond HTML and JavaScript – namely the device, carrier network, and mobile browser.
For example, in light of this information about inactivity timers, mobile web developers might choose to do a 1 pixel image request at a set interval that keeps the radio link at full bandwidth. This would shorten battery life, so an optimization would be to only do a few pings after which it’s assumed the user is no longer surfing. Another downside is that doing this would use more dedicated channels at the cell tower, worsening everyone’s experience.
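A keep-alive along those lines might look like the sketch below. It is illustrative only: the function names, the pixel URL, and the interval and ping count are all assumptions, and the battery and cell tower downsides mentioned above still apply. The interval would need to be shorter than the carrier’s first inactivity timer to have any effect.

```javascript
// Calls sendPing every intervalMs, but stops by itself after maxPings so the
// radio link is not held open indefinitely. Returns a function that stops early.
function startKeepAlive({ intervalMs, maxPings, sendPing }) {
  let sent = 0;
  const timer = setInterval(() => {
    sendPing();
    sent += 1;
    if (sent >= maxPings) clearInterval(timer); // assume the user has stopped browsing
  }, intervalMs);
  return function stop() {
    clearInterval(timer);
  };
}

// Browser-only ping: requests a tiny image, bypassing the cache.
function requestPixel(url) {
  const img = new Image();
  img.src = url + '?t=' + Date.now();
}

// In a browser, with a placeholder URL and illustrative values:
// const stop = startKeepAlive({
//   intervalMs: 4000,
//   maxPings: 5,
//   sendPing: () => requestPixel('https://example.com/pixel.gif'),
// });
```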
The right answer is to determine what the tradeoffs are. What is the optimal value for these inactivity timers? Is there a sweet spot that improves web performance with little or no impact on battery life? How did the carriers determine the current inactivity timer values? Was it based on improving the user’s web experience? I would bet not, but am hopeful that a more holistic view of mobile performance is coming soon.
Derek Pennycuff | 22-Sep-11 at 6:12 am | Permalink |
What event is used to start the inactivity timer? Is there potential for a carrier erring heavily on the side of battery life (but still following the protocol spec) causing issues with lazy loading resources such as Google Analytics?
Steve Souders | 22-Sep-11 at 6:59 am | Permalink |
@Derek: The inactivity timer starts when there is no TCP traffic. Lazy loading resources after window onload would reset the inactivity timer. If the lazy loading started several seconds after window onload it’s possible the radio link would have already been demoted, but I’ve never seen lazy loading happen more than a few hundred milliseconds after onload.
Nathan | 22-Sep-11 at 8:39 am | Permalink |
Interesting article, sir. Just an FYI, the link associated with your second graph goes to the URL for the first image.
Steve Souders | 22-Sep-11 at 9:28 am | Permalink |
@Nathan: fixed – thanks!
Ido | 22-Sep-11 at 9:51 am | Permalink |
One interesting thing your test uncovers is that Sprint and Verizon (you wrote AT&T by mistake) have a less pronounced step. It also happens that these 2 networks are the non-GSM networks. It might be that the protocol is more efficient at reconnecting after idle time. Definitely something to look into.
Antonin Januska | 22-Sep-11 at 12:33 pm | Permalink |
Took your test. It’s an interesting view on page load times on the phone vs. on the computer. Most people see it as a 3G speed vs. 4G speed vs. cable/dsl on the computer. I suppose I never realized that the speed (3.6mbs or whatever) is inaccurate because of the 2 second startup time.
One thing I wonder is if the Android OS can be tweaked/rewritten to send out the FIN packet at different intervals, and to tweak the full-capacity time. That makes me wonder if different phones have different intervals.
Mercator | 22-Sep-11 at 12:39 pm | Permalink |
I’d like to note that this doesn’t only apply to mobile phones. I’m using a 3G modem on my (usually plugged-in) laptop.
It looks like I should have my laptop ping the tower (so to speak) every 2-4 seconds to increase my connection speed.
Steve Souders | 22-Sep-11 at 3:26 pm | Permalink |
@Ido: The fact that Sprint & Verizon are both non-GSM and have similar profiles makes me feel more confident about the data. I fixed the typo – thanks!
@Antonin: The mobile OS could send FIN packets similar to what Qualcomm did, but the RRC intervals are sent by the cell tower.
@Mercator: Sending a ping from your laptop to keep the 3G modem active would especially make sense if you were plugged in. But again, this will have a detrimental effect on the carrier network as it’ll reduce the available dedicated channels for other users.
Paul | 26-Sep-11 at 11:49 am | Permalink |
Are you aware of any changes in the described browser behavior when the phone is plugged into an external power source? I mean, if there is no reason to save battery life (because of external power), does the browser/phone change its behavior?
Steve Souders | 26-Sep-11 at 1:44 pm | Permalink |
@Paul: I wondered that as well. You can answer this yourself – run the test on battery, then run it again while plugged in to power. I’m betting the results are the same.
David Morris | 29-Mar-12 at 12:25 pm | Permalink |
Nice work … thanks … this pretty much substantiates my theory based on some performance and battery life experiments I did as part of a project in 2008.