Unexpected prerender in Chrome
Over the last week I’ve been investigating the cacheability of resources from Shopify. I would visit the page every few hours and track the number of 200 OK versus 304 Not Modified responses. To my surprise, Chrome’s Network tab indicated that almost all the responses were “from cache”.
This didn’t make sense. In many cases the resource URLs changed between test loads. How could a never-before-seen URL be “from cache”? In cases where the URL was the same, I noticed that the Date response header had changed from the previous test but Chrome still marked it “from cache”. How could the Date change without a 200 response status code?
I started thinking about my “Prebrowsing” work (blog post, slides, video). In my findings I talk about how browsers, especially Chrome, are doing more work in anticipation of what the user needs next. This proactive work includes doing DNS lookups, establishing TCP connections, downloading resources, and even prerendering entire pages.
Was it possible that Chrome was prerendering the entire page?
I started by looking at chrome://predictors. Given characters entered into Omnibox (the location field), this shows which URL you’re predicted to go to. In my tests, I had always typed the URL into the location bar, so the predictions for “shopify” could affect Chrome’s behavior in my tests. Here’s what I found in chrome://predictors:
Chrome predicted that if I entered “www.s” into the Omnibox I would end up going to “http://www.shopify.com/” with confidence 1.0 (as shown in the rightmost column). In fact, just typing “ww” had a 0.9 confidence of ending up on Shopify. In other words, Chrome had developed a deep history mapping my Omnibox keystrokes to the Shopify website, as indicated by rows colored green.
From my Prebrowsing research I knew that if the chrome://predictors confidence was high enough, Chrome would pre-resolve DNS and pre-connect TCP. Perhaps it was possible that Chrome was also proactively sending HTTP requests before they were actually needed. To answer this I opened Chrome’s Net panel and typed “www.s” in the Omnibox but never hit return. Instead, I just sat there and waited 10 seconds. But nothing showed up in Chrome’s Net panel:
Suspecting that these background requests might not show up in Net panel, I fired up tcpdump and repeated the test – again only typing “www.s” and NOT hitting return. I uploaded the pcap file to CloudShark and saw 86 HTTP requests!
I looked at individual requests and saw that they were new URLs that had never been seen before but were in the HTML document. This confirmed that Chrome was prerendering the HTML document (as opposed to prefetching individual resources based on prior history). I was surprised that no one had discovered this before, so I went back to High Performance Networking in Google Chrome by Ilya Grigorik and scanned the Omnibox section:
the yellow and green colors for the likely candidates are also important signals for the ResourceDispatcher! If we have a likely candidate (yellow), Chrome may trigger a DNS pre-fetch for the target host. If we have a high confidence candidate (green), then Chrome may also trigger a TCP pre-connect once the hostname has been resolved. And finally, if both complete while the user is still deliberating, then Chrome may even pre-render the entire page in a hidden tab.
What started off as a quick performance analysis turned into a multi-day puzzler. The puzzle’s solution yields a few takeaways:
- Remember that Chrome may do DNS prefetch, TCP pre-connect, and even prerender the entire page based on the confidences in chrome://predictors.
- Not all HTTP requests related to a page or tab are shown in Chrome’s Net panel. I’d like to see this fixed, perhaps with an option to show these behind-the-scenes requests.
- Ilya knows everything. Re-read his posts, articles, and book before running multi-day experiments.