Reloading post-onload resources
Two performance best practices are to add a far future expiration date and to delay loading resources (esp. scripts) until after the onload event. But it turns out that the combination of these best practices leads to a situation where it’s hard for users to refresh resources. More specifically, hitting Reload (or even shift+Reload) doesn’t refresh these cacheable, lazy-loaded resources in Firefox, Chrome, Safari, Android, and iPhone.
What we expect from Reload
The browser has a cache (or 10) where it saves copies of responses. If the user feels those cached responses are stale, she can hit the Reload button to ignore the cache and refetch everything, thus ensuring she’s seeing the latest copy of the website’s content. I couldn’t find anything in the HTTP Spec dictating the behavior of the Reload button, but all browsers have this behavior AFAIK:
- If you click Reload (or control+R or command+R) then all the resources are refetched using a Conditional GET request (with the If-Modified-Since and If-None-Match validators). If the server’s version of the response has not changed, it returns a short “304 Not Modified” status with no response body. If the response has changed then “200 OK” and the entire response body is returned.
- If you click shift+Reload (or control+Reload or control+shift+R or command+shift+R) then all the resources are refetched withOUT the validation headers. This is less efficient since every response body is returned, but guarantees that any cached responses that are stale are overwritten.
Bottom line: regardless of expiration dates, we expect that hitting Reload gets the latest version of the website’s resources, and that shift+Reload does so even more aggressively.
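To make the 304 vs. 200 distinction concrete, here is a minimal sketch of how a server might answer a Conditional GET. The function name `respondToGet` and the shape of the `resource` object are my own assumptions for illustration, and the ETag comparison is deliberately simplified (exact string match, no lists or weak validators).

```javascript
// Hypothetical server-side helper: decide how to answer a GET for a resource
// whose current validators (etag, lastModified) are known.
// requestHeaders is assumed to use lowercase header names.
function respondToGet(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  const ifModifiedSince = requestHeaders["if-modified-since"];

  let notModified = false;
  if (ifNoneMatch !== undefined) {
    // When both validators are sent, If-None-Match takes precedence.
    notModified = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    notModified =
      !Number.isNaN(since) && Date.parse(resource.lastModified) <= since;
  }

  // Reload: validators present and still matching, so a short 304 with no body.
  if (notModified) return { status: 304, body: "" };
  // shift+Reload (no validators) or a changed resource: the full response.
  return { status: 200, body: resource.body };
}
```

A request without validation headers always falls through to the 200 branch, which is why shift+Reload is the less efficient of the two.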
Welcome to Reload 2.0
In the days of Web 1.0, resources were requested using HTML markup – IMG, SCRIPT, LINK, etc. With Web 2.0, resources are often requested dynamically. Two common examples are loading scripts asynchronously (e.g., Google Analytics) and dynamically fetching images (e.g., for photo carousels or images below-the-fold). Sometimes these resources are requested after window onload so that the main page can render quickly for a better user experience, better metrics, etc. If these resources have a far future expiration date, the browser needs extra intelligence to do the right thing.
- If the user navigates to the page normally (clicking on a link, typing a URL, using a bookmark, etc.) and the dynamic resource is in the cache, the browser should use the cached copy (assuming the expiration date is still in the future).
- If the user reloads the page, the browser should refetch all the resources including resources loaded dynamically in the page.
- If the user reloads the page, I would think resources loaded in the onload handler should also be refetched. These are likely part of the basic construction of the page and they should be refetched if the user wants to refresh the page’s contents.
- But what should the browser do if the user reloads the page and there are resources loaded after the onload event? Some web apps are long lived with sessions that last hours or even days. If the user does a reload, should every dynamically-loaded resource for the life of the web app be refetched ignoring the cache?
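The normal-navigation case in the first bullet comes down to a freshness check. The sketch below is a simplified version of that check, not how any particular browser implements it. The `isFresh` name and the `entry` shape (`storedAtMs`, lowercase `headers`) are assumptions, and it ignores directives such as `no-cache` and `no-store`.

```javascript
// Simplified freshness check for a cached response.
// entry = { storedAtMs: <time the response was cached>, headers: { ... } }
function isFresh(entry, nowMs) {
  const cacheControl = entry.headers["cache-control"] || "";
  const maxAge = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  if (maxAge) {
    // max-age wins over Expires when both are present.
    const ageSeconds = (nowMs - entry.storedAtMs) / 1000;
    return ageSeconds < Number(maxAge[1]);
  }
  const expires = Date.parse(entry.headers["expires"] || "");
  return !Number.isNaN(expires) && nowMs < expires;
}

// A resource cached with a one month (30 day) lifetime.
const oneMonthSeconds = 30 * 24 * 60 * 60;
const cached = {
  storedAtMs: 0,
  headers: { "cache-control": "public, max-age=" + oneMonthSeconds },
};
console.log(isFresh(cached, 24 * 60 * 60 * 1000)); // one day later
```

On a normal navigation a fresh entry is served from cache without touching the network. The open question in the remaining bullets is when a reload should override this check.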
Let’s look at an example: Postonload Reload.
This page loads an image and a script using five different techniques:
- markup – The basic HTML approach: an IMG tag and a SCRIPT tag written directly in the page’s markup.
- dynamic in body – In the body of the page is a script block that creates an image and a script element dynamically and sets the SRC causing the resource to be fetched. This code executes before onload.
- onload – An image and a script are dynamically created in the onload handler.
- 1 ms post-onload – An image and a script are dynamically created via a 1 millisecond setTimeout callback in the onload handler.
- 5 second post-onload – An image and a script are dynamically created via a 5 second setTimeout callback in the onload handler.
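For reference, the four dynamic techniques can be sketched roughly as follows. This is not the example page’s actual code: the helper names (`loadImageAndScript`, `setUpTechniques`) and the resource URLs are hypothetical, and I use `addEventListener("load", ...)` as a stand-in for the onload handler. The `win` and `doc` parameters are passed in only so the sketch can be exercised outside a browser.

```javascript
// Create an image and a script element and point them at the given URLs.
// Setting src on the image and inserting the script start the fetches.
function loadImageAndScript(doc, imageUrl, scriptUrl) {
  const img = doc.createElement("img");
  img.src = imageUrl;
  const script = doc.createElement("script");
  script.src = scriptUrl;
  doc.body.appendChild(img);
  doc.body.appendChild(script);
  return { img, script };
}

// Meant to be called from a script block in the body (technique 2).
function setUpTechniques(win, doc) {
  // Technique 2, dynamic in body: runs during parsing, before onload.
  loadImageAndScript(doc, "image2.png", "script2.js");

  win.addEventListener("load", function () {
    // Technique 3, onload: created directly in the onload handler.
    loadImageAndScript(doc, "image3.png", "script3.js");

    // Technique 4, 1 ms post-onload.
    win.setTimeout(function () {
      loadImageAndScript(doc, "image4.png", "script4.js");
    }, 1);

    // Technique 5, 5 second post-onload.
    win.setTimeout(function () {
      loadImageAndScript(doc, "image5.png", "script5.js");
    }, 5000);
  });
}

// Only run automatically when a real browser environment is present.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  setUpTechniques(window, document);
}
```

Technique 1 needs no script at all, since the image and script are requested by the HTML parser itself.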
All of the images and scripts have an expiration date one month in the future. If the user hits Reload, which of the techniques should result in a refetch? Certainly we’d expect techniques 1 & 2 to cause a refetch. I would hope 3 would be refetched. I think 4 should be refetched but doubt many browsers do that, and 5 probably shouldn’t be refetched. Settle on your expected results and then take a look at the table below.
Before jumping into the Reload results, let’s first look at what happens if the user just navigates to the page. This is achieved by clicking on the “try again” link in the example. In this case none of the resources are refetched. All of the resources have been saved to the cache with an expiration date one month in the future, so every browser I tested just reads them from cache. This is good and what we would expect.
But the behavior diverges when we look at the Reload results captured in the following table.
Table 1. Resources that are refetched on Reload
|technique|resource|Chrome 25|Safari 6|Android Safari/534|iPhone Safari/7534|Firefox 19|IE 8,10|Opera 12|
|---|---|---|---|---|---|---|---|---|
|1ms postonload|image 4|-|-|-|-|-|-|Y|
|5sec postonload|image 5|-|-|-|-|-|-|-|
The results for Chrome, Safari, Android mobile Safari, and iPhone mobile Safari are the same. When you click Reload in these browsers the resources in the page get refetched (resources 1&2), but not so for the resources loaded in the onload handler and later (resources 3-5).
Firefox is interesting. It refetches the four resources in the page (images and scripts 1 & 2) plus the onload handler’s image (image 3), but not the onload handler’s script (script 3). Curious.
IE 8 and 10 are the same: they refetch the four resources in the page as well as the image & script from the onload handler (resources 1-3). I didn’t test IE 9 but I assume it’s the same.
Opera has the best results in my opinion. It refetches all of the resources in the main page, the onload handler, and 1 millisecond after onload (resources 1-4), but it does not refetch the resources 5 seconds after onload (image 5 & script 5). I poked at this a bit. If I raise the delay from 1 millisecond to 50 milliseconds, then image 4 & script 4 are not refetched. I think this is a race condition where if Opera is still downloading resources from the onload handler when these first delayed resources are created, then they are also refetched. To further verify this I raised the delay to 500 milliseconds and confirmed the resources were not refetched, but then increased the response time of all the resources to 1 second (instead of instantaneous) and this caused image 4 & script 4 to be refetched, even though the delay was 500 milliseconds after onload.
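My race-condition hypothesis can be written down as a toy model. To be clear, this is only a restatement of my guess, not Opera’s actual logic, and the function name is made up. The 10 millisecond figure used for an “instantaneous” response is also an assumption, chosen only because it falls between the 1 ms and 50 ms delays I tried.

```javascript
// Toy model of the hypothesized race: a delayed resource is refetched only if
// the onload handler's own downloads are still in flight when it is created.
function refetchedUnderRaceHypothesis(delayMs, onloadResponseTimeMs) {
  return delayMs < onloadResponseTimeMs;
}

const INSTANT_MS = 10; // assumed duration of an "instantaneous" response

const observations = [
  { delayMs: 1, responseMs: INSTANT_MS, refetched: true },
  { delayMs: 50, responseMs: INSTANT_MS, refetched: false },
  { delayMs: 500, responseMs: INSTANT_MS, refetched: false },
  { delayMs: 500, responseMs: 1000, refetched: true },
];

// True when the model agrees with every observation listed above.
const modelMatches = observations.every(function (o) {
  return refetchedUnderRaceHypothesis(o.delayMs, o.responseMs) === o.refetched;
});
```

The model agrees with the four cases I tried, which supports the hypothesis but doesn’t prove it.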
Note that pressing shift+Reload (and other combinations) didn’t alter the results.
A bit esoteric? Perhaps. This is a deep dive on a niche issue, I’ll grant you that. But I have a few buts:
If you’re a web developer using far future expiration dates and lazy loading, you might get unexpected results when you change a resource and hit Reload, and even shift+Reload. If you’re not getting the latest version of your dev resources you might have to clear your cache.
This isn’t just an issue for web devs. It affects users as well. Numerous sites lazy-load resources with far future expiration dates including 8 of the top 10 sites: Google, YouTube, Yahoo, Microsoft Live, Tencent QQ, Amazon, and Twitter. If you Reload any of these sites with a packet sniffer open in the first four browsers listed, you’ll see a curious pattern: cacheable resources loaded before onload have a 304 response status, while those after onload are read from cache and don’t get refetched. The only way to ensure you get a fresh version is to clear your cache, defeating the expected benefit of the Reload button.
Here’s a waterfall showing the requests when Amazon is reloaded in Chrome. The red vertical line marks the onload event. Notice how the resources before onload have 304 status codes. Right after the onload are some image beacons that aren’t cacheable, so they get refetched and return 200 status codes. The cacheable images loaded after onload are all read from cache, so any updates to those resources are missed.
Finally, whenever behavior varies across browsers it’s usually worthwhile to investigate why. Often one behavior is preferred over another, and we should get the specs and vendors aligned in that direction. In this case, we should make Reload more consistent and have it refetch resources, even those loaded dynamically in the onload handler.