Improving app cache
I recently found out about the W3C Workshop on The Future of Off-line Web Applications on November 5 in Redwood City. I won’t be able to attend (I’ll be heading to Velocity Europe), but I feel like app cache needs improving, so I summarized my thoughts and sent them to the workshop organizers. I also pinged some mobile gurus and got their thoughts on app cache.
SUMMARY: App cache is complicated and frequently produces an unexpected user experience. It’s also being (ab)used as a workaround for the fact that the browser’s cache does not cache in an effective way – this is just an arms race for finite resources.
DETAILS: I’ve spoken at many mobile-specific conferences and meetups in the last few months. When I explain the way app cache actually works, developers come up afterward and say “now I finally understand what was happening with my offline app.” These are the leading mobile developers in the world.
- HTML responses with the MANIFEST attribute are stored in app cache by default, even if they’re not in the CACHE: section of the manifest file.
- If a CACHE: resource 404s then none of the resources are cached.
- The manifest file must be changed in order for changed CACHE: resources to be updated.
- Modified CACHE: resources aren’t seen by the user until the second time they load the app – even if they’re online.
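The third point above is why many manifests carry a version comment: the browser only re-fetches CACHE: resources when the manifest file itself changes, so bumping a comment is enough to trigger an update. Here is a minimal sketch of that pattern. The `buildManifest` helper, its `ManifestSpec` type, and all file paths are hypothetical names used only for illustration.

```typescript
// Hypothetical helper (not part of any browser API): renders a cache manifest
// whose text changes whenever `version` changes.
interface ManifestSpec {
  version: string;                   // any string that changes when a cached file changes
  cache: string[];                   // entries for the CACHE: section
  network: string[];                 // entries for the NETWORK: section
  fallback: Array<[string, string]>; // [namespace, fallback page] pairs for FALLBACK:
}

function buildManifest(spec: ManifestSpec): string {
  const lines = [
    "CACHE MANIFEST",             // required first line of every manifest
    `# version ${spec.version}`,  // comment line; editing it changes the manifest
    "",
    "CACHE:",
    ...spec.cache,
    "",
    "NETWORK:",
    ...spec.network,
    "",
    "FALLBACK:",
    ...spec.fallback.map(([namespace, page]) => `${namespace} ${page}`),
  ];
  return lines.join("\n") + "\n";
}
```

Calling `buildManifest` with the same file lists but a different `version` yields a different manifest, which is what prompts the browser to download the CACHE: resources again. Even then, the user still sees the old resources until the next load.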
It’s easy to point out problems – you folks have the more difficult job of finding solutions. But I’ll make a few suggestions:
- Use updated resources on first load – The developer needs a way to say “if the user is online, then fetch (some/all) of the CACHE: resources that have changed before rendering the app”. I would vote to make this the default behavior, and provide a way to toggle it (in the manifest file or HTML attribute). Perhaps this should also be done at the individual resource level – “I want updated scripts to block the initial rendering, but nothing else”. The manifest file could have an indicator of which resources to check & download before doing the initial rendering.
- 404s – I haven’t tested this myself, but refusing to cache anything because a single CACHE: resource returns a 404 seems like overkill. Every response in the CACHE: section should be cached, independent of the other responses. Perhaps this is browser-specific?
- updateReady flag – It’s great that developers can use the updateready event to prompt the user to reload the app if any CACHE: resources have changed underneath them, but the bar is too high: every app has to write its own detection and prompt code. In addition, there should be a flag that tells the browser to prompt the user automatically if any CACHE: resources were updated.
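For reference, this is roughly what a developer has to write today to get the reload prompt described in the last suggestion. It is a minimal sketch, not a definitive implementation: `promptOnUpdate` and `AppCacheLike` are hypothetical names, and the cache object is passed in so the logic can run outside a browser. The `updateready` event, the `UPDATEREADY` status value (4), and `swapCache()` come from the `window.applicationCache` API.

```typescript
// Minimal slice of the window.applicationCache interface that this sketch uses.
interface AppCacheLike {
  status: number;
  swapCache(): void;
  addEventListener(type: string, listener: () => void): void;
}

const UPDATEREADY = 4; // numeric value of applicationCache.UPDATEREADY

// Hypothetical helper: swaps in the new cache and asks the user whether to reload.
function promptOnUpdate(
  cache: AppCacheLike,
  confirmReload: () => boolean,
  reload: () => void,
): void {
  const onReady = (): void => {
    if (cache.status !== UPDATEREADY) return; // nothing new has been downloaded
    cache.swapCache();                        // switch to the newly downloaded cache
    if (confirmReload()) reload();            // reload only if the user agrees
  };
  cache.addEventListener("updateready", onReady);
  onReady(); // covers the case where the update finished before this code ran
}

// In a browser this might be wired up as:
//   promptOnUpdate(
//     window.applicationCache,
//     () => window.confirm("A new version is available. Reload now?"),
//     () => window.location.reload(),
//   );
```

A built-in flag would replace all of this with a single declaration.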
Finally, on the topic of the arms race, I know many websites that are using app cache as a way to store images, scripts, and stylesheets. Why? It’s because the browser’s disk cache is poorly implemented. App cache provides a dedicated amount of space for a specific website (as opposed to a common shared space). App cache also allows for prioritization – if I have 10 MB of resources I can put the scripts in the CACHE: section so they don’t get purged in favor of images, which are less painful to lose.
Certainly a better solution would be for the browsers to have improved the behavior of disk cache 5 years ago. But given where we are, an increasing number of websites are consuming the user’s disk space. In most cases the user doesn’t have a way or doesn’t know how to clear app cache. Better user control over app cache is needed. I suggest that clearing “data” clears both the disk cache and app cache. Alternatively, we extend the browser UI to have an obvious “clear app cache” entry. Currently in Firefox and Chrome you can only clear app cache on a site-by-site basis, and the UI isn’t obvious. In Firefox it’s under Tools | Options | Advanced | Network | Remove. In Chrome it’s under chrome://appcache-internals/.
The most important near term fix is better patterns and examples.
- My first offline app had a login form on the index.html – how should I handle that?
- What if the JSON data in app cache requires authentication and the user is offline – use it or not?
- I’ve never seen an example that uses the FALLBACK: section.
Adoption of current app cache would go much more smoothly with patterns and examples that address these gaps, and perhaps a JS helper lib to wrap updateReady and other standard dev tasks.
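As one example of the kind of pattern that is missing, here is a sketch of how a FALLBACK: section behaves when the user is offline. It models the lookup rather than calling any browser API, and it assumes the matching rule as I understand it from the spec: a fallback namespace matches by URL prefix, and the longest matching namespace wins. The function names and all paths are hypothetical.

```typescript
// Example FALLBACK: section this sketch models (paths are hypothetical):
//   FALLBACK:
//   /avatars/ /img/default-avatar.png
//   / /offline.html

type FallbackEntry = { namespace: string; page: string };

// Parses the body of a FALLBACK: section into (namespace, page) pairs.
function parseFallback(section: string): FallbackEntry[] {
  return section
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "" && line.charAt(0) !== "#") // skip blanks and comments
    .map((line) => line.split(/\s+/))
    .filter((parts) => parts.length === 2)                   // skip malformed lines
    .map((parts) => ({ namespace: parts[0], page: parts[1] }));
}

// Returns the path that would be served offline, or null if the request fails.
function resolveOffline(
  path: string,
  cached: string[],
  fallbacks: FallbackEntry[],
): string | null {
  if (cached.indexOf(path) !== -1) return path; // explicitly cached resource
  let best: FallbackEntry | null = null;
  for (const entry of fallbacks) {
    const matches = path.indexOf(entry.namespace) === 0; // prefix match
    if (matches && (best === null || entry.namespace.length > best.namespace.length)) {
      best = entry; // assumed rule: the longest matching namespace wins
    }
  }
  return best ? best.page : null;
}
```

With the two entries above, an uncached request for `/avatars/42.png` resolves to the default avatar, while any other uncached path resolves to `/offline.html`.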
A great email thread resulted when I asked a bunch of mobile gurus for their thoughts about app cache. Here’s a summary of the comments that resulted:
|Scott Jehl||Agreed on app cache’s clumsiness. It’s so close though! The cache clearing is terrible for both users and developers.|
|Nicholas Zakas||+1 for AppCache clumsiness. My big complaint is requiring a special MIME type for the manifest file. This effectively limits its use to people who have access to their server configuration.|
|Yehuda Katz||My biggest concern is the lack of a feature that would make it possible to load the main index.html from cache, but only if the user agent is offline. Currently, if the user agent is online, the entire cache manifest, including the main index.html, is used. As a result, developers are required to come up with some non-standard UI to let the application user know that they should refresh the page in order to get more updated information. This is definitely the way to get the most performance, even when the user agent is online, but it creates an extremely clumsy workflow which significantly impedes adoption. I have given a number of talks on the cache manifest, and this caveat is the one that changes the audience reaction from nodding heads to “oh no, another thing I have to spend time working out how to rebuild my application in order to use”. Again, I understand the rationale for the design, but I think a way to say “if the user agent is online, block until the cache manifest is downloaded” would significantly improve adoption and widen the appropriate use-cases for the technology.|
|Scott Jehl||I agree – the necessary refresh is the biggest downfall for me, too. It’s really prohibitive for using appcache in progressive enhancement approaches (where there’s actually HTML content in the page that may update regularly). It’d be great if you could set up appcache to kick in when the user is actually offline, but otherwise stay out of the way and let the browser defer to normal requests and caching.|
|Yehuda Katz||I actually think we can get away with a more aggressive approach. When the device is online, first request the application manifest. If the manifest is identical, continue using the app cache. This means a short blocking request for the app manifest, but the (good) atomic cache behavior. If the manifest is not identical, fall back to normal HTTP caching semantics. It needs to be a single flag in the manifest I think.|
|Dion Almaer||Totally agree. In a recent mobile project we ended up writing our own caching system that had us use HTTP caching… It was very much a pain to have to do this work.|
I like Yehuda’s suggestion about a blocking manifest check when the user is online, controlled by a flag in the manifest file. We need more thinking around how to improve app cache. Please check out the W3C Workshop on The Future of Off-line Web Applications website and send them your thoughts.
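To make the blocking manifest check concrete, here is a sketch of the decision it implies. This is not an existing browser feature. `chooseStrategy` and `LoadStrategy` are hypothetical names, and two cases are my own assumptions rather than part of the proposal: using app cache when offline, and treating a failed manifest request the same as being offline.

```typescript
type LoadStrategy = "app-cache" | "http-cache";

// Hypothetical model of the proposed flag's behavior.
// `freshManifest` is the result of the short blocking manifest request,
// or null if that request failed.
function chooseStrategy(
  online: boolean,
  cachedManifest: string,
  freshManifest: string | null,
): LoadStrategy {
  if (!online) return "app-cache";               // assumption: offline always uses app cache
  if (freshManifest === null) return "app-cache"; // assumption: failed check is treated as offline
  return freshManifest === cachedManifest
    ? "app-cache"   // manifest identical: keep the atomic app cache behavior
    : "http-cache"; // manifest changed: fall back to normal HTTP caching semantics
}
```

The cost is one blocking request per load when online; the benefit is that users with a changed manifest get fresh content on the first load instead of the second.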