Keys to a Fast Web App

September 6, 2012 2:27 pm | 10 Comments

I recently tweeted that the keys to a faster web app are Ajax architecture, JavaScript, and caching. This statement is based on my experience – I don’t have hard data on the contribution each makes and the savings that could be had. But let me comment on each one.

Ajax architecture – Web 1.0, with a page reload on every user action, is not the right choice. Yanking the page out from under the user and reloading resources that haven’t changed produces a frustrating user experience. Maintaining a constant UI chrome with a Web 2.0 app is more seamless, and Ajax allows us to perform content updates via lightweight data API requests and client-side JavaScript, resulting in a web app that is smooth and fast (when done correctly).
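A minimal sketch of that pattern, in the XMLHttpRequest style of the day. The endpoint (`/api/inbox`) and element id are hypothetical; the point is that only a small JSON payload travels over the wire, and only one region of the page is updated:

```javascript
// Fetch a small JSON payload from a data API and update one region of
// the page. The rest of the UI chrome stays put -- no full page reload.
function refreshInbox() {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/api/inbox');   // lightweight data API, not a full page
  xhr.onload = function () {
    var data = JSON.parse(xhr.responseText);
    // update just this region of the page
    document.getElementById('inbox').textContent = data.summary;
  };
  xhr.send();
}
```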

JavaScript – JavaScript is a major long pole in the web performance tent, but just a few years ago it was even worse. Do you remember?! It used to be that loading a script blocked the HTML parser and all other downloads in the page. Scripts were downloaded one at a time! In 2009 IE8 became the first browser to support parallel script loading. Firefox 3.5, Chrome 2, and Safari 4 soon followed, and more recently Opera 12 got on the bus. (Parallel script loading is the single most important improvement to web performance, IMO.) In addition to faster script loading, the speed of the JavaScript engines themselves has increased significantly. So we’re much better off than we were a few years ago. But when I do performance deep dives on major websites, JavaScript is still the most frequent reason for slow pages, especially slow rendering. This is due to several factors. JavaScript payloads have increased to ~200K. Browsers still have blocking behavior around JavaScript; for example, a stylesheet followed by an inline script can block subsequent downloads in some browsers. And until we have wider support for progressive enhancement, many webpages are blank while waiting for a heavy JavaScript payload to be downloaded, parsed, and executed.
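For illustration, dynamically inserting a script element is one common way to opt in to non-blocking loading (names here are illustrative; the `async` attribute on a markup `<script>` tag achieves the same in supporting browsers):

```javascript
// Insert a script element from JavaScript so it loads without blocking
// the HTML parser or other downloads.
function loadScriptAsync(src, onload) {
  var s = document.createElement('script');
  s.src = src;
  s.async = true;                  // don't block parsing or other downloads
  if (onload) s.onload = onload;   // run dependent code only once loaded
  document.head.appendChild(s);
  return s;
}
```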

Caching – Better caching doesn’t make websites faster for first time users. But in the context of web apps we’re talking about users who are involved in a longer session and who are likely to come back to use the web app again. In the voyage to create web app experiences that rival those of desktop and native apps, caching is a key ingredient. Caching frustrates me. Website owners don’t use caching as much as they should: 58% of responses don’t have caching headers, and 89% are cacheable for less than a month even though 38% of those don’t change. Browser caches are too small, and their purging algorithms need updating. We have localStorage with a super simple API, but browser makers warn that it’s bad for performance. Application cache is a heavier solution, but it’s hard to work with (see the great presentation from Jake Archibald).
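As a sketch of how simple the localStorage API is, and how little it gives you, here is a tiny expiring cache built on top of it. The `storage` parameter and the TTL scheme are my own additions for illustration (passing storage in also lets the same code run against sessionStorage or a test double):

```javascript
// Store a value with an expiry time. localStorage only holds strings,
// so the entry is JSON-encoded.
function cacheSet(storage, key, value, ttlMs) {
  storage.setItem(key, JSON.stringify({
    value: value,
    expires: Date.now() + ttlMs
  }));
}

// Return the cached value, or null on a miss. Stale entries are
// evicted lazily on read -- localStorage has no built-in expiry.
function cacheGet(storage, key) {
  var raw = storage.getItem(key);
  if (!raw) return null;
  var entry = JSON.parse(raw);
  if (Date.now() > entry.expires) {
    storage.removeItem(key);
    return null;
  }
  return entry.value;
}
```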

I’m obsessed with caching. It’s the biggest missed opportunity, so I’m going to spend the next few months focused on it. Analyzing caching is difficult: in the lab it’s hard (and time-consuming) to test filling and clearing the cache, and there’s no caching API that makes it easy to manipulate and measure.

Testing caching in the lab is informative, but it’s more important for web devs and browser makers to understand how caching works for real users. It’s been 5 years since Tenni Theurer and I ran the seminal experiment to measure browser cache usage. That study showed that 20% of page views were performed with an empty cache, but more surprisingly, 40-60% of daily users had at least one empty-cache page view. I’ll definitely re-run this experiment.

I’ve started my caching focus by launching a test to measure what happens when users clear their cache. It’s interesting how browsers differ in their UIs for caching. They’re neither intuitive nor consistent. I would appreciate your help in generating data. Please run the experiment and contribute your results. You can find the experiment here:

Clear Browser Experiment

I’ll write up the results this weekend. See you Monday.

 

10 Responses to Keys to a Fast Web App

  1. Awesome post, looking forward to more ideas. Will be good to see some ideas on how to improve the jQuery Mobile framework.

  2. Bang on, on the caching and JS insights.
    The trouble is that caching contributes in tricky ways, and there are so many caches to choose from that it’s difficult for a dev to decide which to use. Maybe all the levels? Makes me chuckle.

    JS, on the other hand, has been and will remain a point of debate. Despite efforts by @getify and @jrburke (you know where), parallel loading, execution, and rendering still depend heavily on the client.

    Best practices and libraries can only help when consumers want them to.

    But yes, I am an avid supporter of the advancement of browser support for better JS performance.

    The ‘async’ attribute was a good effort; we need more like it. ‘defer’, I believe, is in the spec but not supported across browsers.

  3. Nice, but the paragraph about “AJAX architecture” is a bit weak – kind of like saying stairs produce a frustrating user experience while elevators are smooth and fast, but ignoring the fact that stairs (and page reloading) are more versatile and robust and are going to be required even if there is an elevator (or a fancy “Web 2.0” site).

    In a lot of cases you may as well embrace this and extend the stairs pattern to the escalator (or use AJAX to progressively enhance a “Web 1.0” site).

    Maybe this is what you mean by Web 2.0 “done correctly”, but it’s a bit of a stretch to call that AJAX architecture.
    It’s just another aspect of progressively enhanced HTML.

  4. I agree that Ajax is a major step forward – in fact it is a key enabler in making the caching principles built into HTTP work.
    The typical web page displays different data coming from different sources, and each bit of data has a different TTL / cache-invalidation logic. When developers always have to refresh the whole page, sending proper caching headers becomes just impossible.

    OTOH, Ajax is apparently still too difficult to get right: most sites which use it heavily give a horrible user experience when network connectivity is bad or spotty – no feedback at all on failed / successful actions, etc…

  5. Coming back to the “caching” topic: in my own limited experience, there are many reasons that HTTP caching is not as widely deployed as it should be:
    1. different caching headers are still in widespread usage for compatibility with HTTP/1.0. This makes even the trivial introduction of caching complex
    2. caching is a difficult topic anyway (along with variable naming ;-) )
    3. as you stated, caching is not very effective for users who are not frequent, regular visitors of the site anyway
    4. the HTTP “pull” model has excellent scalability, but caching would be much easier with a “push” model

    Point 3 is worth exploring in depth:
    - developers cannot set a TTL bigger than a few hours on most resources, as they have no idea if resources will change in the future (cue the CTO slamming into your office with a new logo for the site at 5.25 pm)
    - frameworks that rename a static resource every time it is updated can go a long way in alleviating this, but they need to be more widespread/easy. FTP upload to the web server is still unmatched for deployment
    - with a low TTL, most users, even regular ones, will still need to rely on ETags to avoid re-fetching the resource on the 2nd visit, as their cache will have expired
    - proper ETag usage is not widespread for anything more complex than static content, because the calculation needed to generate and verify it is often as complex as generating the whole resource anyway (and also because of some poor advice given by tools like YSlow)
    - this means that for the best possible performance, developers have to use both pull/HTTP caching and push/intra-application caching (caching of resource ETags, ESI, etc.)
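    The conditional-request dance with ETags can be sketched server-side like this (the request/response shape is hypothetical, just to show the shortcut a 304 buys; note the point above that computing `currentEtag` may cost as much as generating the body):

```javascript
// If the client's If-None-Match header matches the current ETag, a 304
// with an empty body is enough: the browser reuses its cached copy and
// the payload is never resent. Otherwise send the full 200 response.
function conditionalResponse(requestHeaders, currentEtag, body) {
  if (requestHeaders['if-none-match'] === currentEtag) {
    return { status: 304, headers: { 'ETag': currentEtag }, body: '' };
  }
  return { status: 200, headers: { 'ETag': currentEtag }, body: body };
}
```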

    The best setup I have come by so far involves using a good reverse proxy with support for a purge command and regexp-based purge lists. The developer can then set a high TTL on the resources and have the RP cache them. When a resource is later updated at an unknown time, the purge command is immediately sent to the RP.
    This involves much more work than building a simple web app:
    - installing and configuring the RP: one more piece in the architecture
    - properly generating URLs for your resources which can be invalidated by a single, fast regexp (regardless of REST principles, a resource can generally have many URLs for its representations)
    - setting up the workflow to send purge commands to the RP
    - setting up the ETag and TTL headers correctly as well

    Varnish, for example, has some smart management of purge lists, but it is complex to replicate that logic in pure PHP applications, as it is best done in a background thread. This means that even frameworks like Symfony do not implement this feature in their MVC stack – which does support TTL-based caching of subrequests internally…
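    For illustration only, here is a toy regexp-based purge list with lazy eviction, loosely inspired by the ban-list idea described above (this is not how Varnish actually implements it):

```javascript
// A toy cache where purges record a pattern instead of walking all
// entries; a cached URL matching any recorded pattern is evicted
// lazily on its next lookup.
function makePurgeableCache() {
  var entries = {};   // url -> cached body
  var purges = [];    // pending purge patterns
  return {
    put: function (url, body) { entries[url] = body; },
    purge: function (pattern) { purges.push(new RegExp(pattern)); },
    get: function (url) {
      for (var i = 0; i < purges.length; i++) {
        if (purges[i].test(url)) {   // matched a pending purge: evict, miss
          delete entries[url];
          return null;
        }
      }
      return Object.prototype.hasOwnProperty.call(entries, url)
        ? entries[url]
        : null;
    }
  };
}
```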

  6. I’m a web developer and in my opinion caching is not used as much as it could be because it’s a whopping great nightmare to deal with. There are simply not enough reliable guarantees that caching will work appropriately. It needs to be overhauled. Issues with proxy servers; the inability to mark distinct files for caching (only files by type seem trivial to cache); inconsistencies between browsers in their handling of cache (Chrome is too ‘sticky’ for example) … the list of issues is long.

    Does SPDY allow for bundled resources to minimize HTTP connections? If not, isn’t this a big area that can be improved in web design? Why have all these files downloaded individually over several different network connections when one connection could do the job? In such an approach, we should be able to bundle two different sets of resources: all files needed ASAP would be marked as such, and a secondary bundle of files would be marked as downloadable after the page has loaded. This would reward developers who care about performance enough to minimize the set of files required for initial rendering, and hopefully reduce the situation where all the files in one bundle actually slow down performance because the bundle is too large.

  7. “Better caching doesn’t make websites faster for first time users”

    It should. After all that’s what proxy caches are intended for. Unlike ‘pd’ I (think) I’ve got caching working quite well (although handling conditional requests is a bit tricky).

    While I understand SPDY’s dependence on NPN, I think it’s a missed opportunity if HTTP/2.0 depends upon SSL. OTOH, universal SSL would make security a lot simpler.

  8. I also don’t completely agree with “Better caching doesn’t make websites faster for first time users”

    I think you should make your HTML cacheable by a CDN too; that way you load the whole first page – the HTML, CSS, JS, and images – from a CDN.

    And just use Ajax, iframes or path-based cookies to handle any sessions/communication with the server.

    @symcbean I’ve listened to the recording of the IETF-meeting and I wouldn’t be surprised if HTTP/2.0 will have insecure-SSL (with http:// in the URL-bar) and secure-SSL.

  9. I agree with Sean Hogan: Javascript, and as a consequence, AJAX-y Web 2.0 features, need to be added on after you have already built a working, fully-server-based web app.

    Yes, this may mean less-than-elegant interfaces for certain features, but nonetheless everything should be doable from a dumb client.

    My web app, as an example, allows you to add an object to your favourites by clicking a hollow star. An AJAX PATCH request is performed to flip a boolean in the database, whilst the star changes to a spinning wait icon, and if a successful response is received, the star becomes filled-in. If it times out, or an error is returned, the hollow star reappears and a floating div pops up with an error message saying that the action failed and why, if known.
    However users can also go to the object’s edit page, click a checkbox, and submit a form via POST to perform the same function. Less elegant, but out of the way for most users.
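    The star flow described above can be sketched as a tiny state machine (names hypothetical):

```javascript
// hollow -> spinner while the PATCH is in flight; filled on success;
// back to hollow on error or timeout (with the error shown separately).
function starTransition(state, event) {
  if (state === 'hollow'  && event === 'click')   return 'spinner';
  if (state === 'spinner' && event === 'success') return 'filled';
  if (state === 'spinner' && event === 'error')   return 'hollow';
  if (state === 'filled'  && event === 'click')   return 'spinner'; // un-favourite
  return state;   // ignore anything else
}
```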

    (FWIW, one guy in the office has a borked Javascript setup on his Windows box. No-one knows why.)

    It also means that you can write most unit tests for your API using simple tools like wget/curl.

  10. @nicholas

    Actually, if you have a pure JavaScript app on top of a RESTful API, that’s even easier to unit test. Plus, the entire functionality of your app is available in a clean, self-documenting API. This doesn’t preclude doing a dumb frontend. You start with the API, you wrap it in a JS frontend, and if some users need a really dumbed-down UI, you build it as a server-side API client. The benefit is that the codebase isn’t optimized around the limitations of basic HTML forms, but instead models the business domain as represented in the API.

    For sites progressive enhancement makes sense, as the breadth of users is larger. For web apps, often it doesn’t. 4 years ago i made the flip from prog-enh to a pure JS frontend backed by services. The results have been largely positive for the app i build, both in ability to deliver features and in the resulting performance / usability (the app is mainly used on corporate intranets). YMMV.