Browser Performance Wishlist

February 15, 2010 4:25 pm | 28 Comments

What are the most important changes browsers could make to improve performance?

This document is my answer to that question. This is mainly for browser developers, although web developers will want to track the adoption of these improvements.

Before digging into the list I want to mention two items that would be at the top of the list if they weren’t so new: SPDY and the FRAG tag. Both require industry adoption and possibly changes to specifications, so it’s too soon to put them on an implementation wishlist. I hope these ideas gain consensus soon, and to facilitate that I describe them here.

SPDY
SPDY is a proposal from Google for making three major improvements to HTTP: compressed headers, multiplexed requests, and prioritized responses. Initial studies showed that 25 top sites loaded 55% faster. Server and client implementations are available, and other organizations and individuals have completed their own. The protocol draft has been published for review.
FRAG tag
The idea behind this “document fragment” tag is that it be used to wrap 3rd party content – ads, widgets, and analytics. 3rd party content can have a severe impact on the containing page’s performance due to additional HTTP requests, scripts that block rendering and downloads, and added DOM nodes. Many of these factors can be mitigated by putting the 3rd party content inside an iframe embedded in the top level HTML document. But iframes have constraints and drawbacks – they typically introduce another HTTP request for the iframe’s HTML document, not all 3rd party code snippets will work inside an iframe without changes (e.g., references to “document” in JavaScript might need to reference the parent document), and some snippets (expando ads, suggest) can’t float over the main page’s elements. Another path to mitigate these issues is to load the JavaScript asynchronously, but many of these widgets use document.write and so must be evaluated synchronously.

A compromise is to place 3rd party content in the top level HTML document wrapped in a FRAG block. This approach degrades nicely – older browsers would ignore the FRAG tag and handle these snippets the same way they do today. Newer browsers would parse the HTML in a separate document fragment. The FRAG content would not block the rendering of the top level document. Snippets containing document.write would work without blocking the top level document. This idea just started getting discussed in January 2010. Much more use case analysis and discussion is needed, culminating in a proposed specification. (Credit to Alex Russell for the idea and name.)

The List

The performance wishlist items are sorted highest priority first. The browser icons indicate which browsers need to implement that particular improvement.

download scripts without blocking
In older browsers, once a script started downloading, all subsequent downloads were blocked until the script returned. It’s critical that scripts be evaluated in the order specified, but they can be downloaded in parallel. Doing so significantly improves page load times, especially for pages with multiple scripts. Newer browsers (IE8, Firefox 3.5+, Safari 4, Chrome 2+) incorporated this parallel script loading feature, but it doesn’t work as proactively as it could. Specifically:

  • IE8 – downloading scripts blocks image and iframe downloads
  • Firefox 3.6 – downloading scripts blocks iframe downloads
  • Safari 4 – downloading scripts blocks iframe downloads
  • Chrome 4 – downloading scripts blocks iframe downloads
  • Opera 10.10 – downloading scripts blocks all downloads

(test case, see the four “|| Script [Script|Stylesheet|Image|Iframe]” tests)
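
As a rough illustration of why parallel script downloads matter, the sketch below compares the two cases with simple arithmetic. The download durations are made-up numbers, and the model deliberately ignores connection limits, bandwidth sharing, and script execution time.

```javascript
// Rough model only: ignores connection limits, bandwidth sharing, and execution time.
// durationsMs holds hypothetical per-script download times in milliseconds.
function totalDownloadTime(durationsMs, parallel) {
  if (durationsMs.length === 0) return 0;
  return parallel
    ? Math.max(...durationsMs) // all downloads overlap; the slowest one dominates
    : durationsMs.reduce((sum, d) => sum + d, 0); // each download waits for the previous one
}

const scriptTimes = [400, 250, 300];
const blocking = totalDownloadTime(scriptTimes, false);
const overlapped = totalDownloadTime(scriptTimes, true);
console.log(blocking, overlapped);
```

With these sample values the blocking case takes the sum of the three downloads, while the overlapped case takes only as long as the slowest script.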

SCRIPT attributes
The HTML5 specification describes the ASYNC and DEFER attributes for the SCRIPT tag, but the implementation behavior is not specified. Here’s how the SCRIPT attributes should work.

  • DEFER – The HTTP request for a SCRIPT with the DEFER attribute is not made until all other resources in the page on the same domain have already been sent. This is so that it doesn’t occupy one of the limited number of connections that are opened for a single server. Deferred scripts are downloaded in parallel, but are executed in the order they occur in the HTML document, regardless of what order the responses arrive in. The window’s onload event fires after all deferred scripts are downloaded and executed.
  • ASYNC – The HTTP request for a SCRIPT with the ASYNC attribute is made immediately. Async scripts are executed as soon as the response is received, regardless of the order they occur in the HTML document. The window’s onload event fires after all async scripts are downloaded and executed.
  • POSTONLOAD – This is a new attribute I’m proposing. Postonload scripts don’t start downloading until after the window’s onload event has fired. By default, postonload scripts are evaluated in the order they occur in the HTML document. POSTONLOAD and ASYNC can be used in combination to cause postonload scripts to be evaluated as soon as the response is received, regardless of the order they occur in the HTML document.
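
The ordering rules above can be made concrete with a small simulation. This is a sketch of the proposed semantics, not browser code: the `simulate` function, the `mode` strings, and the `duration` field (a made-up download time in ms) are all hypothetical, and it simplifies by starting every DEFER and ASYNC request at time zero.

```javascript
// Simulation only: models the execution order described above, not a real browser.
// Each script is { name, mode, duration }, where duration is a hypothetical
// download time in ms. Modes: "defer", "async", "postonload", "postonload async".
function simulate(scripts) {
  const events = [];

  // Schedules one group of scripts whose downloads all begin at `start`.
  // Ordered groups run in document order, each no earlier than the one before it.
  const runGroup = (group, start, ordered) => {
    let previous = start;
    let last = start;
    for (const s of group) {
      const arrival = start + s.duration;
      const at = ordered ? Math.max(arrival, previous) : arrival;
      if (ordered) previous = at;
      last = Math.max(last, at);
      events.push({ name: s.name, at });
    }
    return last; // time at which the whole group has executed
  };

  const byMode = (mode) => scripts.filter((s) => s.mode === mode);

  // Simplification: defer and async requests all start at t = 0.
  // onload fires once every defer and async script has executed.
  const onload = Math.max(
    runGroup(byMode("defer"), 0, true),
    runGroup(byMode("async"), 0, false)
  );

  // Postonload scripts do not start downloading until onload has fired.
  runGroup(byMode("postonload"), onload, true);
  runGroup(byMode("postonload async"), onload, false);

  // Sort by execution time; the sort is stable, so ties keep document order.
  events.sort((a, b) => a.at - b.at);
  return { onload, order: events.map((e) => e.name) };
}

const result = simulate([
  { name: "a.js", mode: "defer", duration: 300 },
  { name: "b.js", mode: "defer", duration: 100 },
  { name: "c.js", mode: "async", duration: 200 },
  { name: "d.js", mode: "postonload", duration: 50 },
]);
console.log(result);
```

In this example b.js arrives before a.js but still executes after it, because deferred scripts keep document order, while the async script c.js runs as soon as it arrives.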

resource packages
Each HTTP request has some overhead cost. Workarounds include concatenating scripts, concatenating stylesheets, and creating image sprites. But this still results in multiple HTTP requests. And sprites are especially difficult to create and maintain. Alexander Limi (Mozilla) has proposed using zip files to create resource packages. It’s a good idea because of its simplicity and graceful degradation.
border-radius
Creating rounded corners leads to code bloat and excessive HTTP requests. Border-radius reduces this to a simple CSS style. The only major browser that doesn’t support border-radius is IE. It has already been announced that IE9 will support border-radius, but I wanted to include it nevertheless.
cache redirects
Redirects are costly from a performance perspective, especially for users with high latency. Although the HTTP spec says 301 and 302 responses (with the proper HTTP headers) are cacheable, most browsers don’t support this.

  • IE8 – doesn’t cache redirects for the main page and for resources
  • Safari 4 – doesn’t cache redirects for the main page
  • Opera 10.10 – doesn’t cache redirects for the main page

(test case)
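
To show what “the proper HTTP headers” might look like, here is a minimal sketch that describes a redirect response with an explicit freshness lifetime. The `{ status, headers }` shape and the `cacheableRedirect` name are conventions invented for this example, and the URL and max-age value are placeholders.

```javascript
// Builds a description of a redirect response with explicit caching headers.
// The { status, headers } shape is a hypothetical convention for this sketch.
function cacheableRedirect(location, maxAgeSeconds, permanent = true) {
  return {
    status: permanent ? 301 : 302,
    headers: {
      Location: location,
      // Explicit freshness lifetime so that a cache may reuse the redirect.
      "Cache-Control": `max-age=${maxAgeSeconds}`,
    },
  };
}

const redirect = cacheableRedirect("http://www.example.com/", 86400);
console.log(redirect.status, redirect.headers);
```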

link prefetch
To improve page load times, developers prefetch resources that are likely or certain to be used later in the user’s session. This typically involves writing JavaScript code that executes after the onload event. When prefetching scripts and stylesheets, an iframe must be used to avoid conflict with the JavaScript and CSS in the main page. Using an iframe makes this prefetching code more complex. A final burden is the processing required to parse prefetched scripts and stylesheets. The browser UI can freeze while prefetched scripts and stylesheets are parsed, even though this is unnecessary as they’re not going to be used in the current page. A simple alternative solution is to use LINK PREFETCH. Firefox is the only major browser that supports this feature (since 1.0). Wider support of LINK PREFETCH would give developers an easy way to accelerate their web pages. (test case)
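
For comparison with the iframe approach, a sketch of how little is needed with LINK PREFETCH: the helper below just generates the markup for a list of URLs. The `prefetchLinks` name and the sample paths are made up for illustration.

```javascript
// Returns LINK PREFETCH markup for URLs that are likely to be needed later.
function prefetchLinks(urls) {
  // Escape characters that would break out of a double-quoted attribute.
  const escapeAttr = (value) =>
    value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return urls
    .map((url) => `<link rel="prefetch" href="${escapeAttr(url)}">`)
    .join("\n");
}

const markup = prefetchLinks(["/js/next-page.js", "/css/next-page.css"]);
console.log(markup);
```

The generated tags would go in the page’s HEAD; no onload handler, iframe, or parsing of the prefetched files is involved.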
Web Timing spec
In order for web developers to improve the performance of their web sites, they need to be able to measure their performance – specifically their page load times. There’s debate on the endpoint for measuring page load times (window onload event, first paint event, onDomReady), but most people agree that the starting point is when the web page is requested by the user. And yet, there is no reliable way for the owner of the web page to measure from this starting point. Google has submitted the Web Timing proposal draft for browser builtin support for measuring page load times to address these issues.
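
A minimal sketch of the measurement this would enable. The field names `navigationStart` and `loadEventEnd` are taken from the Navigation Timing interface that later grew out of this work; treating them as the names in the 2010 draft is an assumption, and the sample timestamps are invented.

```javascript
// Computes page load time from a timing record.
// Field names follow the later Navigation Timing interface (an assumption here).
function pageLoadTime(timing) {
  if (!timing.loadEventEnd) return null; // the load event has not finished yet
  return timing.loadEventEnd - timing.navigationStart;
}

// Hypothetical sample values, in ms since the epoch.
const sample = { navigationStart: 1266251100000, loadEventEnd: 1266251102350 };
console.log(pageLoadTime(sample));
```

In a browser that exposes `performance.timing`, the same function could be called as `pageLoadTime(performance.timing)` from code that runs after the load event has completed.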
remote JS debugging
Developers strive to make their web apps fast across all major browsers, but this requires installing and learning a different toolset for each browser. In order to get cross-browser web development tools, browsers need to support remote JavaScript debugging. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. Agreement on the preferred protocol and support in the major browsers would go a long way to getting faster web apps for all users, and reducing the work for developers to maintain cross-browser web app performance.
Web Sockets
HTML5 Web Sockets provide built-in support for two-way communications between the client and server. The communication channel is accessible via JavaScript. Web Sockets are superior to comet and Ajax, especially in their compatibility with proxies and firewalls, and provide a path for building web apps with a high degree of communication between the browser and server.
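
A small sketch of the JavaScript side of that channel. The socket constructor is passed in so the example can run outside a browser with a stand-in class; `openChannel`, `FakeSocket`, and the `ws://example.com/updates` URL are all hypothetical.

```javascript
// Opens a two-way channel and forwards each server message to onMessage.
// WebSocketImpl is injected; in a browser it would be the built-in WebSocket.
function openChannel(WebSocketImpl, url, onMessage) {
  const socket = new WebSocketImpl(url);
  socket.onmessage = (event) => onMessage(event.data);
  return socket;
}

// Minimal stand-in for a socket, for illustration only (no network involved).
class FakeSocket {
  constructor(url) {
    this.url = url;
    this.sent = [];
  }
  send(data) {
    this.sent.push(data);
  }
}

const received = [];
const socket = openChannel(FakeSocket, "ws://example.com/updates", (data) =>
  received.push(data)
);
socket.onmessage({ data: "hello" }); // simulate a message pushed by the server
socket.send("ping"); // client-to-server direction
console.log(received, socket.sent);
```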
History
HTML5 specifies implementation for History.pushState and History.replaceState. With these, web developers can dynamically change the URL to reflect the web application state without having to perform a page transition. This is important for Web 2.0 applications that modify the state of the web page using Ajax. Being able to avoid fetching a new HTML document to reflect these application changes results in a faster user experience.
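
A sketch of how an Ajax application might use pushState. The history object is passed in so the example also runs outside a browser; `showPhoto`, `fakeHistory`, and the `/photos/…` URL scheme are invented for illustration.

```javascript
// Records a new application state in session history without a page load.
// `history` is injected; in a browser it would be window.history.
function showPhoto(history, photoId) {
  const state = { photoId };
  history.pushState(state, "", `/photos/${photoId}`);
  return state;
}

// Minimal stand-in for window.history, for illustration only.
const fakeHistory = {
  entries: [],
  pushState(state, title, url) {
    this.entries.push({ state, title, url });
  },
};

showPhoto(fakeHistory, 42);
console.log(fakeHistory.entries[0].url);
```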
anchor ping
The ping attribute for anchors provides a more performant way to track links. This is a controversial feature because of the association with “tracking” users. However, links are tracked today, it’s just done in a way that hurts the user experience. For example, redirects, synchronous XHR, and tight loops in unload handlers are some of the techniques used to ensure clicks are properly recorded. All of these create a slower user experience.
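
For illustration, the helper below generates an anchor that uses the ping attribute in place of a redirect or an unload handler. The `trackedLink` name and the tracking URL are hypothetical.

```javascript
// Builds an anchor that asks the browser to notify pingUrl when the link is followed.
function trackedLink(href, pingUrl, text) {
  // Escape characters that are unsafe in attribute values or element text.
  const esc = (s) =>
    s.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
  return `<a href="${esc(href)}" ping="${esc(pingUrl)}">${esc(text)}</a>`;
}

const link = trackedLink("http://www.example.com/", "/track?link=home", "Home");
console.log(link);
```

The link still points directly at its destination, so the user’s navigation does not wait on the tracking request.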
progressive XHR
The draft spec for XMLHttpRequest details how XHRs are to support progressive response handling. This is important for web apps that use data with varied response times as well as comet-style applications. (more information)
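
One common way to consume a progressive response is to remember how much of the cumulative response text has already been handled and process only the new part. The sketch below shows just that bookkeeping; `createChunkReader` is a hypothetical helper and the sample strings stand in for a growing `responseText`.

```javascript
// Returns a function that, given the cumulative response text seen so far,
// yields only the part that has not been handled yet.
function createChunkReader() {
  let offset = 0; // number of characters already consumed
  return function readNew(responseText) {
    const chunk = responseText.slice(offset);
    offset = responseText.length;
    return chunk;
  };
}

const readNew = createChunkReader();
const first = readNew("alpha\n"); // first partial response
const second = readNew("alpha\nbeta\n"); // more data has arrived
console.log(JSON.stringify(first), JSON.stringify(second));
```

In a browser, `readNew(xhr.responseText)` could be called from an onreadystatechange handler whenever readyState is 3 or 4.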
stylesheet & inline JS
When a stylesheet is followed by an inline script, resources that follow are blocked until the stylesheet is downloaded and the inline script is evaluated. Browsers should instead look ahead in their parsing and start downloading subsequent resources in parallel with the stylesheet. These resources of course would not be rendered, parsed, or evaluated until after the stylesheet was parsed and the inline script was evaluated. (test case, see “|| CSS + Inline Script”; looks like this just landed in Firefox 3.6!)
SCRIPT DEFER for inline scripts
The benefit of the SCRIPT DEFER attribute for external scripts is discussed above. But DEFER is also useful for inline scripts that can be executed after the page has been parsed. Currently, IE8 supports this behavior. (test case)
@import improvements
@import is a popular alternative to the LINK tag for loading stylesheets, but it has several performance problems in IE:

  • LINK @import – If the first stylesheet is loaded using LINK and the second one uses @import, they are loaded sequentially instead of in parallel. (test case)
  • LINK blocks @import – If the first stylesheet is loaded using LINK, and the second stylesheet is loaded using LINK that contains @import, that @import stylesheet is blocked from downloading until the first stylesheet response is received. It would be better to start downloading the @import stylesheet immediately. (test case)
  • many @imports – Using @import can change the download sequence of resources. In this test case, multiple stylesheets loaded with @import are followed by a script. Even though the script is listed last in the HTML document, it gets downloaded first. If the script takes a long time to download, it can cause the stylesheet downloads to be delayed, which can cause rendering to be delayed. It would be better to follow the order specified in the HTML document. (test case)

(more information)

@font-face improvements
In IE8, if a script occurs before a style that uses @font-face, the page is blocked from rendering until the font file is done downloading. It would be better to render the rest of the page without waiting for the font file. (test case, blog post)
stylesheets & iframes
An external stylesheet that precedes an iframe blocks iframe downloads. In IE, the iframe is blocked from downloading until the stylesheet response is received. In Firefox, the iframe’s resources are blocked from downloading until the stylesheet response is received. There’s no dependency between the parent’s stylesheet and the iframe’s HTML document, so this blocking behavior should be removed. (test case)
paint events
As the number of DOM elements and the amount of CSS grow, it’s becoming more important to be able to measure the performance of painting the page. Firefox 3.5 added the MozAfterPaint event, which opened the door for add-ons like Firebug Paint Events (although early Firefox documentation noted that the “event might fire before the actual repainting happens”). Support for accurate paint events will allow developers to capture these metrics.
missing scheme, double downloads
In IE7 and IE8, if the “http:” scheme is missing from a stylesheet’s URL, the stylesheet is downloaded twice. This makes the page render more slowly. Omitting the scheme from URLs is not pervasive, but it’s becoming more widely adopted because it reduces download size and lets the URL resolve to “http://” or “https://” as appropriate. (test case)
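
To illustrate how a URL without “http:” resolves against the page that contains it, the sketch below uses the standard URL constructor. The host names and paths are placeholders.

```javascript
// A URL that omits the scheme takes it from the page that references it.
const relative = "//static.example.com/css/site.css";

const onHttp = new URL(relative, "http://www.example.com/page.html").href;
const onHttps = new URL(relative, "https://www.example.com/cart").href;

console.log(onHttp);
console.log(onHttps);
```

The same reference therefore loads over plain HTTP on an ordinary page and over HTTPS on a secure page, which is what makes the shorter form attractive.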

28 Responses to Browser Performance Wishlist

  1. progressive XHR actually works in Opera. Gmail uses this technique in Opera, as it does in Firefox and Safari/Chrome.

    See http://closure-library.googlecode.com/svn/trunk/closure/goog/net/channelrequest.js

    Opera differs from Firefox, Safari, and Chrome in that it does not fire onreadystatechange when more data comes in, but readystate hasn’t actually changed from 3. This is correct according to the spec.

  2. As a developer and a recent follower of your blog, I’ll take this opportunity to say that you are a titan. It pleases me immensely that somebody is taking the time to investigate these issues so thoroughly. Thanks for all your work.

  3. I think a manifest file alone would get much of the benefit of resource packages, particularly if you were using SPDY…

    There should be some additional stuff beyond paint events — layout events for example. In Gecko, calling getComputedStyle gets something not quite the computed style, and can easily cause a full layout to get the values. If you were single stepping in a debugger and you passed that line, you would want the debugger to warn you what the browser just did, and how expensive it was.

  4. Steve, the “stylesheets & iframes” testcase doesn’t show the behavior you describe for me in Firefox, and there is no code in Gecko that would lead to such a behavior. So I have no idea where you got that from as far as Firefox is concerned.

    Implementing script DEFER for inline scripts is unfortunately not compatible with the web unless a good chunk of weirdness from IE’s processing of document.write is also implemented, because web pages call write() in deferred inline scripts today and in non-IE browsers that would cause the document to be blown away. That’s why the current HTML5 spec calls for DEFER to be ignored on inline scripts.

    Not blocking iframe downloads on scripts is a bit touchy because you have to be very sure to not fire any events from the iframe load and not run any scripts from inside the iframe until after your parent script loads and runs. It can be done, but wouldn’t be all that easy to do.

    Your “stylesheet & inline JS” testcase link doesn’t seem to go to an actual testcase….

  5. @David: Thanks for the correction. I’ll investigate and update soon.

    @William: Thanks for the words of encouragement!

    @Steven: Yes, manifest files will be a help. If SPDY were fully adopted, the need for resource packages would diminish.

    @Boris: Is the stylesheet & iframe fix perhaps another 3.6 change? I’ll test older versions and report back. I wonder if something can be done with document.write in deferred scripts – perhaps browsers create a document fragment sibling after the script for output. This would fix other problem situations, as well. I always separate download from parse&execute. Browsers could just start downloading the iframe SRC but not parse or execute the response. The test case is column #8 – “|| CSS + Inline Script” – yes, a non-intuitive name.

  6. Forget limited SPDY, just move from TCP/IP to SCTP! It gives all the same features, but on protocol level, and already has support on many platforms.

  7. Steve,

    I think I was misreading your “stylesheet & iframe” test output (it doesn’t help that the very visible number was about 4500ms, not 8000-some). I do see the issue now. I’ll dig into it. Something pretty weird is going on, since I see the problem in release builds but not debug builds…

    I don’t think you can represent the output of document.write as a document fragment, or a set of DOM nodes, or even a stream of tokens, because document.write can write out partial tokens (part of a start tag, the start of a comment but not its end, etc). The only sane way to represent document.write output is as a stream of characters being input into the same parser as the one parsing the main page.

    It’s not as simple as “just start downloading, but don’t execute”, actually, since a naive implementation of that would run javascript: src attributes prematurely. This would _definitely_ break sites. Like I said, it can be done but has to be done very very carefully.

    The problem with the “|| CSS + Inline script” isn’t the name; I’m used to your names. It’s that I can’t figure out a way to run the test from that page. Clicking the header (itself difficult because the big orange box covers it up when you hover it) just sorts the column as far as I can tell.

  8. Interesting. It looks like the speculative preloader never sees the part of that subframe that contains the stylesheet so never preloads the stylesheet. Since the subframe load in general is blocked on the script at the top of that subframe, which is blocked on the stylesheet in the parent document, not preloading means the load will take 8 seconds.

    The HTML5 parser fixes this, but it’s still very odd that this happens…

  9. Steve, I filed https://bugzilla.mozilla.org/show_bug.cgi?id=546426 on the sheet+iframe issue. It has nothing to do with iframes per se; you can easily hit it on a single page by just having enough text between the script that causes things to block and the images/links/scripts that need prefetching as a result.

  10. Protected Mode using Integrity Levels with Vista/Win7, for all browsers.

  11. Steve, don’t you think that “More cache space” should be on the list and close to the top?

    We got used to tiny caches, but there is no reason why browser makers can’t increase those puny limits on cache sizes so users don’t abuse their internet connection every time they come back?

    I know, there is still no reasonable research on the effective time for life in cache, but I have a feeling that just increasing size of the cache will help.

  12. @Boris: Thanks for all the additional investigation.

    @Sergey: My first draft list contained “Preferred Cache” where browsers are more aggressive about caching resources from the user’s most visited pages. I don’t know how much of an issue cache size is. We definitely need more research here.

  13. Hi Steve,

    Great stuff; I was hoping that you were thinking proactively :)

    A few notes:

    * SPDY – I would characterise SPDY as an active experiment, rather than a protocol that’s been published for review. The nuance is slight, but it’s one that the SPDY folks I’ve talked to have stressed; i.e., that it’s a work-in-progress and testbed, rather than something resembling an end product (at least at this point).

    * FRAG / script attributes – I find this very interesting, in as much as there’s a lot of tension between the declarative nature of HTML (which is very friendly to performance optimisation behind the scenes) and imperative JS (which makes optimisations very tricky).

    One thing that I’d observe is that such attributes require authors to know when it’s appropriate to use them and when it’s not, which may be non-obvious if they’re using a lot of things from pre-packaged libraries. I’m not sure that there’s an alternate approach here, so it’ll probably require documentation, education and tooling (e.g., something that analyses your code and figures out whether or not you can add frag and/or attributes to various bits of your code).

    * Resource packages – I think that there are some issues with resource packages, and I’d rather see us make HTTP better (including efforts like SPDY), rather than treating it like FTP. In many cases I suspect RP is going to mean that more bits go over the wire than necessary. I need to write about this more on my blog…

    * WebSockets – has many interesting properties, but it’s actually more brittle in the face of proxies and firewalls.

    Cheers,

  14. I’d love to see browsers come bundled with popular libraries such as jQuery. This would offer the caching advantages of the Google AJAX library without the initial download request. I imagine a rel="loadjQuery-1.4.1" attribute on a script link could force the browser to load jQuery from disk rather than downloading from a server.

  15. @David Bloom: I now show that Opera supports progressive XHR.

    @Mark: Hi! Yes, I agree. SPDY is really the beginning of a longer conversation. “Work-in-progress” is a better way to characterize it right now. I was thinking FRAG would be obvious to use around ads, widgets, and analytics. Beyond that it could get trickier, but it’s also not as critical that it be used. I’ve been talking to people more about pipelining. Can we get that working?

  16. Duncan: I’m not sure that would be too helpful; as soon as jQuery, etc. upgrade, the benefits will be lost. People don’t upgrade browsers nearly that often.

    OTOH I think there are a LOT of improvements to be made in browser caches; from what I’ve seen, they’re poorly implemented, have primitive replacement algorithms, etc.

    By making browser caches more efficient and predictable, it’d be much more likely that jQuery, etc. were already in cache (provided people use centrally hosted versions of the libraries, such as those that Google makes available).

  17. Similar to border-radius, we can include gradient and multiple background support to this list. Gradients are as ubiquitous as rounded corners, so there’s definitely benefit in not downloading all those additional images (or heavier sprites).

  18. @Steve I agree, cache hides behind a “solved” label and it’s not clear if it actually is.

  19. Unassertive cache policies are definitely low-hanging fruit here. The web would be much faster if browsers (especially IE) just obeyed the caching headers we give them. Larger caches would also help. In my research, browsers are requesting resources over 100 times more than they should.

  20. remote JS debugging: use DBGP, an open debugger protocol for dynamic languages, developed by myself (Shane Caraveo, ActiveState) and Derick Rethans (PHPs XDebug), and used by tons of IDEs and editors (Komodo, Eclipse, even vi). Komodo has had remote debugging via Firefox for a long time.

  21. @Mark, Sergey, Brian: I’ve talked with many people who think we need better browser caching. I totally agree that we can measure how frequently users don’t have resources cached (like the experiment I designed at Yahoo). But it’s harder to determine why they don’t have resources cached as much as we would expect. Are caches too small and resources are being pushed out? Are users clearing their caches? Is anti-virus software clearing their caches? None of these seem like a plausible explanation. Until we do more studies, I’m not convinced increasing cache size will make much difference. Anyone have data on this?

  22. One of the things SPDY has going for it is that it is not HTTP, and therefore does not have the weight of dealing with old implementation issues of various parties. Just look at why Mozilla won’t ship with pipelining enabled and you realize that starting with a clean slate has lots of advantages.

    And dropping tcp would have some advantages too, sctp might do as a nice replacement, and it warms up better.

  23. Opera’s Dragonfly tool already offers remote debugging. It is also on the road map for Firebug 1.7.

  24. SCTP is nice but impractical. It does not work through existing NATs or firewalls, which makes it virtually impossible to deploy.

  25. Remote debugging must be enabled in the core of the browser. Also, I couldn’t see any performance criteria for media content.

  26. Can we add support for the other HTTP methods in here? I don’t know if this is a browser limitation or an HTML limitation.

  27. I have over 100 images on my category pages, so something like base href would be nice for example

    instead of 100+ instances per page of

    I would have something like the following in the header

    and in the body

    That should reduce page bloat.

  28. argh where’s my markup ?

    The above post looks daft, okay I’ll try to explain without the markup

    instead of 100+ instances of

    img src= images.mysite.com/images/ the-world.jpg

    I’d have in the header

    imgbase ref= images.mysite.com/images/

    and in the body

    img src= the-world.jpg