I <3 image bytes

April 26, 2013 10:08 am | 17 Comments

Much of my work on web performance has focused on JavaScript and CSS, starting with the early rules Move Scripts to the Bottom and Put Stylesheets at the Top from back in 2007(!). To emphasize these best practices I used to say, “JS and CSS are the most important bytes in the page”.

A few months ago I realized that wasn’t true. Images are the most important bytes in the page.

My focus on JS and CSS was largely motivated by the desire to get the images downloaded as soon as possible. Users see images. They don’t see JS and CSS. It is true that JS and CSS affect what is seen in the page, and even whether and how images are displayed (e.g., JS photo carousels, and CSS background images and media queries). But my realization was that JS and CSS are the means by which we get to these images. During page load we want to get the JS and CSS out of the way as quickly as possible so that the images (and text) can be shown.

My main motivation for optimizing JS and CSS is to get rendering to happen as quickly as possible.

Rendering starts very late

With this focus on rendering in mind, I went to the HTTP Archive to see how quickly we’re getting pages to render. The HTTP Archive runs on top of WebPagetest which reports the following time measurements:

  • time-to-first-byte (TTFB) – When the first packet of the HTML document arrives.
  • start render – When the page starts rendering.
  • onload – When window.onload fires.

I extracted the 50th and 90th percentile values for these measurements across the world’s top 300K URLs. As shown, nothing is rendered for the first third of page load time!

Table 1. Time milestones during page load

                  TTFB      start render   onload
50th percentile   610 ms    2227 ms        6229 ms
90th percentile   1780 ms   5112 ms        15969 ms

Preloading

The fact that rendering doesn’t start until the page is 1/3 into the overall page load time is eye-opening. Looking at both the 50th and 90th percentile stats from the HTTP Archive, rendering starts ~32-36% into the page load time. It takes ~10% of the overall page load time to get the first byte. Thus, for ~22-26% of the page load time the browser has bytes to process but nothing is drawn on the screen. During this time the browser is typically downloading and parsing scripts and stylesheets – both of which block rendering on the page.
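The percentages above can be rechecked directly from Table 1. The following is only a small sketch of that arithmetic; the type and function names are made up for illustration, and the only inputs are the six numbers from the table.

```typescript
// Milestones from Table 1 (HTTP Archive), in milliseconds.
interface Milestones {
  label: string;
  ttfb: number;
  startRender: number;
  onload: number;
}

const table1: Milestones[] = [
  { label: "50th percentile", ttfb: 610, startRender: 2227, onload: 6229 },
  { label: "90th percentile", ttfb: 1780, startRender: 5112, onload: 15969 },
];

interface Percentages {
  ttfbPct: number;        // share of load time spent waiting for the first byte
  startRenderPct: number; // point in the load at which rendering starts
  blankPct: number;       // bytes have arrived, but nothing is drawn yet
}

// Express each milestone as a percentage of the total load time (onload).
function asPercentOfOnload(m: Milestones): Percentages {
  const pct = (ms: number): number => (100 * ms) / m.onload;
  return {
    ttfbPct: pct(m.ttfb),
    startRenderPct: pct(m.startRender),
    blankPct: pct(m.startRender - m.ttfb),
  };
}

for (const m of table1) {
  const r = asPercentOfOnload(m);
  console.log(
    `${m.label}: TTFB ${r.ttfbPct.toFixed(1)}%, ` +
      `start render ${r.startRenderPct.toFixed(1)}%, ` +
      `blank ${r.blankPct.toFixed(1)}%`
  );
}
```

With the unrounded table values the blank window comes out at roughly 26% of load time at the 50th percentile and roughly 21% at the 90th, close to the ~22-26% quoted above, which rounds TTFB to ~10%.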

It used to be that the browser was largely idle during this early loading phase (after TTFB and before start render). That’s because when an older browser started downloading a script, all other downloads were blocked. This is still visible in IE 6 and 7. Browser vendors realized that while it’s true that constructing the DOM has to wait for a script to download and execute, there’s no reason other resources deeper in the page couldn’t be fetched in parallel. Starting with IE 8 in 2009, browsers started looking past the currently downloading script for other resources (i.e., SCRIPT, IMG, LINK, and IFRAME tags) and preloading those requests in parallel. One study showed preloading makes pages load ~20% faster. Today, all major browsers support preloading. In these Browserscope results I show the earliest version of each major browser where preloading was first supported.

(As an aside, I think preloading is the single biggest performance improvement browsers have ever made. Imagine today, with the abundance of scripts on web pages, what performance would be like if each script was downloaded sequentially and blocked all other downloads.)

Preloading and responsive images

This ties back to this tweet from Jason Grigsby:

I’ll be honest. I’m tired of pushing for resp images and increasingly inclined to encourage devs to use JS to simply break pre-loaders.

The “resp images” Jason refers to are techniques by which image requests are generated by JavaScript. This is generally used to adapt the size of images for different screen sizes. One example is Picturefill. When you combine “pre-loaders” and “resp images” an issue arises – the preloader looks ahead for IMG tags and fetches their SRC, but responsive image techniques typically don’t have a SRC, or have a stub image such as a 1×1 transparent pixel. This defeats the benefits of preloading for images. So there’s a tradeoff:

  • Don’t use responsive images so that the preloader can start downloading images sooner, but the images might be larger than needed for the current device and thus take longer to download (and cost more for limited cellular data plans).
  • Use responsive images, which forgoes preloading: the images are loaded later, after the required JS is downloaded and executed and the IMG DOM elements have been created.
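To make the second option concrete, here is a minimal sketch of the kind of decision a JavaScript responsive image technique makes. It is not Picturefill’s actual API; the `ResponsiveImage` shape, the `pickSource` function, the file names, and the breakpoints are all hypothetical. The point is that the real URL is only known once this code runs, so a preloader scanning the markup sees nothing but the stub.

```typescript
// Hypothetical description of one responsive image.
interface Candidate {
  minWidth: number; // smallest viewport width (CSS px) this file is meant for
  url: string;
}

interface ResponsiveImage {
  stubSrc: string;          // what the IMG tag carries in markup (e.g. a 1x1 pixel)
  candidates: Candidate[];  // real files, only visible to JavaScript
}

// Pick the candidate with the largest minWidth that still fits the viewport.
// Falls back to the stub if no candidate fits.
function pickSource(img: ResponsiveImage, viewportWidth: number): string {
  let best: Candidate | undefined;
  for (const c of img.candidates) {
    if (c.minWidth <= viewportWidth && (best === undefined || c.minWidth > best.minWidth)) {
      best = c;
    }
  }
  return best !== undefined ? best.url : img.stubSrc;
}

// Example data (made-up file names and breakpoints).
const hero: ResponsiveImage = {
  stubSrc: "pixel.gif",
  candidates: [
    { minWidth: 0, url: "hero-small.jpg" },
    { minWidth: 600, url: "hero-medium.jpg" },
    { minWidth: 1000, url: "hero-large.jpg" },
  ],
};

console.log(pickSource(hero, 320));  // small phone
console.log(pickSource(hero, 1280)); // desktop
```

In a browser the result would be assigned to the image’s `src` using something like `window.innerWidth` as the viewport width, and that assignment cannot happen until the script has downloaded and executed and the IMG element exists.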

As Jason says in a follow-up tweet:

The thing that drives me nuts is that almost none of it has been tested. Lots of gospel, not a lot of data.

I don’t have any data comparing the two tradeoffs, but the HTTP Archive data showing that rendering doesn’t start until 1/3 into page load is telling. It’s likely that rendering is being blocked by scripts, which means the IMG DOM elements haven’t been created yet. So at some point after the 1/3 mark the IMG tags are parsed and at some point after that the responsive image JS executes and starts downloading the necessary images.

In my opinion, this is too late in the page load process to initiate the image requests, and will likely cause the web page to render later than it would if the preloader was used to download images. Again, I don’t have data comparing the two techniques. Also, I’m not sure how the preloader works with the responsive image techniques done via markup. (Jason has a blog post that touches on that, The real conflict behind <picture> and @srcset.)

Ideally we’d have a responsive image solution in markup that would work with preloaders. Until then, I’m nervous about recommending to the dev community at large to move toward responsive images at the expense of defeating preloading. I expect browsers will add more benefits to preloading, and I’d like websites to be able to take advantage of those benefits both now and in the future.

17 Responses to I <3 image bytes

  1. The tweet about “using JS to break preloaders” reminds me of mobify.js (http://www.mobify.com/mobifyjs/). Unfortunately it requires a script in the , but can rewrite images and other resources before the browser loads them. Haven’t used it, but learned about it recently and the approach looks promising.

  2. ^ should read: “Unfortunately it requires a script in the *head tag*, …”

  3. One technique that I really like for resp. images that I *think* works well with the preloader is the technique Guy Podjarny’s talked about in his post: Introducing LQIP – Low Quality Image Placeholders (http://www.guypo.com/feo/introducing-lqip-low-quality-image-placeholders/)

    No doubt, it’s not a perfect solution, but it’s one that I’ve found useful. Ideally browsers would implement the proposed picture element and build their APIs and preload scanner to work w/ this new element.

  4. It’s not a binary decision either. You can choose to load the most important images via the pre-parser and defer other less important imagery to load via JavaScript.

    I think this is best expressed by Paul Lloyd’s presentation at Responsiveconf a few months ago:

    https://speakerdeck.com/paulrobertlloyd/the-edge-of-the-web?slide=22

    The BBC mobile site does a good job of this:

    https://speakerdeck.com/paulrobertlloyd/the-edge-of-the-web

  5. Thanks Steve for taking the time to test and write this up. My tweet was an obvious moment of frustration. I haven’t been advocating breaking the pre-parser and in several conversations have been cautioning against it based on the feedback that you’ve given in the past.

    That said, I do feel there is a big disconnect between what browser implementers are interested in pursuing (e.g., client hints using device widths) and what people trying to make responsive design see as the true problem we’re trying to solve (e.g., element queries).

    I rarely get the sense from implementers and people working on standards that they really “get” what people are trying to accomplish with images in responsive designs. I will acknowledge that the Internet is a poor medium to convey the types of feedback that we need to know that someone has truly understood what we’re saying. So it is entirely possible that they do get it, but disagree or feel it isn’t possible.

    So there is still a part of me that feels like we should just build solutions that solve our problems, take the performance hit, and then let the adoption of those approaches prove to implementors that some rethinking needs to be done.

    And I believe this is already happening. The BBC and Guardian are doing it. Akamai sells an FEO approach that is based on wrestling control of image loading away from the pre-parser by replacing all imgs with a data uri transparent gif and then inserting the proper image source via JavaScript. Given Akamai’s marketshare and the number of large companies that are struggling with responsive images, that solution alone could break pre-parsing on a large number of sites.

    Which leads me to wonder, is it worth our while to continue to try to convince skeptical implementors or should we simply promote more solutions like what BBC, Guardian and Akamai are doing? Which is a quicker path to getting a long term solution to the problem?

    Maybe some short-term breaking of things is a necessary step to get things moving forward. That’s the conclusion I’ve come to except in my moments of despair—the tweet you saw was one of those moments—but it is crossing my mind with increasing frequency.

  6. Argh, I kept rewriting that last sentence trying to get the tone right and still managed to end up with a broken and incoherent sentence.

    What I was trying to say is that I haven’t yet concluded that we need to take this approach, but I find myself pondering which approach would get quicker results with increasing frequency.

  7. Brett: Yes, Guypo’s (Akamai’s) LQIP would work with today’s preloaders.

    Jason: It was a good, necessary tweet. Everyone needs to get more invested. It caught my eye, and I’m sure it caught the eye of browser developers. I’m nervous about recommending techniques that make performance worse – but I’m biased. Akamai’s FEO approach (LQIP) works with plain IMG tags – so it takes advantage of the preloader. (Or is there another technique you’re referring to?) Before we advocate breaking things I’d like to have a Responsive Images Summit with web devs and browser devs and see if we can’t find common ground. You and I had a long email thread on the alternatives, and I still wasn’t any closer to seeing a clear solution.

  8. The Akamai technique I’m talking about isn’t the LQIP that Guy has blogged about. As far as I can tell, the technique I saw is being used for both their responsive images and their on-demand image loading solutions that are options within the FEO package. I’m unclear where LQIP fits in the offering.

    For responsive images, Akamai reads the markup of the page and replaces any links to images with data uris. You can see this in action on Guy’s site:
    http://www.guypo.com/uncategorized/real-world-rwd-performance-take-2/

    A typical img tag in his source looks like this:

    <img class="alignnone wp-image-3380" alt="2013-page-size-per-resolution" src="" blzsrc="http://www.guypo.com/wp-content/uploads/2013/03/2013-page-size-per-resolution.png" blzjit="1" width="511" height="336">

    But when the JS executes, it replaces the placeholder image with a link to the correct image. The DOM of the rendered page shows:

    <img class="alignnone wp-image-3381" alt="2013-page-size-small-vs-big" src="http://www.guypo.com/wp-content/uploads/2013/03/2013-page-size-small-vs-big1.png" blzsrc="http://www.guypo.com/wp-content/uploads/2013/03/2013-page-size-small-vs-big1.png" blzjit="1" width="512" height="447">

    Unfortunately, most of the examples I have of Akamai’s FEO implementation are not responsive designs so I can’t point to this technique being used for responsive images, but as far as I understand, this is the approach used for any clients that sign up for the responsive images portion of the FEO solution.

  9. First off, to clarify re Akamai’s FEO: We have quite a few image-related optimizations, but two very commonly used ones are Responsive Images and Images On-Demand (only load images in the visible area, load others as they scroll into view). Both of those indeed use a script loader for images, breaking the pre-parser. It’s a very smart script loader, which kicks in early and squeezes the most out of the browser, but there’s no ignoring the fact it breaks the pre-parser.

    It’s worth noting that we do compensate for that a bit with other optimizations like DNS prefetching (something the pre-parser may have done) and by making scripts and CSS async (if the scripts on the page don’t block the parser, the pre-parser is less critical), but even without those we see huge value in these optimizations.

    While I didn’t collect broad data across many websites, I can say my experience shows the vast majority of pages – especially ones where at least 50% of the page is “below the fold” – benefit more from lazy loading images than from the pre-parser. The extra cost of downloading many images that aren’t actually needed (at least at first), and of those contending for bandwidth with what DOES need to get rendered, far outweighs the value of the pre-parser from what I can tell.

    In an old presentation of mine I show examples on mobile websites (Walmart & MSNBC’s sites at the time), where load times were drastically improved thanks to that optimization alone. That type of impact is quite common. Here’s the deck (slides 19-21): http://www.slideshare.net/guypod/unravelling-mobile-web-performance/19/

    One disclaimer on this topic: Today, the pre-parser offers relatively little value to images, because CSS and JS block the download of those images. However, in a SPDY or HTTP 2 world, it’s possible this equation would change. Hopefully by then we’ll have browser support for capabilities like the picture element or img defer (https://www.w3.org/Bugs/Public/show_bug.cgi?id=17842), and won’t need to face the dilemma.

  10. Guypo: Thanks for commenting! Is “Responsive Images” different from LQIP? Do you recommend people do both? If not, which one do you recommend? Seems like the best solution would be to do the first N images (assumed to be above-the-fold) as IMG to take advantage of the preloader, and do the rest lazy loading. Have you tried that? Doing them all using a JS lazy-load technique would, again, seem to render later than using the preloader (esp. since lazy-loading doesn’t reduce the size of the image). Did you mean to say CSS and JS *DON’T* block the download of images? (You said they *DO* block which isn’t true AFAIK.) And the reason JS and CSS don’t block image downloads is because of the preloader – so images *DO* benefit from pre-parsing.

  11. Responsive Images means downloading a smaller image to a smaller screen, while LQIP means downloading a small image first and a large image after. The two have overlap in value, but they’re not the same.

    If you use LQIP only, then on a small screen you’re still wasting bytes when you download the full quality image – but at least you’re doing it after onload. If you use responsive images only, then on a large screen you have a slower user experience waiting for the full res image to download.

    What we do is combine the two – download a low quality image first, and then download the screen-size-dependent full quality image after.

    Regarding making the first N images regular img tags, that only works if you indeed know the images would show up (and in that case, it’ll indeed make them faster). However, that’s often a difficult task, as it depends on the screen/window size. It’s even harder to estimate that in responsive websites.

    Lastly, CSS & JS *DO* block image downloads under certain conditions – which are almost always met. Pretty much every website you’ll browse has a “stair” at the front, during which JS & CSS files are downloaded (not blocking each other), and only then images are downloaded. AIUI, this is intentional by the browsers, and is their way of prioritizing JS & CSS by having them not contend for bandwidth with images.

    A few quick examples:
    CNN: http://www.webpagetest.org/result/130427_NA_91c6e9c7fb110ea81be17ead4e4f8e10/1/details/
    Fidelity: http://www.webpagetest.org/result/130427_TV_cda3854d1a54c925f0e1309fc19a8231/1/details/
    Target: http://www.webpagetest.org/result/130427_8S_0f557a96d2628f765a735bb66e0a9fb5/1/details/

  12. Guypo: Very nice combining LQIP with responsive images for the second higher resolution image. This achieves both goals – the preloader kicks off the low res image request early so the user sees something quickly, and then JS kicks off the high res image request with the appropriate size to make it faster and reduce data costs.

    I wouldn’t dismiss the “first N” idea. You don’t have to know which images would show up. If you underestimate then the other above-the-fold images will still load, just a little later. If you overestimate then you downloaded some extra images, but that’s okay, too. So you could combine first N, LQIP, and responsive images.

    When it comes to blocking, none of the major browsers block images when JS and CSS are downloading.

    Chrome assigns priorities to resources, and gives JS and CSS a higher priority than images. This prioritization can affect the order of requests and can result in the delay of image requests in some situations (to avoid bw contention as you say), but in other situations JS, CSS, and images download in parallel without blocking. For example, if the JS and CSS are in the BODY instead of HEAD then they all download together. This explains the CNN example. It was done using Chrome so there’s a lot of JS & CSS at the top (because they have a higher priority), but once the BODY is created then images download in parallel with JS & CSS, as seen in requests 124-126, 137-140, etc.

    The Fidelity and Target tests were done with IE10, which doesn’t do prioritized downloads. In these tests we see images downloading in parallel with JS & CSS even at the very beginning. There’s more JS & CSS at the top simply because those occur first in the HEAD (where there aren’t any images), but images aren’t blocked. See request 5 in Fidelity and request 13 in Target, for example.

    The reason blocking came up was you were saying the preloader doesn’t benefit images very much. There’s certainly huge benefit when there’s a script in the BODY – the preloader will see images deeper in the page and launch those requests in parallel. There’s even benefit with lots of scripts in the HEAD because the preloader can look ahead for IMG tags and have those queued up for fetching as soon as the prioritization logic allows. Without this the image requests would go out in a more staggered fashion, as the main parser would only encounter IMG tags one by one while creating every DOM element in the page.

    Lots of detailed (esoteric) info there – what’s the takeaway? For me it’s: Akamai’s combined solution looks good and plays nicely with the preloader. The preloader has benefits for images in all browsers.
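As a rough illustration of the “first N + LQIP + responsive images” combination discussed in comments 10-12, the sketch below separates the two decisions involved. Everything here is hypothetical (the `ImageSpec` shape, the function names, the 600px breakpoint, the file names); it is not Akamai’s implementation. The first function decides what the markup carries, which is what the preloader can see; the second decides what JS requests later.

```typescript
// Hypothetical description of one image on the page.
interface ImageSpec {
  lowQualityUrl: string; // small, heavily compressed placeholder (LQIP)
  smallUrl: string;      // full quality, sized for small screens
  largeUrl: string;      // full quality, sized for large screens
}

// Markup-time decision: the first N images get a real src (the low quality
// placeholder) so the preloader can fetch them early. The rest get no src
// and are left to lazy loading.
function markupSrc(spec: ImageSpec, index: number, firstN: number): string | null {
  return index < firstN ? spec.lowQualityUrl : null;
}

// Script-time decision: once JS runs, request the full quality file that
// matches the screen. The breakpoint value is an arbitrary assumption.
function upgradeUrl(spec: ImageSpec, viewportWidth: number, breakpoint: number = 600): string {
  return viewportWidth < breakpoint ? spec.smallUrl : spec.largeUrl;
}

// Example page with three images (made-up file names).
const pageImages: ImageSpec[] = [
  { lowQualityUrl: "a-lq.jpg", smallUrl: "a-s.jpg", largeUrl: "a-l.jpg" },
  { lowQualityUrl: "b-lq.jpg", smallUrl: "b-s.jpg", largeUrl: "b-l.jpg" },
  { lowQualityUrl: "c-lq.jpg", smallUrl: "c-s.jpg", largeUrl: "c-l.jpg" },
];

pageImages.forEach((spec, i) => {
  console.log(i, markupSrc(spec, i, 2), upgradeUrl(spec, 320));
});
```

Underestimating N in this sketch only delays some above-the-fold images; overestimating it costs a few extra low quality downloads, which matches the reasoning in comment 12.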

  13. I have published my JS + PHP implementation called Imadaem on Github. The preloaders work as expected. I use the prerender directive and the low quality placeholders load in the background. I have been using Imadaem for 2 months and have tested it on every modern device from the huge Samsung TV to the handy iPhone.

    Feel free to spread my solution.

  14. @Steve, your conclusion about Akamai’s combined solution assumes that customers will choose to combine LQIP with the responsive images or conditional images tools that Akamai offers. Some may, some may not.

    For those who don’t, the solution is still one that replaces img src with transparent gifs in data uris.

    I look at that and the amount of momentum that seems to be behind finding a solution for img defer or async, https://www.w3.org/Bugs/Public/show_bug.cgi?id=17842 , as supporting evidence in my argument that at minimum, there is a keen interest in placing more control over how the pre-parser handles images in the hands of developers.

    After talking to Guy during dinner last week, I realized that I’m not articulating my perspective on this issue very well. If I can’t explain it while in person, I will have a hard time doing it in a comment. I think I’m going to take a stab at writing a follow up post later this week that tries to articulate my thinking.

  15. Thanks for writing about this important topic, this is a great discussion. We have taken a slightly different approach to solving the responsive design + lazy-loading images “below the fold” in MSN.

    MSN.com for Windows 8 (http://t.msn.com on Win8/IE10; users on other browser/OS combinations will be redirected to http://www.msn.com) is a responsive site that does dynamic detection of what’s above and below the fold instead of “first N”. The site supports multiple view modes in Windows 8 including “snap” which is 320px wide. Our image tags are constructed as follows:

    The data-src attribute contains an object that maps each view mode to an appropriate image URL. It doesn’t pre-fetch the smallest image like in LQIP since image aspect ratios are not fixed across views. The _llic function, which is defined inline in the head, selects the appropriate URL based on the window width. It also decides whether the image is above or below the fold using the image’s position (as returned by getBoundingClientRect) and window size. If the image is above the fold, the src update is scheduled immediately, which triggers the network request to fetch the actual image. Requests for all other img tags are scheduled after OnLoad or a failsafe timeout, whichever comes first.

    This approach helps ensure above-the-fold images have the most bandwidth to themselves and don’t need to compete with the ones below-the-fold. Scheduling downloads via individual events also helps ensure a slow inline script in the body doesn’t block everything from being fetched, only potentially the images after the tag.

    An earlier approach was to maintain a selector list for divs known to be above-the-fold (assuming the resolution was 1080p). This would have become difficult to maintain over time and had some unwanted side-effects. Once the dynamic above-the-fold detection was implemented the image fetching function was first implemented inline at the bottom of the page; this was vulnerable to slow inline scripts so we switched to individual events per image.

    We think this technique provides a good balance between performance and responsive design.

    Example waterfall: http://www.webpagetest.org/result/130430_ZD_1c749dd064262d0b78ff2ccdbe9647c7/1/details/

    Snap view details: http://msdn.microsoft.com/en-us/library/windows/apps/hh465371.aspx#snap_ux
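The selection logic described in comment 15 can be approximated as below. This is a guess at the shape of the decision only, not MSN’s `_llic` code: the view mode names other than “snap”, the `planImageLoad` function, the fold test, and the file names are all assumptions made for illustration.

```typescript
// Hypothetical view modes; the comment only names "snap" (320px wide).
type ViewMode = "snap" | "full";

interface Rect { top: number; left: number }      // subset of getBoundingClientRect()
interface WindowSize { width: number; height: number }

interface LoadPlan {
  url: string;
  when: "immediate" | "afterOnload"; // afterOnload also stands in for the failsafe timeout
}

// Assumed rule: at 320px or narrower the page is in snap view.
function viewModeFor(windowWidth: number): ViewMode {
  return windowWidth <= 320 ? "snap" : "full";
}

// Assumed rule: an image is above the fold if its top-left corner falls
// inside the window.
function isAboveTheFold(rect: Rect, win: WindowSize): boolean {
  return rect.top < win.height && rect.left < win.width;
}

// Choose the URL for the current view mode and decide when to request it.
function planImageLoad(sources: Record<ViewMode, string>, rect: Rect, win: WindowSize): LoadPlan {
  return {
    url: sources[viewModeFor(win.width)],
    when: isAboveTheFold(rect, win) ? "immediate" : "afterOnload",
  };
}

// Example: the same image in snap view near the top, and far down a wide page.
const sources: Record<ViewMode, string> = { snap: "story-320.jpg", full: "story-1366.jpg" };
console.log(planImageLoad(sources, { top: 100, left: 0 }, { width: 320, height: 768 }));
console.log(planImageLoad(sources, { top: 2000, left: 0 }, { width: 1366, height: 768 }));
```

In a page, the “immediate” case would set the image’s `src` right away, while the “afterOnload” case would wait for the load event or a timeout, as the comment describes.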

  16. “A smaller filesize AND a better quality on both screen types! This is impossible.”
    http://blog.netvlies.nl/design-interactie/retina-revolution/
    TL;DR: use an image size 2x as large, & use very aggressive 75% compression. You’ll need only 1 image for all pixel densities while looking great on all & use small images.

    Hat tip: Paul Verbeek comment on HTML5Rocks.

  17. I load the standard 72 dpi version of the images, then using javascript I load bigger versions accordingly (responsive sizes, higher resolutions):
    - one solution for new sites, legacy site to be upgraded and the different use cases
    - easy to implement
    - an additional check on the server will not interfere and can be implemented separately

    Works best with a dynamic image creation solution like SLIR, https://github.com/lencioni/SLIR