HTTP Archive: adding flush
In my previous post, HTTP Archive: new schema & dumps, I described my work to make the database faster, smaller on disk, easier to download, and richer in stats. Once these updates were finished I was excited to start going through the code and make pages faster using the new schema changes. Although time consuming, it’s been fun to change some queries and see the site get much faster.
Along the way I bumped into the page for viewing an individual website’s results, for example Whole Foods. Despite my schema changes, it has a slow (~10 seconds) query in the middle of the page. I’ve created a bug to figure out how to improve this (I think I need a new index), but for the short term I decided to just flush the document before the slow query. This page is long, so the slow part is well below-the-fold. By adding flush I would be able to get the above-the-fold content to render more quickly.
I wrote a blog post in 2009 describing Flushing the Document Early. It describes flushing thusly:
Flushing is when the server sends the initial part of the HTML document to the client before the entire response is ready. All major browsers start parsing the partial response. When done correctly, flushing results in a page that loads and feels faster. The key is choosing the right point at which to flush the partial HTML document response. The flush should occur before the expensive parts of the back end work, such as database queries and web service calls. But the flush should occur after the initial response has enough content to keep the browser busy. The part of the HTML document that is flushed should contain some resources as well as some visible content. If resources (e.g., stylesheets, external scripts, and images) are included, the browser gets an early start on its download work. If some visible content is included, the user receives feedback sooner that the page is loading.
My first step was to add a call to PHP’s flush function right before trends.inc which contains the slow query:
<?php
flush();
require_once('trends.inc'); // contains the slow query
?>
Nothing changed. The page still took ~10 seconds to render. In that 2009 blog post I mentioned it’s hard to get the details straight. Fortunately I dug into those details in the corresponding chapter from Even Faster Web Sites. I reviewed the chapter and read about how PHP uses output buffering, requiring some additional PHP flush functions. Specifically, all existing output buffers have to be cleared with a call to ob_end_flush, a new output buffer is activated by ob_start, and this new output buffer has to be cleared using ob_flush before calling flush:
<?php
// Flush any currently open buffers.
while (ob_get_level() > 0) {
    ob_end_flush();
}
ob_start();
?>
[a bunch of HTML...]
<?php
ob_flush();
flush();
require_once('trends.inc'); // contains the slow query
?>
After following the advice for managing PHP’s output buffers, flushing still didn’t work. Reading further in the chapter I saw that Apache has a buffer that it uses when gzipping. If the size of the output is less than 8K at the time flush is called, Apache won’t flush the output because it wants at least 8K before it gzips. In my case I had only ~6K of output before the slow query, so I was falling short of the 8K threshold. An easy workaround is to add padding to the HTML document to exceed the threshold:
<?php
// Flush any currently open buffers.
while (ob_get_level() > 0) {
    ob_end_flush();
}
ob_start();
?>
[a bunch of HTML...]
<!-- 0001020304050607080[2K worth of padding]... -->
<?php
ob_flush();
flush();
require_once('trends.inc'); // contains the slow query
?>
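To see how far short of the threshold a page falls, the size of the current output buffer can be checked right before flushing. This is a minimal sketch, not code from the site: the helper name `bytes_short_of_threshold` and the 8192-byte default are assumptions for illustration, and it only counts bytes in the current PHP buffer, so it assumes nothing has been sent to Apache yet.

```php
<?php
// Hypothetical helper (not from the original code): reports how many more
// bytes the current output buffer needs before it reaches $threshold.
function bytes_short_of_threshold(int $threshold = 8192): int
{
    $buffered = ob_get_length();   // false when no output buffer is active
    if ($buffered === false) {
        $buffered = 0;
    }
    return max(0, $threshold - $buffered);
}
```

Calling it just before `ob_flush(); flush();` and writing the result with `error_log` would show whether padding is needed at all, and roughly how much.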
After adding the padding, flushing worked! It felt much faster. As expected, the flush occurred at a point well below-the-fold, so the page looks done unless the user quickly scrolls down. The downside of adding padding to the page is a larger HTML document that takes longer to download, is larger to store, etc. Instead, we used Apache’s DeflateBufferSize directive to lower the gzip threshold to 4K. With this change the page renders faster without the added page weight.
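For reference, the change might look like the fragment below. This is a sketch rather than the site’s actual configuration: DeflateBufferSize is a real mod_deflate directive (its documented default is 8096 bytes), but the value 4096 is my reading of “4K”, and the `<IfModule>` wrapper and where the fragment lives (server config or virtual host) are assumptions.

```apache
# Assumed fragment, not the site's real config: make mod_deflate compress
# and emit output in 4 KB fragments instead of the default 8096 bytes.
<IfModule mod_deflate.c>
    DeflateBufferSize 4096
</IfModule>
```

With a smaller buffer, the ~6K of HTML produced before the slow query is enough for Apache to compress and send a first fragment when flush is called.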
The flush change is now in production. You can see the difference using these URLs:
These URLs open a random website each time to avoid any cached MySQL results. Without flushing, the page doesn’t change for ~10 seconds. With flushing, the above-the-fold content changes after ~3 seconds, and the below-the-fold content arrives ~7 seconds later.
I still don’t see flushing used on many websites. It can be confusing and even frustrating to set up. My responses already had chunked encoding, so I didn’t have to jump through that hoop. But as you can see, the faster rendering makes a significant difference. If you’re not flushing your document early, I recommend you give it a try.
Manuel Strehl | 31-Jan-13 at 2:30 am
I’d love to use flushing, but the problem is that you are usually trapped in the view if you use an MVC approach. E.g., we develop a CMS and use Smarty templates. There is no way to tell Smarty mid-template that we want to flush the thing.
The same goes without templates. For http://codepoints.net I collect the data first and then hand it over to the view (PHP file). I already need much of the data, when I put together the head’s meta elements. Then in the body the few remaining queries usually don’t make up for so much of the speed loss.
Charlie | 31-Jan-13 at 3:06 am
I agree with Manuel on this – you’re embedding webserver behaviour into the application and this has limited applicability. For example, what happens if you’re using Nginx instead of Apache?
Another approach is to load the various page elements asynchronously as you do with trends.
Steve Souders | 31-Jan-13 at 9:21 pm
Manuel: There is definitely a tradeoff between frameworks and homegrown in terms of flexibility.
Charlie: I’m not a purist when it comes to coding. I think web developers should use what they can to create a faster user experience. A rigid separation of knowledge of the transport mechanism and application logic is unrealistic on the Web, and in fact we see that line blurred frequently: the “async” tag for scripts, “prerender” and “prefetch” with links, etc. There’s also the consideration of development time. It would have taken 5x longer and 10x more code to break out the trends as an XHR. It was breathtaking to get such an improved experience in so little time.
Trey Philips | 31-Jan-13 at 10:20 pm
Other issues with flushing include error handling and redirects. What happens if you’ve already flushed a header with a 200, and now you realize you need to send an HTTP status code other than 200? Maybe you want to redirect somewhere with a 3xx. Maybe you now realize that object can’t be found and want to 404. Maybe you need to handle an exception or something catastrophic happens and you want to send a 5xx? The only ways that come to mind to continue are by sending a JS redirect or displaying an HTML error. In both cases, you’re sending inaccurate HTTP status codes which has side effects, like spiders (search engines, FB shares, etc.) crawling bad data.
Steffen | 01-Feb-13 at 2:50 pm
Trey: According to Wikipedia, “Chunked encoding allows the sender to send additional header fields after the message body.” (http://en.wikipedia.org/wiki/Chunked_transfer_encoding)
However, I guess you cannot change the HTTP status code this way because it is not an ordinary header. :-/
Anthony Hatzopoulos | 04-Feb-13 at 10:39 am
Great idea Steve, flushing early to make the user happier. I’ll try this.
Also worth mentioning: the PHP function `ob_start` has `output_callback` and `chunk_size` parameters, which make it easy to customize output buffering before any template engine does its work. So Steve’s 8K trick could be applied in the display logic, for example with `ob_start('customCallback', 8192)`, so the site flushes every 8K chunk of bytes. Or even `ob_start(null, 8192)` with no callback. http://www.php.net/manual/en/function.ob-start.php#53117
I think most web development languages will provide a method of hooking into the output buffer callback in some form or another.
Trey: A possible solution to the header issue would be to do the app’s business logic first and set the headers in that phase. Then, in the display logic phase, render using an output buffering callback that knows to split the output and send it as soon as possible, so the site feels more instantaneous to the user.
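To make the `chunk_size` behavior mentioned above concrete, here is a small sketch that can be run from the command line. The helper `collect_chunks` is hypothetical and exists only for this demonstration: it swallows the output and records each chunk PHP hands to the callback, whereas a real page would return the buffer from the callback (or pass `null` as the callback) so the chunk goes on to the web server.

```php
<?php
// Hypothetical demo helper: runs $render inside an output buffer created
// with a chunk size and records every chunk PHP passes to the callback.
function collect_chunks(callable $render, int $chunkSize): array
{
    $chunks = [];
    ob_start(function (string $buffer) use (&$chunks): string {
        if ($buffer !== '') {
            $chunks[] = $buffer;
        }
        return '';   // swallow the output; a real page would return $buffer
    }, $chunkSize);
    $render();
    ob_end_flush();  // invokes the callback one last time with the remainder
    return $chunks;
}

// Stand-in for a template that prints 15,000 bytes in 3,000-byte pieces.
$chunks = collect_chunks(function (): void {
    for ($i = 0; $i < 5; $i++) {
        echo str_repeat('x', 3000);
    }
}, 8192);

foreach ($chunks as $i => $chunk) {
    echo "chunk $i: ", strlen($chunk), " bytes\n";
}
```

PHP flushes the buffer after a write that brings it to or past the chunk size, so chunks are not cut at exact 8192-byte boundaries. Whether each chunk then reaches the browser immediately still depends on the web server’s own buffering, as described in the post.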
Jorge Nerin | 07-Feb-13 at 8:49 am
Well, been there, done that. But instead of adding X KB of random data, output X KB of highly compressible data, like a stream of a single repeated character. This way you will fill Apache’s input buffer, but the compressed output will be almost the same size.
Or inline something you need anyway that currently lives in an external file, such as a .js or .css file inlined into the header on the fly. That way you avoid one download without sending padding data.
Of course the best solution is to fix that query :D.
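The compressible-padding idea above can be sketched as follows. The helper `compressible_padding` is a hypothetical name, the 2048-byte size is just an example, and the hex-encoded random comment is there only for comparison. The sketch assumes PHP’s zlib extension is available for `gzencode`.

```php
<?php
// Hypothetical helper: an HTML comment of $bytes bytes (for $bytes >= 9)
// filled with a single repeated character, which gzip shrinks to almost nothing.
function compressible_padding(int $bytes): string
{
    $open = '<!-- ';
    $close = ' -->';
    $fill = max(0, $bytes - strlen($open) - strlen($close));
    return $open . str_repeat(' ', $fill) . $close;
}

$repeated = compressible_padding(2048);
// Padding of similar size built from random hex digits, for comparison only.
$randomish = '<!-- ' . bin2hex(random_bytes(1024)) . ' -->';

printf("repeated:  %d bytes raw, %d bytes gzipped\n",
    strlen($repeated), strlen(gzencode($repeated)));
printf("randomish: %d bytes raw, %d bytes gzipped\n",
    strlen($randomish), strlen(gzencode($randomish)));
```

Both comments add about 2 KB to the uncompressed output, which is what counts toward Apache’s buffer, but the repeated-character version adds far less to the bytes actually sent over the wire.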
Frank van Gemeren | 18-Feb-13 at 3:13 am
Lack of framework support for this is unfortunately the reason why I don’t use it. If it’s just very very simple templates, it can work.
If all the possible changes to headers are done before the template comes into play, then it should be possible. You lose some flexibility, though.