HTTP Archive: nine months
Although the HTTP Archive was announced in March, I actually started gathering data back in November of 2010. This week’s run marks nine months from that initial crawl. The trends show that performance indicators are mixed, with some critical metrics like size and redirects on the rise.
[As a reminder, the HTTP Archive currently crawls approximately 17,000 of the world’s top websites. All of the comparisons shown here are based on choosing the “intersection” of sites across all of those runs. There are ~13K sites in the intersection.]
The transfer size of pages has increased 15% (95 kB) over nine months. The average size is now up to 735 kB. Note that this is the transfer size. Many text resources (including HTML documents, scripts, and stylesheets) are compressed, so the uncompressed size is larger. The bulk of this growth has been in images – up 18% (66 kB). Scripts have had the greatest percentage increase, growing 23% (25 kB).
Note that these sizes are the total size of all images in the page and all scripts in the page, respectively. The average size of individual resources has stayed about the same over this nine-month period. If individual resource size is the same, how is it that the total page size has increased? The increase in total transfer size is the result of a 10% increase in HTTP requests per page – that’s seven more resources per page.
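The percentages above can be checked with a few lines of arithmetic. This is only a sketch using the rounded averages quoted in this post (640 kB to 735 kB, and 69 to 76 requests per page); the helper name `percent_change` is made up for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0

# Rounded averages quoted in this post (Nov 15 2010 vs. Aug 15 2011).
size_old_kb, size_new_kb = 640, 735
requests_old, requests_new = 69, 76

size_growth = percent_change(size_old_kb, size_new_kb)
request_growth = percent_change(requests_old, requests_new)

print(f"transfer size: +{size_new_kb - size_old_kb} kB ({size_growth:.1f}%)")
print(f"requests per page: +{requests_new - requests_old} ({request_growth:.1f}%)")
```

Rounded to whole numbers, these come out to the 15% and 10% figures cited above.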
Redirects are known to cause page delays, and yet the percentage of sites containing at least one redirect increased from 58% to 64%. Requests that fail are wasteful, using connections that could have been used more productively, yet the percentage of sites with errors grew from 14% to 25%.
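To make "sites containing at least one redirect" concrete, here is a minimal sketch of how such a percentage could be computed from per-page response status codes. The definitions are assumptions for illustration, not necessarily the ones the HTTP Archive uses: a redirect is taken to be any 3xx response other than 304, and an error is any 4xx or 5xx response. The sample data and function names are hypothetical.

```python
from typing import Callable, Dict, List

def is_redirect(status: int) -> bool:
    # Assumption: 3xx counts as a redirect, except 304 Not Modified.
    return 300 <= status < 400 and status != 304

def is_error(status: int) -> bool:
    # Assumption: any 4xx or 5xx response counts as an error.
    return status >= 400

def share_of_sites(sites: Dict[str, List[int]],
                   predicate: Callable[[int], bool]) -> float:
    """Percentage of sites with at least one response matching predicate."""
    if not sites:
        return 0.0
    hits = sum(1 for statuses in sites.values()
               if any(predicate(s) for s in statuses))
    return hits / len(sites) * 100.0

# Hypothetical sample: status codes of every response fetched for each page.
sample = {
    "a.example": [200, 200, 301, 200],
    "b.example": [200, 404],
    "c.example": [200, 304, 200],
    "d.example": [302, 200, 500],
}

redirect_share = share_of_sites(sample, is_redirect)
error_share = share_of_sites(sample, is_error)
print(f"sites with redirects: {redirect_share:.0f}%")
print(f"sites with errors: {error_share:.0f}%")
```

Note that a site counts once no matter how many redirects or failed requests it has, so these percentages say nothing about how many such responses a typical page contains.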
Not all the news is gloomy. The use of the Google Libraries API has increased from 10% to 14%. This is good for performance because it increases the likelihood that, as a user navigates across sites, the most common resources will be in their cache. In addition, for smaller sites, serving those resources from the Google Libraries servers might be faster and more geographically distributed than serving them from their own servers.
The use of Flash has dropped two percentage points, from 47% to 45% of websites. Flash resources average 58 kB, which is much larger than other resources, and there are fewer tools and best practices for optimizing Flash performance.
There are still many resources that do not have the necessary HTTP response headers to make them cacheable. Luckily the trend is moving toward more caching: the share of resources without headers to make them cacheable has dropped from 61% to 58%. Stating the inverse, the share of resources with caching headers grew from 39% to 42% (+3 percentage points).
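As a rough illustration of what "headers to make them cacheable" can mean, the sketch below treats a response as cacheable if it carries a `Cache-Control` header with a positive `max-age`, or failing that an `Expires` header. This is a simplified assumption and not necessarily the exact criterion behind the numbers above; the function name `has_caching_headers` is hypothetical.

```python
import re
from typing import Mapping

def has_caching_headers(headers: Mapping[str, str]) -> bool:
    """Simplified check: does this response opt in to caching?"""
    # Header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    cache_control = h.get("cache-control", "").lower()

    # Treat explicit opt-outs as not cacheable (a simplification).
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False

    # A max-age directive takes precedence over Expires.
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control)
    if match:
        return int(match.group(1)) > 0

    # Simplification: any Expires header counts, without parsing its date.
    return "expires" in h

print(has_caching_headers({"Cache-Control": "public, max-age=86400"}))
print(has_caching_headers({"Content-Type": "image/png"}))
```

A stricter check would also parse the `Expires` date and compare it with the response's `Date` header, since an `Expires` value in the past does not allow caching.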
Here’s a recap of the performance indicators from Nov 15 2010 to Aug 15 2011 for the top ~13K websites:
- total transfer size grew from 640 kB to 735 kB
- requests per page increased from 69 to 76
- sites with redirects went up from 58% to 64%
- sites with errors went up from 14% to 25%
- the use of Google Libraries API increased from 10% to 14%
- Flash usage dropped from 47% to 45%
- resources with caching headers grew from 39% to 42%
My kids started school this week. I’m hoping their first report card looks a lot better than this one.