HTTP Archive: jQuery
A recent thread on GitHub for html5-boilerplate discusses whether there’s a benefit to loading jQuery from Google Hosted Libraries, as opposed to serving it from your own server. The thread references the great article Caching and the Google AJAX Libraries by Steve Webster.
SteveW’s article concludes that loading jQuery from Google Hosted Libraries is probably NOT a good idea because of the low percentage of sites that use a single version. Instead, developers should bundle jQuery with their own scripts and host it from their own web server. Steve got his data from the HTTP Archive – a project that I run. His article was written in November 2011, so I wanted to update the numbers in this post to help the folks on that GitHub thread. I also raise some issues that arise from creating combined scripts, especially ones that are larger than jQuery itself.
Preamble
SteveW shows the SQL he used. I’m going to do the same. As background, when SteveW did his analysis in November 2011 there were only ~30,000 URLs analyzed in each HTTP Archive crawl. We’re currently analyzing ~300,000 per crawl, so this is a bigger and different sample set. I’m going to be looking at the HTTP Archive crawl for Mar 1 2013, which contains 292,297 distinct URLs. The SQL shown in this blog post references these pages based on their unique pageids: pageid >= 6726015 and pageid <= 7043218, so you’ll see that in the queries below.
Sites Loading jQuery from Google Hosted Libraries
The first stat in SteveW’s article is the percentage of sites using the core jQuery module from Google Hosted Libraries. Here’s the updated query and results:
mysql> select count(distinct(pageid)) as count,
              (100*count(distinct(pageid))/292297) as percent
       from requests
       where pageid >= 6726015 and pageid <= 7043218
         and url like "%://ajax.googleapis.com/ajax/libs/jquery/%";
+-------+---------+
| count | percent |
+-------+---------+
| 53414 | 18.2739 |
+-------+---------+
18% of the world’s top 300K URLs load jQuery from Google Hosted Libraries, up from 13% in November 2011.
As I mentioned, the sample size is much different across these two dates: ~30K vs ~300K. To do more of an apples-to-apples comparison I restricted the Mar 1 2013 query to just the top 30K URLs (which is 28,980 unique URLs after errors, etc.):
mysql> select count(distinct(p.pageid)) as count,
              (100*count(distinct(p.pageid))/28980) as percent
       from requests as r, pages as p
       where p.pageid >= 6726015 and p.pageid <= 7043218
         and rank <= 30000
         and p.pageid = r.pageid
         and r.url LIKE "%://ajax.googleapis.com/ajax/libs/jquery/%";
+-------+---------+
| count | percent |
+-------+---------+
| 5517  | 19.0373 |
+-------+---------+
This shows an even higher percentage of sites loading jQuery core from Google Hosted Libraries: 19% vs 13% in November 2011.
Most Popular Version of jQuery from Google Hosted Libraries
The main question being asked is: Is there enough critical mass from jQuery on Google Hosted Libraries to get a performance boost? The performance boost would come from cross-site caching: The user goes to site A which deposits jQuery version X.Y.Z into the browser cache. When the user goes to another site that needs jQuery X.Y.Z it’s already in the cache and the site loads more quickly. The probability of cross-site caching is greater if sites use the same version of jQuery, and is lower if there’s a large amount of version fragmentation. Here’s a look at the top 10 versions of jQuery loaded from Google Hosted Libraries (GHL) in the Mar 1 2013 crawl.
mysql> select url, count(distinct(pageid)) as count,
              (100*count(distinct(pageid))/292297) as percent
       from requests
       where pageid >= 6726015 and pageid <= 7043218
         and url LIKE "%://ajax.googleapis.com/ajax/libs/jquery/%"
       group by url
       order by count desc;
Table 1. Top Versions of jQuery from GHL Mar 1 2013

| jQuery version | percentage of sites |
|---|---|
| 1.4.2 (http) | 1.7% |
| 1.7.2 (http) | 1.6% |
| 1.7.1 (http) | 1.6% |
| 1.3.2 (http) | 1.2% |
| 1.7.2 (https) | 1.1% |
| 1.8.3 (http) | 1.0% |
| 1.7.1 (https) | 0.8% |
| 1.8.2 (http) | 0.7% |
| 1.6.1 (http) | 0.6% |
| 1.5.2 (http), 1.6.2 (http) | 0.5% (tied) |
That looks highly fragmented. SteveW saw less fragmentation in Nov 2011:
Table 2. Top Versions of jQuery from GHL Nov 15 2011

| jQuery version | percentage of sites |
|---|---|
| 1.4.2 (http) | 2.7% |
| 1.3.2 (http) | 1.3% |
| 1.6.2 (http) | 0.8% |
| 1.4.4 (http) | 0.8% |
| 1.6.1 (http) | 0.7% |
| 1.5.2 (http) | 0.7% |
| 1.6.4 (http) | 0.5% |
| 1.5.1 (http) | 0.5% |
| 1.4 (http) | 0.4% |
| 1.4.2 (https) | 0.4% |
Takeaways #1
Here are my takeaways from looking at jQuery served from Google Hosted Libraries compared to November 2011:
- The most popular version of jQuery is 1.4.2 in both analyses. Even though its percentage dropped from 2.7% to 1.7%, it’s surprising that such an old version maintained the #1 spot. jQuery 1.4.2 was released February 19, 2010 – over three years ago! The latest version, jQuery 1.9.1, doesn’t make the top 10 most popular versions, but it was only released on February 4, 2013. The newest version in the top 10 is jQuery 1.8.3, which is used on 1% of sites (6th most popular) and was released November 13, 2012. The upgrade rate on jQuery is slow, with many sites using versions that are multiple years old.
- There is less critical mass on a single version of jQuery than in November 2011: the most popular version accounts for 1.7% of sites today vs. 2.7% then. If your site uses the most popular version of jQuery, the probability of users benefiting from cross-site caching is lower today than it was in November 2011.
- There is more critical mass across the top 10 versions of jQuery. The top 10 versions accounted for 8.8% of sites in November 2011; that figure has increased to 10.8% today.
- 8% of sites loading jQuery from Google Hosted Libraries add a query string to the URL. The most popular URL with a querystring is `/ajax/libs/jquery/1.4.2/jquery.min.js?ver=1.4.2`. (While that’s not surprising, the second most popular URL is `/ajax/libs/jquery/1.7.1/jquery.min.js?ver=3.5.1`.) As SteveW pointed out, this greatly reduces the probability of benefiting from cross-site caching because the browser uses the entire URL as the key when looking up files in the cache. Sites should drop the querystring when loading jQuery from Google Hosted Libraries (or any server for that matter), as illustrated below.
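For example, the only difference between these two references is the query string, but it changes the cache key entirely (the version shown is just an example):

```html
<!-- cacheable across sites: every site referencing this exact URL shares one cache entry -->
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>

<!-- effectively private to this site: the query string makes the URL (and thus the cache entry) unique -->
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js?ver=3.5.1"></script>
```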
The Main Question
While these stats are interesting, they don’t answer the original question asked in the Github thread: Which is better for performance: Loading jQuery from Google Hosted Libraries or from your own server?
There are really three alternatives to consider:
- Load core jQuery from Google Hosted Libraries.
- Load core jQuery from your own server.
- Load core jQuery bundled with your other scripts from your own server.
I don’t have statistics for #3 in the HTTP Archive because I’m searching for URLs that match some regex containing “jquery” and it’s unlikely that a website’s combined script would preserve that naming convention.
I can find statistics for #2. This tells us the number of sites that could potentially contribute to the critical mass for cross-site caching benefits if they switched from self-hosting to loading from Google Hosted Libraries. Finding sites that host their own version of jQuery is difficult. I want to restrict it to sites loading core jQuery (since that’s what they would switch to on Google Hosted Libraries). After some trial-and-error I came up with this long query. It basically looks for a URL containing “jquery.[min.].js”, “jquery-1.x[.y][.min].js”, or “jquery-latest[.min].js”.
select count(distinct(pageid)) as count,
       (100*count(distinct(pageid))/292297) as percent
from requests
where pageid >= 6726015 and pageid <= 7043218
  and ( url like "%/jquery.js%"
     or url like "%/jquery.min.js%"
     or url like "%/jquery-1._._.js%"
     or url like "%/jquery-1._._.min.js%"
     or url like "%/jquery-1._.js%"
     or url like "%/jquery-1._.min.js%"
     or url like "%/jquery-latest.js%"
     or url like "%/jquery-latest.min.js%" )
  and mimeType like "%script%";
+--------+---------+
| count  | percent |
+--------+---------+
| 164161 | 56.1624 |
+--------+---------+
Here are the most popular hostnames across all sites:
Table 3. Top Hostnames Serving jQuery Mar 1 2013

| hostname | percentage of sites |
|---|---|
| ajax.googleapis.com | 18.3% |
| code.jquery.com | 1.4% |
| yandex.st | 0.3% |
| ajax.aspnetcdn.com | 0.2% |
| mat1.gtimg.com | 0.2% |
| ak2.imgaft.com | 0.1% |
| img1.imgsmail.ru | 0.1% |
| www.yfum.com | 0.1% |
| img.sedoparking.com | 0.1% |
| www.biggerclicks.com | 0.1% |
Takeaways #2
- 56% of sites are using core jQuery. This is very impressive. This is similar to the findings from BuiltWith (compared to “Top 100,000” trends). The percentage of sites using some portion of jQuery is even higher if you take into consideration jQuery modules other than core, and websites that bundle jQuery with their own scripts and rename the resulting URL.
- 38% of sites are loading core jQuery from something other than Google Hosted Libraries (56% – 18%). Thus, there would be a much greater potential to benefit from cross-site caching if these websites moved to Google Hosted Libraries. Keep in mind – this query is just for core jQuery – so these websites are already loading that module as a separate resource meaning it would be easy to switch that request to another server.
- Although the tail is long, Google Hosted Libraries is by far the most used source for core jQuery. If we want to increase the critical mass around requesting jQuery, Google Hosted Libraries is the clear choice.
Conclusion
This blog post contains many statistics that are useful in deciding whether to load jQuery from Google Hosted Libraries. The pros of requesting jQuery core from Google Hosted Libraries are:
- potential benefit of cross-site caching
- ease of switching if you’re already loading jQuery core as a standalone request
- no hosting, storage, bandwidth, nor maintenance costs
- benefit of Google’s CDN performance
- 1-year cache time
The cons to loading jQuery from Google Hosted Libraries include:
- an extra DNS lookup
- you might use a different CDN that’s faster
- can’t combine jQuery with your other scripts
There are two other more complex but potentially significant issues to think about if you’re considering bundling jQuery with your other scripts. (Thanks to Ilya Grigorik for mentioning these.)
First, combining multiple scripts together increases the likelihood of the resource needing to be updated. This is especially true with regard to bundling with jQuery since jQuery is likely to change less than your site-specific JavaScript.
Second, unlike an HTML document, a script is not parsed incrementally. That’s why some folks, like Gmail, load their JavaScript in an iframe segmented into multiple inline script blocks, thus allowing the JavaScript engine to parse and execute the initial blocks while the rest of the file is still being downloaded. Combining scripts into a single, large script might reach the point where the delay from parsing one big file outweighs the cost of downloading two or more smaller scripts. As far as I know this has not been investigated enough to determine how “large” the script must be to reach the point of negative returns.
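As a rough illustration of that segmented approach (a simplified sketch, not Gmail’s actual code): the page served into the iframe splits its JavaScript into several inline blocks, and because HTML is parsed incrementally, each block can execute as soon as its bytes arrive rather than waiting for the entire payload.

```html
<!-- document loaded into a hidden iframe -->
<script>
  /* chunk 1: code needed first; runs while the rest of the document is still downloading */
</script>
<script>
  /* chunk 2 */
</script>
<script>
  /* chunk 3: least urgent code goes last */
</script>
```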
If you’re loading core jQuery as a standalone request from your own server (which 38% of sites are doing), you’ll probably get an easy performance boost by switching to Google Hosted Libraries. If you’re considering creating a combined script that includes jQuery, the issues raised here may mean that’s not the optimal solution.
SteveW and I both agree: To make the best decision, website owners need to test the alternatives. Using a RUM solution like Google Analytics Site Speed, Soasta mPulse, Torbit Insight, or New Relic RUM will tell you the impact on your real users.
Johan Sundström | 18-Mar-13 at 5:04 pm | Permalink |
It bears stressing that if you update your own js fairly often, not bundling static libs benefits frequent users a lot, especially huge ones like jQuery, even before weighing in cross-site cache reuse benefits. People likely to read this post are probably also the kind whose js code updates more often than most sites’.
David Higgins | 18-Mar-13 at 5:24 pm | Permalink |
It is worth investing in your own CDN rig/setup.
If you’re grabbing the latest version of jQuery from Google, many scripts could break without you even knowing it.
That scenario is a sort of dependency hell we need to avoid.
I am aware webmasters like to stick with one solid version of jQuery for peace of mind, like 1.8, or the even safer 1.4 variety to avoid the latest version of jQuery breaking our plugins, but…
A single point of failure like Google’s AJAX CDN could cause problems too.
Call me extreme in my view, but aren’t self-hosted solutions a more ethical approach to serving scripts?
The independent approach has no caveats because you’re in control of the server. If it goes down – setup a new instance.
Also – you’re not tracked by Google. How many times is your IP logged each time jQuery is served from Google’s CDN? It’s a question worth asking too.
David Higgins | 18-Mar-13 at 5:29 pm | Permalink |
Also I forgot to add:
jQuery weighs in at little more than the size of a standard-sized PNG file. So it’s equivalent to loading a small graphic.
It’s not the bloated monster many blog authors like to claim it is ;) Food for thought.
Paul Irish | 18-Mar-13 at 6:30 pm | Permalink |
I think the other takeaway is: if you do use the Google-hosted jQuery, the more you keep it up to date with the latest version, 1) the better the chance it’ll be a cache hit, and 2) the more likely you are to see improved speed due to the updated library.
Nils Diewald | 18-Mar-13 at 6:31 pm | Permalink |
A while ago I proposed a cache attribute (or something similar) on the script element (and others) to a Mozilla developer at a conference, allowing a developer to give a cache hint to the browser when using common resources.
In case the browser has this exact resource already in its cache, it won’t load it from the server.
Otherwise it would load the resource, generate the md5 of the file, and store it in the cache.
That way I wouldn’t need to guide my users to Google, but could benefit from the cache as well.
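(For illustration only: the attribute name and fingerprint format below are hypothetical, just to make the proposal concrete.)

```html
<!-- hypothetical markup: the fingerprint, not the URL, identifies the resource,
     so any cached copy of the same bytes could be reused regardless of origin -->
<script src="/js/jquery-1.9.1.min.js"
        cache="md5:FINGERPRINT-OF-FILE"></script>
```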
What do you think about this proposal? Maybe in the meantime there is something like that or there’s a security issue I don’t know about.
I never use Google’s CDN for jquery because of the SPOF issue and privacy reasons.
Nathan Toone | 18-Mar-13 at 6:34 pm | Permalink |
It would be cool to see numbers for other JavaScript libraries that are hosted on google hosted libraries (dojo, prototype, ext core, etc). I know none of them (likely) have the potential that jquery has for cross-site caching, but some numbers for them might be interesting too.
Steve | 18-Mar-13 at 6:35 pm | Permalink |
I notice you’re searching against “://ajax.googleapis.com”, whereas it might be more common (and recommended) for developers to use the protocol-less “//ajax.googleapis.com”. Or have these URLs already been normalised to their absolute values in this table?
Nicolas Gallagher | 18-Mar-13 at 6:46 pm | Permalink |
Thanks for the thorough research and write up.
Another factor to consider is the popularity of a site that uses the Google CDN. It seems to me that if Facebook et al were to use the Google CDN, then the chances of your visitors having a version cached would be greater than if the sites using Google’s CDN were just low traffic sites. But most large sites prefer to use their “own” CDN, rather than a shared URL.
In many cases, you’d want to concatenate your core libraries into their own bundle that’s separate from your app’s JS code. That way, you can ensure that you only send new versions of the library code when it actually changes (which should be at a much slower rate than the rest of your code). So perhaps the value of using Google’s CDN is further reduced if you rely on several core libs.
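For example, the split might look something like this (file names are just placeholders): the library bundle keeps a long, rarely-invalidated cache lifetime while the app bundle churns with each deploy.

```html
<!-- core libraries: change rarely, so returning visitors usually have this cached -->
<script src="/js/vendor.min.js"></script>
<!-- site-specific code: changes often, invalidated on every release -->
<script src="/js/app.min.js"></script>
```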
Rob Larsen | 18-Mar-13 at 7:10 pm | Permalink |
“Another factor to consider is the popularity of a site that uses the Google CDN.”
As I mentioned in the GitHub thread, the top site that I’ve seen using the CDN is Pinterest (1.7.1). I’d like to see Alexa rank (or whatever) vs CDN usage. That’s always been a missing piece of this analysis for me.
That said, and I’ll write this up in the issue, I agree that we should keep the code the way it is. As I’ve looked at this over the past week I’ve settled on the CDN being the best default. Documenting the advanced options and pushing people towards looking beyond the default is definitely a part of that, but the best default is using the CDN.
Steve Souders | 18-Mar-13 at 7:13 pm | Permalink |
Johan: Exactly – that’s what I was trying to say with “combining multiple scripts together increases the likelihood of the resource needing to be updated”. It might be better to load jQuery as a separate request than combining it with other scripts that change more frequently.
David: You bring up a good point about SPOF. That is an issue. It would be great to have some Google Hosted Libraries uptime stats. On your other comment, please note that the average size of a PNG is 13K whereas core jQuery 1.9.1 is 33K and 1.4.2 is 25K (minified & gzipped).
Nils: I think it’s possible to figure out a way to share a “standard” script across sites even if the URL differs, but there are likely security concerns. I’ll ask the security folks out there to comment.
Steve Souders | 18-Mar-13 at 7:19 pm | Permalink |
Nathan: This post took a long time. All the data is open source and available for download. Using my queries as a guide, you could do the study for other libraries. If so, please publish the results.
Steve: The URLs in the database are the actual URL fetched by the browser. (That’s how I’m able to show http vs https in the table.)
Steve Souders | 18-Mar-13 at 7:27 pm | Permalink |
Nicolas & Rob: Certainly if large sites used a common resource that would increase the probability of cross-site caching. There are several sites in the Top 100 that use Google Hosted Libraries including Pinterest, Babylon, Go, and Stack Overflow.
Daniel Lo Nigro | 18-Mar-13 at 7:43 pm | Permalink |
FYI, the “?ver=3.5.1” is coming from WordPress, which appends the WordPress version to the end of all the JavaScript URLs.
Dave Ward | 18-Mar-13 at 8:04 pm | Permalink |
One important thing to consider along with the potential drawback of a DNS lookup is that jQuery isn’t the only library hosted at ajax.googleapis.com. So, the potential for users to already have the DNS lookup cached locally is better than the combined share of all jQuery versions on the Google CDN, *and* it isn’t impacted by fragmentation. I don’t have hard data, but I think this is much less of a drawback than some people have claimed.
I’ve actually just recently started A/B testing visits to my own site. ~50% of my visitors get the Google CDN and the others get a local copy served from my Linode instance. I record a Date().getTime() before and after and then push the interval between them into Google Analytics’ Site Speed User Timings.
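(A rough sketch of that kind of measurement, assuming the classic ga.js snippet is already on the page; the bucketing and names here are illustrative, not the actual test code.)

```js
var useCdn = Math.random() < 0.5;           // ~50/50 split between CDN and local copy
var start = new Date().getTime();

var s = document.createElement('script');
s.src = useCdn
  ? '//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js'
  : '/js/jquery-1.9.1.min.js';
s.onload = function () {
  var elapsed = new Date().getTime() - start;
  // report into Google Analytics Site Speed User Timings (classic ga.js API)
  _gaq.push(['_trackTiming', 'jQuery load', useCdn ? 'cdn' : 'local', elapsed]);
};
document.getElementsByTagName('head')[0].appendChild(s);
```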
So far, the numbers have been very clearly in favor of the CDN. For example, my Russian visitors are quite a bit better off when they visit a page with a CDN reference: http://encosia.com/i/google-cdn-russia/
Even in Georgia (US), where my Linode instance lives, the CDN is over twice as fast as serving jQuery locally.
I’ll have more data to share in a few weeks, after I’ve gathered 100k or so samples, but the CDN is the clear winner so far.
Steve Souders | 18-Mar-13 at 8:46 pm | Permalink |
Daniel: Ahhh – that explains it. Thanks.
Dave: Wow! It’s great that you shared real user data for comparing Google Hosted Libraries to your own server. Please do comment again when you have more data.
Ilya Grigorik | 18-Mar-13 at 11:58 pm | Permalink |
Nils, David: there is no need for an additional fingerprint. That’s exactly what the URL is: a unique identifier which can be shared between any sites. That’s the whole premise of a CDN. Re, SPOF: If you think your server stands a better chance against SPOF than a globally distributed cache running in hundreds of locations around the world.. good for you! Mine isn’t.
Charlie Clark | 19-Mar-13 at 12:56 am | Permalink |
Great article. I think one point is that jQuery, like Modernizr, should almost be thought of as infrastructure, with developers almost taking it for granted that they can use it, as they might a particular image format. For users, the biggest performance improvements will come when the libraries are no longer needed because the functions are directly available. In the meantime, browser makers wanting to seem faster might think about caching the most popular versions of the library directly in the browser, thus avoiding both the download and the blocking parsing of it. The popularity of jQuery is allowing a carrot-and-stick development approach, with both 1.9 and 2.0 dropping features to improve load time and performance.
Re. old versions: httparchive itself is still on 1.5.1. The latest version you can use is 1.8.3 because of HarViewer, which itself comes with an older version embedded. Problems like this are exactly why people are reluctant to update to newer versions – it involves work – and why we might need to start thinking differently about how we use them.
Andy Davies | 19-Mar-13 at 2:16 am | Permalink |
I’m part way through doing a similar thing but looking at wider jQuery use.
I ran a map/reduce job over the data where the mapper stripped off the protocol and query string.
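(The normalization step is roughly this, sketched in JavaScript:)

```js
// Strip the protocol and query string so http/https and "?ver=..." variants
// of the same file collapse onto a single key.
function normalizeUrl(url) {
  return url.replace(/^https?:\/\//, '')   // drop the protocol
            .replace(/\?.*$/, '');         // drop the query string
}

normalizeUrl('https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js?ver=3.5.1');
// -> "ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"
```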
Before removing the protocol and query string my top ten looks exactly the same as yours, but after removing them it’s a little bit different (March 2013 data):
ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js 8330
ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js 7784
ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js 6221
ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js 3763
ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js 3722
ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js 2771
ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js 2746
ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js 2525
ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js 2066
ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js 2025
Digging amongst the data there are some quite interesting things like the number of people who are brave enough to always use the latest version –
http://code.jquery.com/jquery-latest.js 888
http://code.jquery.com/jquery-latest.min.js 611
(A client job is taking priority ATM but hopefully I’ll get to finish the analysis next week.)
David Goss | 19-Mar-13 at 3:47 am | Permalink |
Say a user already has jQuery 1.4.2 in their cache from a page they visited earlier that used the Google CDN, then they come to yours which wants jQuery 1.9.1 from the same CDN. They still have to download the file, but the previous request at least saves them the DNS lookup, doesn’t it?
blufive | 19-Mar-13 at 4:06 am | Permalink |
Another variable: if your site is pulling in multiple files which could come from the google CDN (e.g. jquery + jqueryui + a couple more) and a bunch of other stuff as well, doesn’t this turn into a sort of poor-man’s domain sharding, with the common libs coming from google and other stuff from your own site?
Nils | 19-Mar-13 at 4:22 am | Permalink |
Ilya: Thanks for your comment. Regarding SPOF: My concern is not that a CDN may go down due to technical problems or something … I agree that a single server is obviously more likely to not be reachable than a CDN – based on technical issues.
The concern regarding SPOF in relation to CDN mainly focuses on national authorities in certain countries blocking the domain of a CDN, see e.g. https://stevesouders.com/blog/2012/03/28/frontend-spof-in-beijing/ .
A cache fingerprint would allow the use of scripts loaded from a CDN while also providing a fallback.
It’s also not certain in all jurisdictions that a web site owner is allowed to force a visitor to give her data (at least her IP address) to a 3rd party without notice (that’s the reason using Google Analytics was legally problematic in Germany for quite some time).
Rob Larsen | 19-Mar-13 at 7:45 am | Permalink |
I don’t actually know enough about the DNS lookup/caching piece to have an informed opinion of the likelihood of it being cached, but my uninformed opinion is to agree with Dave Ward that it’s going to be less of an issue than some people are making it out to be.
Additionally, in the examinations I’ve done of this specifically for the h5bp issue, one of the big variables I’ve seen is the quality of the server itself. If the server is fast or well configured (for example, running mod_pagespeed), then going with a local copy can be faster. With a standard hosting set-up, just the optimized delivery of the Google CDN makes a huge difference (not geographic optimization, just the fact that the server is fast). I think this is an important piece that we’ve got to take into account – at least on the h5bp project, where we’re looking at such a broad audience (many of whom have no access to server configuration).
Steve Souders | 19-Mar-13 at 8:45 am | Permalink |
David Goss: DNS resolution cache times are typically short – minutes – but it depends on the TTL, OS, browser, and user activity. See Reduce DNS Lookups for more info.
blufive: Yes, using Google Hosted Libraries would provide a degree of domain sharding.
Nils: Your comment reminds me that web devs need better patterns to deal with script loading errors. Most sites don’t even check onerror. But especially if a site is loading scripts dynamically (and asynchronously), it would be good to have a pattern that checks onerror and loads the resource from a fallback source. Similarly, this could be done after N seconds as a fallback. (Sometimes firewalls delay responses rather than failing them immediately.)
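A minimal sketch of such a pattern (illustrative only; the function name is made up, and older browsers need onreadystatechange instead of onload/onerror):

```js
function loadScriptWithFallback(primaryUrl, fallbackUrl, timeoutMs) {
  var settled = false;

  function useFallback() {
    if (settled) return;
    settled = true;
    var f = document.createElement('script');
    f.src = fallbackUrl;
    document.getElementsByTagName('head')[0].appendChild(f);
  }

  var s = document.createElement('script');
  s.src = primaryUrl;
  s.onload = function () { settled = true; };   // primary loaded; cancel the fallback
  s.onerror = useFallback;                      // explicit failure
  setTimeout(useFallback, timeoutMs || 5000);   // firewalls may hang instead of failing fast
  document.getElementsByTagName('head')[0].appendChild(s);
}

loadScriptWithFallback(
  '//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js',
  '/js/vendor/jquery-1.9.1.min.js',
  4000
);
```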
Nicholas Shanks | 19-Mar-13 at 9:51 am | Permalink |
I’ve left my extensive comments and numbers on the h5bp discussion. Summary: It makes no sense NOT to use the Google CDN.
Nicholas Shanks | 19-Mar-13 at 10:11 am | Permalink |
By the way, I did my own analysis of this 6 months ago and got 1.4.2 as top (10% of CDN requests) with 1.7.1 second (8%): http://nickshanks.com/httparchive/jquery_cdn_usage_2012-09-01.csv
Steve, you should mention that the /x.x/ “latest” version URLs only have a one hour lifespan.
Jason | 19-Mar-13 at 11:11 am | Permalink |
I have done a bit of work with content distribution and verification.
As a general rule, any system that tries to maintain integrity can never let one ‘person’ in the system volunteer information on behalf of anybody else. I don’t have authority to do that, and you should not grant it to me, even by accident. Depending on circumstances it may or may not be reasonable for me to consent to accept information from someone else. But since this is 3rd party code running on behalf of the user, which of us actually needs to consent? The author or the visitor?
If I say my file is the same as one from Google, but it turns out it isn’t, then only my site malfunctions. If I can get my script to load on your website, then we have a big problem. So if I had a script tag that said, “Fetch this from Google unless the firewall stops you, in which case load it from foo,” then if the Google lookup fails I could really only expect that the browser puts the cache entry under ‘foo’ and not ‘www.google.com’. That is, the aliases can only go in one direction, not both.
David Murdoch | 19-Mar-13 at 12:47 pm | Permalink |
First, we’ve known for a while that the chance of winning the cache lottery is slim to none. Hitting the cache is just a bonus.
Second, the cost of DNS resolution is likely a moot point for a couple of reasons: it is super fast, browsers are starting to do dns-prefetching, and the chances of the Google CDN domain name already being resolved is extremely high (so it is probably free anyway – an analysis of *//ajax.googleapis.com usage would be nice).
I think one major point that keep getting overlooked is that we need to set the best default for the H5BP community as a whole.
I don’t think that analyzing the top 30k sites is indicative of the developers within the H5BP community.
What we need to be looking at is the use of H5BP by your average developer and their clients. I’d wager that most *domains* using H5BP do not put static resources on a CDN, and most sites live on a slow, shared host with limited bandwidth.
Lastly, the SPOF issue has already been solved by the H5BP team; if the Google CDN version does not load a valid JS file that initializes `window.jQuery` then a local version will be loaded instead.
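For reference, the H5BP fallback is essentially this (version numbers and the local path vary by release):

```html
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="js/vendor/jquery-1.9.1.min.js"><\/script>')</script>
```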
Jay | 20-Mar-13 at 2:36 am | Permalink |
Nils: I had a similar idea but thought of adding canonical= to accompany src=, so you can provide the url where it should be loaded from if uncached, and a different url it would be acceptable to receive a cached version from.
Canonical would be a one way relationship, so we don’t have to deal with weird trust/security issues. It’s true that would do nothing to help ensure that people end up with a shared version in their cache, but browser makers could do some smart precaching, i.e. deciding to prefetch and cache a certain version of jquery from Google because it has seen x canonical references in the last y visited sites by the user, etc.
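(Hypothetical markup, just to make the idea concrete: src is where to fetch from if nothing usable is cached, canonical is the shared URL whose cached copy would also be acceptable.)

```html
<script src="/js/jquery-1.9.1.min.js"
        canonical="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
```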
Nils | 20-Mar-13 at 6:06 am | Permalink |
Jay: I agree, for developers a canonical=” like attribute would be a lot easier to implement – that’s a good idea!
But regarding trust/security: You still have to trust the CDN not to inject any unwanted code (which may not be of criminal intent – it could just be, e.g., delivering the wrong version of a library). Providing a fingerprint would prevent that. And you can only cache from one CDN provider. I guess for web developers aiming for speed as well as security and privacy, looking up an md5 isn’t that hard (and on the other side they would not have to look up the URL(s) of CDN-provided versions of their software).
Asset pipelines or taghelpers could also generate the fingerprints easily, so it wouldn’t bother the developer either.
But I am still curious if there are any greater security issues with caching hints than with CDN URLs.
Mike Behnke | 21-Mar-13 at 6:39 am | Permalink |
Thanks David, for pointing out that SPOF has been solved. I have no concerns about the Google CDN being down, because if it is I just load jQuery from my host and move on.
I’m going to add this to the H5BP comments, but I have been pretty convinced over the last week or so that the CDN is actually the best default state, but that there should continue to be more discussion about other ways that could offer better performance. It’s been a really interesting discussion and I appreciate the amount of thought that went into this request, it is impressive!
Billy Hoffman | 29-Mar-13 at 10:04 am | Permalink |
“If you’re loading core jQuery as a standalone request from your own server (which 38% of sites are doing), you’ll probably get an easy performance boost by switching to Google Hosted Libraries”
An interesting wild card here is SPDY. With SPDY, TCP connections are used more efficiently. If you are loading jQuery as a standalone request, it’s possible that loading it from your server via SPDY is faster than the overhead of a DNS lookup and making a TCP connection to download it from Google’s systems. Test, test, test to be sure on your own setup.
Bry | 04-Apr-13 at 9:24 pm | Permalink |
Steve,
It’s worth noting that the libraries on cdnjs from CloudFlare tend to be faster.
Although, just as I started using it for development, their network was down for an hour last month, blocking sites from rendering completely even when using Google’s library as a fallback with: window.jQuery || document.write()
One important question is: if and when the Google CDN goes down completely, as it did a couple of times, does the above fallback method even work?
I ask this because when cdnjs was down, the fallback method did not kick in at all. The CloudFlare request did not give up, which made the sites completely unavailable.
A clarification on the reliability of this fallback method with google’s CDN as first choice (should that CDN be down) would be great.
Steve Souders | 04-Apr-13 at 10:02 pm | Permalink |
Bry: I’m really interested in this fallback method you mentioned. I’m not familiar with it and so am not the best person to debug it. It would be great if you could contact the people who created the snippet to get the answer, and then comment back here.
Bry | 12-Apr-13 at 1:09 am | Permalink |
Steve: The method was covered on StackOverflow with the main answer on this thread: http://stackoverflow.com/questions/1014203/best-way-to-use-googles-hosted-jquery-but-fall-back-to-my-hosted-library-on-go#comment5756925_4825258
However as per RedWolves’ comment there, it assumes that “if the Google ajax library is not available it’ll have to time out first before failing”.
In the case of cdnjs’s last downtime, the request never timed out and kept spinning. So the Google fallback, as I had it set up during development (cdnjs 1st and Google 2nd), was useless.
The timeout guarantee on Google’s network, in case of downtime, is really what determines whether such a simple fallback would work or not. This speaks more to your side of this equation: whether the Google API network, DNS, or what have you, is designed in such a way that it would time out (or whether it did in the past when it was temporarily down) in a downtime circumstance.
Because if none of these CDNs have any guarantee of a quick timeout, it unfortunately puts sites at the mercy of possible unpredictable downtime. And the only safe way not to be subject to third-party downtime in this case would be to use yet another script loader like LABjs. But if we have to load yet another library before jQuery, it somewhat defeats the whole benefit of using a CDN for jQuery in the first place.
In other words, what I am looking for here is an assessment of the likeliness of Google’s API requests timing out in case of network issues.
Dolmen | 25-Apr-13 at 11:15 am | Permalink |
The question you should ask yourself is not about performance. It is about privacy.
Do you want to reveal the audience of your site to Google?
Do you want your users to be tracked by Google when they visit your site?
And so this is not just a technical question. The people in charge of the privacy questions must be involved in that decision.