Why you should let Google host jQuery for you (encosia.com)
76 points by jwilliams on Dec 11, 2008 | hide | past | favorite | 61 comments


For folks interested in his #2 benefit (using multiple domains causes static assets to be loaded in parallel, speeding up observed page load times) -- you can capture it without involving Google in the slightest. It requires perhaps a two-line tweak to your web server configuration files to make it answer requests on domains other than www.example.com.

For example, my main domain is www.bingocardcreator.com . There are some honking images which load on the front page that used to block the (more valuable to my conversion pathway) images on the sidebar from popping up. I reconfigured my webserver to accept images1 through images4 as acceptable host names, and (lazily) manually split the static assets over the 5 hosts.
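For the curious, in Apache that "two-line tweak" might look something like this (a sketch only; the images1..images4 hostnames and example.com domain are stand-ins, and DNS records for the extra hostnames must also resolve to the same server):

```apache
# Sketch: accept requests on extra hostnames so browsers open more
# parallel connections for static assets. DNS for images1..images4
# must point at this same server.
<VirtualHost *:80>
    ServerName www.example.com
    ServerAlias images1.example.com images2.example.com images3.example.com images4.example.com
    DocumentRoot /var/www/example
</VirtualHost>
```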

This actually resulted in improving the loading of my sidebar by upwards of a second. That is worth, oh, probably $100 a month or so. (If you do not know how I can possibly say this, you owe it to yourself to try CrazyEgg right now.)

Not bad for an hour's work that never has to get repeated.

Credit given where credit is due: I found out about this tip by using Yahoo's YSlow plugin for Firefox and following their "Best Practices for Speeding Up Your Website". This document is one of the best practical "Here, do this, it will make you money" sources I have ever found on the Internet.

http://developer.yahoo.com/performance/rules.html


This is a tip that I've also used to great effect, but I thought that I should point out (if it isn't obvious) that you must make sure the same image always points to the same subdomain (image 1 is always linked from images1.mysite.com); otherwise you'll shoot yourself in the foot by eliminating browser caching.

I usually write a quick function in $language to turn the first few letters of the filename into ASCII codes, then take that modulo the number of fake subdomains I've set up to decide which server to link to. The overall distribution may be imperfect, but file x will always be linked to server x across pages, so it's a nice quick solution, especially if you have multiple maintainers.
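A minimal sketch of that trick in JavaScript (the images1..images4.example.com hostnames and the four-host count are just examples):

```javascript
// Sketch: map a filename deterministically to one of N asset hosts, so the
// same file is always served from the same subdomain and browser caching
// still works across pages and maintainers.
var NUM_HOSTS = 4;

function assetHost(filename) {
  // Sum the character codes of the first few letters, then reduce
  // modulo the number of hosts. Imperfect distribution, but stable.
  var sum = 0;
  for (var i = 0; i < Math.min(filename.length, 4); i++) {
    sum += filename.charCodeAt(i);
  }
  return 'images' + ((sum % NUM_HOSTS) + 1) + '.example.com';
}
```

So `assetHost('logo.png')` returns the same host on every page, no matter which template references the image.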


I've thought about this, but I decided against it for one simple reason: if google is down my site is in trouble. By serving it locally it's all or nothing, either your site works (because it's up), or it doesn't.

But if the site is up and Google is down, it will do really bad stuff.

Plus I don't really understand the multiple connections issue - what about HTTP/1.1 and keep-alive? Doesn't it download everything in one go, but only if it's all on one server?

And finally about the caching issue - sure the first time might not be cached, but most sites rely on people visiting more than once, or at least clicking on more than one page, so it'll be cached the rest of the time after that.

So none of the arguments are really holding with me.


HTTP 1.1 helps streamline multiple downloads in sequence, but doesn't help with the issue of concurrent download limits imposed by the browser. Loading a few complex pages while watching something like Firebug's Net panel helps make that clear.

You'd be surprised about how many users hit your site with less cached than you'd hope: http://yuiblog.com/blog/2007/01/04/performance-research-part...


How often is Google down? Is it worth worrying about?

(questions are 80% rhetorical, 20% genuine curiosity)


Google doesn't usually go down, but routing issues or DNS problems do occasionally make it unreachable.


How often are there routing issues to Google, but none to your site? It's much more likely that your site will be unreachable. The probability that Google will be down is probably insignificant for most cases.


I have several times been unable to reach Google but have been able to reach my own site.

It did not happen for very long (a few minutes each time), and not often, but it has happened.


Last week I found an issue with routing internal to Yahoo's core network that was causing an esoteric DNS failure for our customers.

Routers are configured by people, and people make mistakes.


Yes, but what does that have to do with Google's reliability? What does it have to do with routing issues between one particular site and Google?

He's not claiming that Google is infallible. I'm more confident in their ability to maintain availability than in yours. All of this paranoia about Google going down is mostly FUD.


Well, depending on multiple sites for critical data of course multiplies the risk of failure. But the risk of depending on one other highly reliable site for something that should almost always be cached anyway is indeed very very small.

I wouldn't say this whole issue is FUD though. For instance, I would not want to depend on four different, even high-quality, third-party sites for each request to my website, because the likelihood of failure would be (almost) five times as high.
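The arithmetic behind that "almost five times" claim can be sketched like this (the 99.9% per-host uptime figure is an assumption, and independence between hosts is assumed too):

```javascript
// Sketch: with five independent hosts each up with probability p, a page
// that needs all of them is up with probability p^5, so for small failure
// rates the overall failure rate is just under five times one host's.
function allUp(p, hosts) {
  return Math.pow(p, hosts);
}

var p = 0.999;                    // assumed 99.9% uptime per host
var failOne = 1 - p;              // failure rate for one host
var failAll = 1 - allUp(p, 5);    // failure rate for all five combined

console.log('one host:', failOne, 'all five:', failAll);
```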


You missed the point, which was that your site could be up, and google would be down, which would cause big problems with javascript errors.

OTOH if your site is down, it's fully down - no partial content, and javascript errors.


I understand your point and I'm dismissing it as FUD because you're playing the "what if" game. Come back with some uptime graphs. I don't believe it's worthwhile to worry about Google's uptime, and it's perfectly legitimate to let them host jQuery for you until proven otherwise.


Does someone know if there is a way to provide an alternate source for a script in case of failure of the main source ?


You could probably include the script multiple times from different places, so long as you're not doing something within it as it loads (i.e. it's just functions).

I don't know enough about JS to say, but you might be able to do something like if (!window.jQuery) { document.write("<script src=\"...\"><\/script>"); } as well.


I don't want to provide user stats for my site to Google for free.

Granted, if the user has the library already cached, I suppose Google sees nothing, but still.

Maybe HTML should be extended so that together with <link> tags you can state a hash of the content, so that the browser could recognize identical files. (Would be a bit dangerous, though).
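The proposal might look something like this (purely hypothetical markup -- no browser supports a hash attribute like this as of this thread, and the attribute name is invented):

```html
<!-- Hypothetical: the browser could treat any cached file with the same
     hash as identical, regardless of which server it came from. -->
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.2.6/jquery.min.js"
        hash="sha256-..."></script>
```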


Chances are, they're not tracking it. In fact, I'd be extremely surprised if they were.

You can use YUI hosted by Yahoo's CDN. We don't track those statistics because the data is simply overwhelmingly large and not very useful even if you could parse it. The goal is speed there, not data.

And, as you mention, many users will have it cached. Their cache headers invalidate the data anyhow, making that data not just expensive to gather, but completely worthless.


Why is the data not useful? If any content comes from Google, I suppose the Google cookie comes with it. And then you have a history of pages a user visits. Doesn't Yahoo do the same (only with a Yahoo cookie)?


No. First of all, the YUI data is served from yahooapis.net, not yahoo.com. Second, every effort is made to keep the request as small as possible, and cached for as long as possible.

1. A cookie would be a waste of bytes.

2. Since the headers instruct your browser to cache it for 10 years, it's quite likely that you won't even hit their server on the second page that uses this file.

Don't believe me? Check for yourself:

    curl -I http://yui.yahooapis.com/2.6.0/build/yahoo/yahoo-min.js


I have no cookies for googleapis.com, so no the Google cookie does not come with it. It's just the JS libs.


Even more useful than that, they get traffic data! I'd imagine traffic data has a heavy influence on page rank...


They don't get traffic data, though. They send a far-future expires header, so they don't even expect that the same user will request it more than once, unless their browser hits the limit on its cache size and deletes the file.

The goal is speed, not tracking.


PageRank is simply a measure of how many pages link to your pages, weighted by their own PageRank. Traffic has nothing (directly) to do with it.


I said "page rank", not "PageRank". When you perform a web search, you get a _ranked_ list of pages...


It's not like you don't already use Google Analytics, right?


I don't use Analytics, for the same reason. It also protects the privacy of my users.


Google Analytics really can slow down page loads in my experience.


This isn't a good idea.

I use NoScript and I rarely let it load 3rd-party JavaScript. The more times I have to click "temporarily allow ..." the more I hate your site. As a user, I don't like every site telling Google what I'm doing. Google is becoming the panopticon of the internet. Forcing users to load third-party content is a sign that you don't respect their privacy.

IE8 in private browsing mode will block 3rd-party scripts that are referenced from more than 10 (by default) sites. This includes all this Google stuff and Google Analytics.


How many users have even heard of NoScript or know what "3rd-party Javascript" is?

They just care that your page works and loads fast.

You are a vanishingly small minority.


I think you're a statistical anomaly.


As we serve local, county, and state governments using a subscription model, we don't run into the problem of random visitors with noscript enabled when viewing the site, but I can see how that would affect other sites.

As a side note, you can tie the CDN you create with Google App Engine to a domain, i.e. cdn.yoursite.com grabs from the App engine CDN.


Out of curiosity, do you purchase subscriptions to digital products or services on the Internet?


"IE8 in private browsing mode will block 3rd-party scripts that are referenced from more than 10 (by default) sites. This includes all this Google stuff and Google Analytics."

Is changing the URL slightly enough to circumvent this?


The cool idea in this is that universal libraries like jquery basically become locally cached...no download needed even if it is the first time they hit your site.


It would be even cooler if this were done by comparing a hash of the library's remote and local copy, so it could be on any server and not bound to the same domain.


I hope it is a cryptographically secure hash that STAYS cryptographically secure. Otherwise ten years down the road someone will figure out a way to force collisions and then, bam, you'll be running l33tcrew.ru's copy of Prototype with the access privileges of bankofamerica.com .

No problem fixing that one, all you have to do is roll out a patch to every standards-compliant browser in the world, oh, yesterday would have been good.


Or specify the hash type you'd like to see at the point you invoke the script. It can't retroactively protect pages that were published with the insecure hash, but it certainly would allow the author to update their site upon awareness of a hash vulnerability.


Sure, if jQuery's the only JavaScript file on the page. But if you use multiple plugins that depend upon jQuery, you're usually better off concatenating them all into one file, minifying that, and GZipping the content. You'll get slightly better compression than if you serve them separately, and don't need to worry about multiple HTTP requests.


Yes but you lose the benefit of the cache of jquery.min.js some people may already have when loading the first page of your site (the most important time for performance)


True, but they're not going to hit the cache anyway for the plugins, so download time will still be slow.


If you're already serving your other assets from a separate domain (which is a very good idea - and if you use a separate domain rather than a subdomain you can be sure you won't have any cookies included in incoming HTTP requests, speeding things up even further) then it might be better to serve jQuery from your own asset server, to save the additional DNS lookup. Worth benchmarking, anyway.


An extension of this idea is hosting your static content on Google App Engine (aka a cheap CDN for the initial stage of a startup!).


I have been using Google App Engine as a CDN to serve up images and some core js, as well as using Google Ajax Library to serve up MooTools for about 5 or 6 months now. It has worked out really really well. In our case, we needed to serve our clients with our in house servers running ESRI's ArcGIS server. Splitting the load between Google and our Arc servers cut the load time in half for some cases.


YUI has been doing the same thing (hosting their JS professionally), and it does really make sense.

We've been doing this for years at fluther.com and it's worked out great. We have a switch for using a local copy when we're offline (on an airplane), but in general it adds speed and saves bandwidth.

This was always a nice benefit of YUI, and I'm glad to see there's an option with JQuery, too. (I'm planning to switch at some point--we went with YUI in the first place because it had the only stable autocomplete plugin but that's long since changed.)


1 reason why not to: loss of control.

Perhaps I'm being paranoid, but I do not want to rely on ANY external providers - not even when it's Google.


Do you provide your own internet bandwidth? You have to rely on external providers at some point and in all likelihood Google is better at keeping the lights on than you are.


I've never really understood that argument. Assuming that your site is totally down when your server goes down, and that you rely on jQuery so heavily that losing it essentially renders your site unusable, it doesn't seem to matter how much better Google is at delivering content than you are. If you can manage an uptime greater than 50%, the likelihood is that your server and Google will be down at different times, so the downtime is additive.


I think the argument is that google is an order of magnitude or so better at downtime than you are, so any google-created downtime is lost in the noise.


True, Google is probably better at keeping their servers up... but I prefer to reduce my external dependencies anyway.

Right now, I only have to rely on my hosting partner - which means I'm dependent on just ONE party. If something goes wrong, I call 'em and kick their lazy asses :) Chances are that things are up and running in no time.

Remember when Amazon's cloud storage went offline last July? This meant all kinds of websites relying on that service were offline as well. No-one knew what was happening, why it was happening, and when it would be back online. I don't need that - seriously :)


Of course, you could mirror all your resources locally and quickly switch over if Google were to go down.


If you're really that paranoid, monitor and verify. If there are issues, you can change hosting options. You don't mitigate risk by doing everything yourself.


What? How are you supposed to monitor and verify that google is up from the point of view of the client?

If you do it locally it's either fully working, or fully down, and you don't have javascript errors when parts work and parts don't.


If you're worried about Google's reliability, you could always test for typeof(jQuery) !== 'undefined' before assuming it had loaded. Then, you can inject a script element referencing a copy of jQuery on your server if it hadn't.
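A sketch of that fallback (the local path /js/jquery.min.js is an assumed location on your own server, and the helper function name is made up):

```javascript
// Sketch: right after the Google-hosted <script> tag, check whether
// jQuery actually loaded; if not, write out a tag pointing at a copy
// on your own server. The local path is an assumption -- adjust it.
function fallbackTag(jqueryLoaded) {
  return jqueryLoaded ? null : '<script src="/js/jquery.min.js"><\/script>';
}

// In the page, immediately after the CDN script tag:
var tag = fallbackTag(typeof jQuery !== 'undefined');
if (tag && typeof document !== 'undefined') {
  document.write(tag);
}
```

Because script tags load and execute synchronously in order, by the time this runs the CDN copy has either loaded or failed.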

In practice, I would think the chances of Google's entire CDN going down are the very least of your worries though.


That's quite an interesting idea, but I'd be worried about it running too soon; the browser doesn't give you a "gave up on downloading something" event.

Google goes down more often than you think - it's not usually that Google itself is down, but rather that a DNS lookup failed, or some annoying routing issue got in the way.

It's happened to me, so I know it does happen.


This is a terrible idea if the security of your site matters, because you're loading JavaScript from an untrusted, insecure source.


Dude... it's Google


NO IT ISN'T. It's any of the computers between you and Google that influence where your unencrypted HTTP connection is going to land.


Ah I see what you're saying. Good point sir.


I think it's unlikely you'll see increased parallelism usually. All but the newest browsers (IE8, Safari 4, FF 3.1) wait until they encounter script tags before they download them, rather than looking ahead to download multiple scripts in parallel.

http://stevesouders.com/ua/


And use CloudFront for the other JS libraries your site must use: http://paulstamatiou.com/2008/12/08/how-to-getting-started-w...


Answer: it's faster




