Tuesday, March 26, 2013

Backbone.JS and SEO: Google Ajax Crawling Scheme


Most search engines hate client side MVC, but luckily there are a few tools around to get your client side routes indexed.

As most web bots (e.g. Google's) don't interpret JavaScript on the fly, they fail to parse JavaScript-rendered content for indexing. To overcome this, Google (and now Bing) support the 'Google Ajax Crawling Scheme' (https://developers.google.com/webmasters/ajax-crawling/docs/getting-started), which basically states that if you want JS-rendered DOM content to be indexed (e.g. rendered ajax call results), you must be able to:
  1. Trigger a page state (JavaScript rendering) via the URL using hashbangs #! (e.g. http://www.mysite.com/#!my-state), and
  2. Serve a rendered DOM snapshot of your site AFTER JavaScript modification on request.
If you're using a client side MVC framework like Backbone.js, or simply have a JavaScript-heavy page, and want its various states indexed, you will need to provide this DOM snapshotting service server side. Typically this is done with a headless browser (e.g. Qt WebKit, PhantomJS, Zombie.js, HtmlUnit). A sketch of the request flow appears below.
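To see how this works in practice: when a bot supporting the scheme encounters http://www.mysite.com/#!my-state, it instead requests http://www.mysite.com/?_escaped_fragment_=my-state, and your server is expected to answer with the rendered snapshot. Here's a minimal Rack sketch of just the detection step - the SnapshotMiddleware class and render_snapshot helper are hypothetical placeholders for illustration, not part of any library:

require 'rack'

# Hypothetical sketch - intercepts crawler requests made under the scheme
# and serves a pre-rendered DOM snapshot instead of the raw page.
class SnapshotMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    request = Rack::Request.new(env)

    # Bots translate http://www.mysite.com/#!my-state into
    # http://www.mysite.com/?_escaped_fragment_=my-state
    if request.params.key?('_escaped_fragment_')
      state = request.params['_escaped_fragment_']
      [200, { 'Content-Type' => 'text/html' }, [render_snapshot(state)]]
    else
      @app.call(env) # regular users get the normal JS-driven page
    end
  end

  private

  # Placeholder - in practice this would load the requested state in a
  # headless browser, wait for JavaScript to finish rendering, and return
  # the resulting DOM as HTML.
  def render_snapshot(state)
    "<html><body>snapshot for state: #{Rack::Utils.escape_html(state)}</body></html>"
  end
end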

For those using Ruby server side, there's a gem that already handles this, google_ajax_crawler, available on RubyGems:

gem install google_ajax_crawler


It's used as Rack middleware: it intercepts requests from web bots adhering to the scheme, renders your site server side, and delivers the resulting DOM back to the requesting bot as a snapshot.

A simple rack app (config.ru) demonstrating how to configure and use the gem is sketched below. Note the configuration options shown follow the gem's README at the time of writing, so check the current docs before relying on them:
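# config.ru
require 'google_ajax_crawler'

# The crawler middleware watches for requests carrying the
# _escaped_fragment_ parameter and answers them with a DOM snapshot
# rendered by a headless browser. Exact option names and the driver
# API may differ between gem versions - treat this as a sketch.
use GoogleAjaxCrawler::Crawler do |config|
  # how often to re-check whether the page has finished rendering
  config.poll_interval = 0.5

  # your page signals it has finished rendering however you choose -
  # here we assume the app removes a #loading element when done
  # (this test is an example, not something the gem mandates)
  config.page_loaded_test = lambda do |driver|
    driver.page.evaluate_script('document.getElementById("loading") == null')
  end
end

# The app itself - a trivial endpoint serving the client side Backbone app
class App
  def self.call(env)
    [200, { 'Content-Type' => 'text/html' }, [File.read('public/index.html')]]
  end
end

run App

When a request like /?_escaped_fragment_=my-state arrives, the middleware loads the corresponding #!my-state page in the headless browser, waits until page_loaded_test returns true, and returns the rendered HTML. All other requests fall through to App untouched.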
