Analyzing NYC 2013 taxi data

It all started after I saw this post on Hacker News:
Thanks to Chris Wong for FOILing the data, that is, obtaining it through a Freedom of Information Law request.
Off-topic: I just checked his site and found that he has FOILed another dataset. Awesome!

Back to the original topic: it’s a HUGE dataset. 173 million records!

Inspired by Chris’s work, I decided to give it a try and created a single-page web application. It shows a heat map of the top pickup and drop-off locations in NYC. Currently the city is divided into its five boroughs (Manhattan, Brooklyn, Staten Island, Queens and the Bronx), and each borough shows its most frequented locations.
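
To make "top locations" concrete, here is a minimal in-memory sketch of the idea: bin pickup points into a coarse latitude/longitude grid and keep the busiest cells. This is only an illustration. The series does this work in MongoDB with Map-Reduce and geospatial queries, and the function name topLocations, the field names lat and lng, and the 3-decimal grid size are all assumptions made for this example.

```javascript
// Hypothetical helper (not from the project): counts points per grid cell
// and returns the busiest cells. Field names `lat` and `lng` are assumed.
function topLocations(points, topN = 3, precision = 3) {
  const counts = new Map();
  for (const { lat, lng } of points) {
    // Rounding to 3 decimals groups points within roughly 100 m of each other.
    const key = `${lat.toFixed(precision)},${lng.toFixed(precision)}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Sort cells by descending count and keep the first topN.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN)
    .map(([key, count]) => {
      const [lat, lng] = key.split(",").map(Number);
      return { lat, lng, count };
    });
}
```

Each returned cell carries a coordinate and a count, which is the kind of weighted point a heat map layer typically consumes. With 173 million records this would not be done in application memory, which is why the later parts push the aggregation into the database.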

I will share my work here in several parts:
1) Preparing the dataset using MongoDB
2) Creating the Map-Reduce jobs
3) Preparing geospatial queries
4) Using Node.js to provide a REST interface
5) Finally, using Backbone.js to create the single-page application

The code is on GitHub:
You can have a look at the technology stack here:

Note: These articles are not aimed at beginners. They assume some knowledge of Backbone.js and MongoDB.