Exploring New York City taxi trails and sharing our way to a more sustainable urban future

With an ever-increasing trove of real-time urban data streams, we are able to see precisely where, how, and at what times different parts of our cities become stitched together as hubs of mobility. By using these pervasive, interconnected, and "smart" technologies, we can begin to unravel the complexity of our travel patterns and identify how we can reduce the social and environmental costs embedded in our transportation systems. In HubCab we target taxicab services as a way to understand the linkages between our travel habits and the places we travel to and from most often.

HubCab is an interactive visualization that invites you to explore the ways in which over 170 million taxi trips connect the City of New York in a given year. The interface offers a view of the city's inner workings from the previously invisible perspective of the taxi system, at a never-before-seen granularity. HubCab allows you to investigate exactly how and when taxis pick up or drop off individuals and to identify zones of condensed pickup and dropoff activities. It allows you to navigate to the places where your taxi trips start and end and to discover how many other people in your area follow the same travel patterns. What do these visualizations tell us about collective mobility? How many of these cabs might you have been able to share with the people around you? And how might entertaining these questions be the first step in building a more efficient and cheaper taxi service?

The Science of Sharing

The HubCab tool expands and changes the perception of urban space using a large-scale data set. Based on this data, we demonstrate in a scientific study [1] the vast potential of taxi shareability. Our analysis introduces the novel concept of "shareability networks", which allows for efficient modeling and optimization of trip-sharing opportunities. This mathematical approach makes use of network densification effects and represents a substantial advance over the existing state-of-the-art solutions to social sharing problems. Such a shared system is expected to lead to less congestion in road traffic, lower running costs and split fares, and a cleaner, less polluted environment [2].

The sharing benefits displayed on the map refer to total fare savings to passengers, distance savings in traveled miles, and CO2 emission savings in kg of CO2 that come from potentially shared trips. Our research [1] shows that taxi sharing could reduce the number of trips by 40% with only minimal inconvenience to the passengers. Here we assume this 40% shareability rate, together with the following highly simplifying assumptions: a fare of $3.00 + $2.50/mi [3], using Rate Code 1 and not accounting for slow-motion fares or special surcharges, and average CO2 emissions of 423 g/mi [4]. Traveled distance is simplified as linear distance.
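As a rough illustration of how these assumptions combine, the sketch below estimates the three benefit figures for a flow of trips between two locations. It is not the HubCab implementation: the function name, the returned keys, and the reading that 40% of trips are saved at the full per-trip fare and emission rate are assumptions made for this example.

```python
# Constants taken from the assumptions stated above.
BASE_FARE_USD = 3.00       # flat part of the fare [3]
FARE_PER_MILE_USD = 2.50   # distance part of the fare [3]
CO2_G_PER_MILE = 423       # average emissions [4]
SHAREABILITY = 0.40        # assumed share of trips that can be saved [1]


def sharing_benefits(num_trips, linear_miles):
    """Estimate savings for num_trips trips of linear_miles each.

    Hypothetical helper: assumes every saved trip avoids its full
    fare, distance, and emissions.
    """
    saved_trips = SHAREABILITY * num_trips
    distance_mi = saved_trips * linear_miles
    fare_usd = saved_trips * (BASE_FARE_USD + FARE_PER_MILE_USD * linear_miles)
    co2_kg = distance_mi * CO2_G_PER_MILE / 1000.0
    return {"fare_usd": fare_usd, "distance_mi": distance_mi, "co2_kg": co2_kg}


if __name__ == "__main__":
    # 1,000 trips of 2 linear miles each between two hypothetical locations.
    print(sharing_benefits(1000, 2.0))
```

For 1,000 trips of 2 miles each, this gives 400 saved trips, 800 saved miles, about $3,200 in fares, and roughly 338 kg of CO2.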

[1] P. Santi, G. Resta, M. Szell, S. Sobolevsky, S. Strogatz, C. Ratti. Taxi pooling in New York City: a network-based approach to social sharing problems (2013)
[2] M. Szell, B. Groß. Hubcab - Taxi-Fahrgemeinschaften, digital erkundet. Die Stadt entschlüsseln, Bauwelt Fundamente, Birkhäuser. Eds: D. Offenhuber, C. Ratti (2013)
[3] NYC Taxi & Limousine Commission. Taxi Rate of Fare
[4] U.S. Environmental Protection Agency. Greenhouse Gas Emissions from a Typical Passenger Vehicle

For updates on HubCab subscribe to the SENSEable City Lab newsletter.

Press Downloads

Download press release
Download visual material
Download hi-res video

The material on this web site can be used freely in any publication provided that
1. it is duly credited as a project by the MIT Senseable City Lab
2. a PDF copy of the publication is sent to senseable-contacts@mit.edu


Screenshot of HubCab, showing pickups and drop offs of all 170 million taxi trips over one year in New York City.

Screenshot of HubCab, showing taxi flows and potential taxi sharing benefits between two locations in Manhattan.


Screenshot of HubCab, highlighting all taxi dropoff points in New York City of passengers who were picked up at Times Square daily between 12 PM and 3 PM.

Screenshot of HubCab, showing all taxi pickups and drop offs at JFK airport daily between 3 AM and 6 AM.


Technical Development

The basis of the HubCab tool is a data set of over 170 million taxi trips of all 13,500 Medallion taxis in New York City in 2011. The data set contains GPS coordinates of all pickup and drop off points and corresponding times.

Cartographic data of street shapes were obtained from OpenStreetMap. The streets were cut into over 200,000 street segments of 40 m length each, using a Python script and the Shapely library, and imported into a MongoDB database. Pickup and drop off points were matched to the closest street segments. Street types unlikely to contain taxi drop offs or pickups, such as footpaths, trunk roads, and service roads, were not used in the matching process. Line widths of yellow and blue street segments on low zoom levels were styled on a logarithmic scale. The pickup and drop off points, represented as dots on the high zoom levels, were generated via an ArcPy script, which placed each dot randomly within a box around a given street segment, with the box width again following a logarithmic scale. GPX files of the dots were styled using Maperitive, then merged and amended for different zoom levels. The dot and street line files were layered together with MapBox, the platform that streams all the map content.
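The segmenting, matching, and logarithmic styling steps can be sketched as follows. This is a dependency-free stand-in written for illustration, not the original script, which used Shapely. It assumes street coordinates are already projected to metres. The function names and the width formula (a base width plus the base-10 logarithm of the trip count) are hypothetical.

```python
import math

SEGMENT_LENGTH_M = 40.0  # segment length stated in the text


def cut_polyline(points, seg_len=SEGMENT_LENGTH_M):
    """Cut a polyline [(x, y), ...] in metres into pieces of at most seg_len."""
    pieces, current, remaining = [], [points[0]], seg_len
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        edge = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already consumed along this edge
        while edge - pos > remaining + 1e-9:
            pos += remaining
            t = pos / edge
            cut = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            current.append(cut)
            pieces.append(current)
            current, remaining = [cut], seg_len
        current.append((x1, y1))
        remaining -= edge - pos
        if remaining <= 1e-9:  # piece is exactly full at this vertex
            pieces.append(current)
            current, remaining = [(x1, y1)], seg_len
    if len(current) > 1:
        pieces.append(current)  # shorter leftover piece at the end
    return pieces


def dist_point_segment(p, a, b):
    """Euclidean distance from point p to the straight segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    denom = dx * dx + dy * dy
    t = 0.0 if denom == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / denom))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_piece(p, pieces):
    """Index of the piece closest to point p (brute force, no spatial index)."""
    def piece_distance(i):
        return min(dist_point_segment(p, a, b) for a, b in zip(pieces[i], pieces[i][1:]))
    return min(range(len(pieces)), key=piece_distance)


def line_width(trip_count, base=0.5, scale=1.0):
    """Hypothetical logarithmic styling rule for segment line widths."""
    return base + scale * math.log10(1 + trip_count)


if __name__ == "__main__":
    street = [(0.0, 0.0), (100.0, 0.0)]  # a 100 m straight street
    pieces = cut_polyline(street)        # 40 m, 40 m, and a 20 m remainder
    print(len(pieces), nearest_piece((50.0, 5.0), pieces), line_width(999))
```

The brute-force search is only practical for small inputs. With over 200,000 segments and 170 million points, a spatial index (for example an R-tree or a geospatial database index) would be the natural choice.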

The data back end of HubCab runs on MongoDB, which stores all street segments and their coordinates, and all flows between each pair of street segments. The number of all possible street segment pairs is 40 billion (200,000 times 200,000) per map. Radius selection is dynamic, using MongoDB's $near operator to obtain flows from all segments within the radius of the pickup marker to all segments within the radius of the drop off marker. With nine maps (one for the yearly data, eight for 3-hour time segments on all Fridays/Saturdays) and three selectable radii, there is a total of over one trillion flow combinations that can be explored with HubCab. Communication between MongoDB and the front end is realized via PHP scripts and JavaScript with JSONP.
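The counts above can be checked with a few lines of arithmetic. The sketch also shows what a radius query document using $near might look like. The field name "loc", the GeoJSON form of the query (which requires a 2dsphere index), and the example radius are assumptions. The original back end may have used legacy coordinate pairs and a different schema.

```python
SEGMENTS = 200_000  # approximate number of street segments
MAPS = 9            # one yearly map plus eight 3-hour time windows
RADII = 3           # selectable radii

pairs_per_map = SEGMENTS * SEGMENTS               # 40 billion segment pairs
total_combinations = pairs_per_map * MAPS * RADII  # about 1.08 trillion


def near_query(lng, lat, radius_m, field="loc"):
    """Build a MongoDB filter for documents within radius_m of a point.

    Hypothetical helper: "loc" is an assumed field name holding a
    GeoJSON point and indexed with a 2dsphere index.
    """
    return {
        field: {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lng, lat]},
                "$maxDistance": radius_m,
            }
        }
    }


if __name__ == "__main__":
    print(f"{pairs_per_map:,} pairs per map")
    print(f"{total_combinations:,} flow combinations")
    # Segments within an assumed 100 m of a marker near Times Square.
    print(near_query(-73.9855, 40.7580, 100))
```

A dictionary like this could be passed as the filter to a driver's find call, once for the pickup marker and once for the drop off marker, with the stored flows between the two result sets then summed.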