Identification of spatiotemporal hot spots for taxis in New York City using Yellow Taxi data for January 2015

chatla92/Hot-Spot-Analysis-Of-Large-Scale-Spatio-temporal-Data-Using-Spark

Hot Spot Analysis Of Large Scale Spatio-temporal Data Using Spark

Input: A collection of New York City Yellow Cab taxi trip records for January 2015. The source data may be clipped to an envelope encompassing the five New York City boroughs to remove noisy or erroneous records (e.g., latitude 40.5N – 40.9N, longitude 73.7W – 74.25W).
Output: A list of the fifty most significant hot spot cells in time and space, as identified using the Getis-Ord statistic.
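
For reference, the Getis-Ord statistic is commonly written in the form below. This is the standard textbook form, not something taken from the repository's code. Here x_j is the trip count of cell j, w_ij is the weight between cells i and j, and n is the total number of cells. The README does not say which weights it uses, so w_ij is left general.

```math
G_i^* = \frac{\sum_{j=1}^{n} w_{ij} x_j - \bar{X} \sum_{j=1}^{n} w_{ij}}
             {S \sqrt{\dfrac{n \sum_{j=1}^{n} w_{ij}^2 - \left(\sum_{j=1}^{n} w_{ij}\right)^2}{n - 1}}},
\qquad
\bar{X} = \frac{1}{n}\sum_{j=1}^{n} x_j,
\qquad
S = \sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^2 - \bar{X}^2}
```

The resulting G_i^* value is a z-score: large positive values indicate cells whose neighborhood has unusually many trips.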

Methodology:

  • JavaPairRDDs are formed from the input data

[Figure: input to JP]
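
A minimal plain-Java sketch of this step: it maps each trip to a space-time cell key and counts trips per cell. The class `TripCells`, the `"x,y,t"` key format, the 0.01 degree cell size, and the one-day time step are all assumptions made for illustration; only the envelope bounds come from the text above. In a Spark job the same key function would typically be called inside `mapToPair`, with `reduceByKey` doing the counting.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the repository.
class TripCells {
    // Envelope from the problem statement (west longitudes are negative).
    static final double MIN_LAT = 40.5, MAX_LAT = 40.9;
    static final double MIN_LON = -74.25, MAX_LON = -73.7;
    // Assumed cell size in degrees; the README does not state one.
    static final double CELL = 0.01;

    /** Returns the key "x,y,t" for a pickup inside the envelope, or null if it lies outside. */
    static String cellKey(double lat, double lon, int dayOfMonth) {
        if (lat < MIN_LAT || lat > MAX_LAT || lon < MIN_LON || lon > MAX_LON) {
            return null; // clipped as noise
        }
        int x = (int) Math.floor((lon - MIN_LON) / CELL);
        int y = (int) Math.floor((lat - MIN_LAT) / CELL);
        int t = dayOfMonth - 1; // assumed time step: one day, zero-based
        return x + "," + y + "," + t;
    }

    /** Counts trips per cell; each trip is {latitude, longitude, dayOfMonth}. */
    static Map<String, Integer> countByCell(List<double[]> trips) {
        Map<String, Integer> counts = new HashMap<>();
        for (double[] trip : trips) {
            String key = cellKey(trip[0], trip[1], (int) trip[2]);
            if (key != null) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<double[]> trips = Arrays.asList(
            new double[] {40.755, -73.985, 1},
            new double[] {40.756, -73.986, 1},
            new double[] {40.300, -73.985, 1}  // outside the envelope, dropped
        );
        System.out.println(countByCell(trips));
    }
}
```

Keying on a single string keeps the example short; a real job might prefer a small key class or a tuple so that neighbor lookups do not need string parsing.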

  • The Getis-Ord statistic is calculated from the RDDs formed above

[Figure: z-stat calculate]
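
The z-score calculation can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the repository's implementation: the class `GetisOrd`, the `"x,y,t"` key format, and the binary weighting (1 for a cell and its adjacent cells in space and time, 0 otherwise) are all hypothetical choices. Cells missing from the map are treated as having zero trips.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the repository's code.
class GetisOrd {
    /** Key format "x,y,t" is an assumption made for this example. */
    static String key(int x, int y, int t) {
        return x + "," + y + "," + t;
    }

    /**
     * Computes the Getis-Ord z-score of cell (x, y, t) on an nx * ny * nt grid.
     * Assumed weights: 1 for the cell itself and each adjacent cell inside the grid, 0 otherwise.
     */
    static double zScore(Map<String, Integer> counts, int nx, int ny, int nt,
                         int x, int y, int t) {
        long n = (long) nx * ny * nt;
        double sum = 0, sumSq = 0;
        for (int c : counts.values()) {
            sum += c;
            sumSq += (double) c * c;
        }
        double mean = sum / n;
        double s = Math.sqrt(sumSq / n - mean * mean);

        double neighborSum = 0; // sum of w_ij * x_j
        int w = 0;              // sum of w_ij (equals sum of w_ij^2 for binary weights)
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                for (int dt = -1; dt <= 1; dt++) {
                    int px = x + dx, py = y + dy, pt = t + dt;
                    if (px < 0 || px >= nx || py < 0 || py >= ny || pt < 0 || pt >= nt) {
                        continue; // neighbor falls outside the grid
                    }
                    w++;
                    neighborSum += counts.getOrDefault(key(px, py, pt), 0);
                }
            }
        }
        double numerator = neighborSum - mean * w;
        double denominator = s * Math.sqrt(((double) n * w - (double) w * w) / (n - 1));
        return numerator / denominator;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put(key(0, 0, 0), 64); // all trips in one corner cell
        System.out.println(zScore(counts, 4, 4, 4, 0, 0, 0));
    }
}
```

For brevity the sketch recomputes the mean and standard deviation on every call; a job over the full grid would compute them once and reuse them for every cell. Sorting the cells by z-score in descending order and taking the first fifty would give the output described above.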

Results:

The results of the analysis are available in Results.

A heat map of the results is available in Heat Map.
