The goal is to detect Uber pickup hotspots in New York City so that drivers can be where they are needed.
The data comes from the Uber Pickups in New York City dataset on Kaggle and is restricted to May 2014.
Outliers are first removed using DBSCAN. Hotspots are then determined using KMeans with 9 centroids. The number of centroids is chosen by comparing the silhouette score and inertia of KMeans models trained on different numbers of clusters. The clusters are computed on an hour-by-hour and day-by-day basis.
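As a rough illustration of this pipeline, here is a minimal sketch using scikit-learn. It is not the project's actual code: the synthetic coordinates, the DBSCAN parameters (eps, min_samples), the range of cluster counts, and the helper names remove_outliers and score_cluster_counts are all assumptions made for the example. On this toy data the silhouette score favors 3 clusters; on the real May 2014 pickups the analysis settled on 9.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import silhouette_score


def remove_outliers(coords, eps=0.01, min_samples=10):
    """Drop points that DBSCAN labels as noise (label -1).

    eps and min_samples are illustrative values, not tuned ones.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    return coords[labels != -1]


def score_cluster_counts(coords, k_values, random_state=0):
    """Fit one KMeans model per k and record its inertia and silhouette score."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(coords)
        scores[k] = {
            "inertia": model.inertia_,
            "silhouette": silhouette_score(coords, model.labels_),
        }
    return scores


# Synthetic stand-in for (Lat, Lon) pickups: three dense blobs plus three stray points.
rng = np.random.default_rng(0)
centers = np.array([[40.75, -73.98], [40.65, -73.78], [40.77, -73.87]])
blobs = np.vstack([c + rng.normal(scale=0.005, size=(200, 2)) for c in centers])
strays = np.array([[41.5, -72.5], [39.9, -75.0], [41.2, -74.9]])
coords = np.vstack([blobs, strays])

# Step 1: remove outliers. Step 2: compare candidate cluster counts.
clean = remove_outliers(coords)
scores = score_cluster_counts(clean, k_values=range(2, 7))

# Pick the k with the highest silhouette score; inertia can be inspected
# alongside it (for example with an elbow plot) before settling on a value.
best_k = max(scores, key=lambda k: scores[k]["silhouette"])

# Step 3: the final centroids are the hotspot locations.
hotspots = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(clean).cluster_centers_
```

To get hour-by-hour or day-by-day hotspots, the same steps can be run separately on each subset of pickups for a given hour or weekday.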
Clusters are displayed on an interactive Plotly graph.