Group project analysing New York taxi data with PySpark (awarded 20/20 points)
- Identifying where to set up bus routes
- Multinomial logistic regression to classify into no tips / low tips / high tips
- Helping taxi drivers decide where in the city to go next
- K-means clustering to decide where to place taxi stands
- PageRank algorithm to find important traffic nodes
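
To illustrate the PageRank item above, here is a minimal plain-Python sketch of the power iteration on a tiny trip graph. It is not the project's PySpark code: the zone names, the trips, the `pagerank` helper, and the damping factor of 0.85 are all assumptions made up for the example. Each trip is treated as a directed edge from pickup zone to drop-off zone, and repeated trips simply add weight to that edge.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node first receives the "teleport" share.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node without outgoing edges spreads its rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical trips between made-up zones (pickup zone, drop-off zone).
trips = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("D", "C")]
ranks = pagerank(trips)
busiest = max(ranks, key=ranks.get)
print(busiest, round(ranks[busiest], 3))
```

Zones with a high rank are those that many trips lead to, directly or through other well-connected zones, which is one way to read "important traffic nodes". On the full dataset the same iteration would need a distributed implementation rather than in-memory dictionaries.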