This project uses PySpark and RDDs (Resilient Distributed Datasets) to run SQL queries in parallel for data analysis. The Plotly plots may not render in the GitHub notebook viewer.
To view the plots, either download the HTML file and open it in a browser as a static page, or download the notebook and open it in a notebook environment that supports Plotly.