We created a report on which cryptocurrencies are on the trading market and how they could be grouped to create a classification system for this new investment. The data Martha provided was not ideal, so we processed it to fit the machine learning models. Since there is no known output for what Martha is looking for, we decided to use unsupervised learning. To group the cryptocurrencies, we agreed with Martha on a clustering algorithm. We used data visualizations to share our findings with the board.
Data: Crypto_data.csv
Software: Python 3.7.7, Anaconda Navigator 1.9.12, Conda 4.8.4, Jupyter Notebook 6.0.3
- Preprocessing the Data for PCA: Preprocessed the dataset so that PCA could be performed on it.
- Reducing Data Dimensions Using PCA: Applied the Principal Component Analysis (PCA) algorithm to reduce the dimensions of the X DataFrame to three principal components, and placed these components in a new DataFrame.
- Clustering Cryptocurrencies Using K-means: Ran the K-means algorithm on the PCA data to predict the K clusters for the cryptocurrencies.
- Visualizing Cryptocurrencies Results: Created scatter plots with Plotly Express and hvplot, and created a table of all the currently tradable cryptocurrencies using the hvplot.table() function.
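
The first three steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the synthetic `crypto_df`, its column names, the one-hot encoding and scaling choices, and the value `k = 4` are all assumptions made so that the example runs without `Crypto_data.csv`. The plotting step is left out.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the cleaned cryptocurrency data.
# Column names and values are assumptions, not taken from Crypto_data.csv.
rng = np.random.default_rng(0)
n_coins = 60
crypto_df = pd.DataFrame(
    {
        "Algorithm": rng.choice(["SHA-256", "Scrypt", "X11"], size=n_coins),
        "ProofType": rng.choice(["PoW", "PoS", "PoW/PoS"], size=n_coins),
        "TotalCoinsMined": rng.uniform(1e6, 1e9, size=n_coins),
        "TotalCoinSupply": rng.uniform(1e6, 1e10, size=n_coins),
    },
    index=[f"COIN{i}" for i in range(n_coins)],
)

# Preprocessing: one-hot encode the text columns, then standardize
# every feature so that no single column dominates the PCA.
X = pd.get_dummies(crypto_df, columns=["Algorithm", "ProofType"], dtype=float)
X_scaled = StandardScaler().fit_transform(X)

# Dimensionality reduction: keep three principal components
# and store them in a new DataFrame.
pca = PCA(n_components=3)
pcs_df = pd.DataFrame(
    pca.fit_transform(X_scaled),
    columns=["PC 1", "PC 2", "PC 3"],
    index=crypto_df.index,
)

# Clustering: run K-means on the PCA data. k = 4 is an assumed value.
k = 4
model = KMeans(n_clusters=k, n_init=10, random_state=0)
clustered_df = pcs_df.assign(Class=model.fit_predict(pcs_df))

print(clustered_df.head())
```

The resulting `clustered_df` has one row per coin, with its three principal components and an assigned cluster label, which is the shape of data a 3D scatter plot or a table view would need.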
There are 532 tradable cryptocurrencies. In unsupervised learning, a dataset is provided without labels, and a model learns useful properties of the structure of the dataset. Compared to supervised learning models, unsupervised models have less information about the data. Unsupervised learning involves grouping similar examples together (in this case, similar cryptocurrencies), as well as dimensionality reduction and density estimation.