A Jupyter notebook for exploratory data analysis using machine learning techniques. This was made as an exercise for GLY6932 (Data Science and Machine Learning Methods in the Geosciences) at the University of Florida.
The purpose of this notebook is to explore which factors may control the formation of ε-Fe2O3 in North American clinker deposits. To do this, I used machine learning techniques (principal component analysis, K-means clustering, and a random forest classifier) via the scikit-learn library.
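For readers unfamiliar with these three techniques, the sketch below shows roughly how they can be chained together in scikit-learn. It is not the code from the notebook: the data are synthetic random numbers, the four features and the binary label are made up for illustration, and the parameter choices (two components, two clusters, 100 trees) are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the clinker dataset): 60 samples with
# 4 made-up features, drawn from two groups with shifted means.
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(30, 4)),
    rng.normal(loc=3.0, scale=1.0, size=(30, 4)),
])
# Hypothetical binary label, e.g. "phase of interest absent/present".
y = np.array([0] * 30 + [1] * 30)

# Standardize so that no single feature dominates PCA or K-means.
X_scaled = StandardScaler().fit_transform(X)

# Principal component analysis: project onto the first two components.
pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)

# K-means clustering on the PCA scores (unsupervised grouping).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
clusters = kmeans.fit_predict(scores)

# Random forest classifier: the feature importances hint at which
# variables best separate the labelled classes.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_scaled, y)
importances = forest.feature_importances_

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Cluster sizes:", np.bincount(clusters))
print("Feature importances:", importances)
```

In a workflow like this, the PCA scores and cluster labels are typically plotted to look for groupings, while the random forest importances suggest which measured variables are most associated with the label.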
See an interactive version of the notebook below:
Simply launch the Binder (be patient while it loads), select the notebook, and run the cells. Markdown cells have been added to the notebook to walk you through the results.
Alternatively, you can view the code and outputs of the notebook here on GitHub.
The data used in this exercise is a combination of previously published data (Sprain et al., 2021) and unpublished data. The unpublished data is part of a manuscript that is currently in preparation (unrelated to this exercise).