This project retrieves mineral data from the OpenMindat API and cleans it into a data structure designed for a 3D heat map, which is used to analyze correlations between minerals and elements.
If you are interested in using the Mindat API, please refer to https://github.com/ChuBL/How-to-Use-Mindat-API.
You will need an api_key.txt file in the root directory of your cloned repository to run the code. The API key is not included in this repository. Please contact the Mindat administrators to obtain one.
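As a minimal sketch of how a script might read the key, the helper below loads api_key.txt and strips surrounding whitespace. The function name load_api_key is hypothetical, and the sketch assumes the file holds nothing but the key as plain text; the project's own scripts may read it differently.

```python
from pathlib import Path


def load_api_key(path="api_key.txt"):
    """Return the key stored in `path` (hypothetical helper, not part of this repository).

    Assumes the file contains only the API key as plain text.
    """
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"Expected an API key file at {key_file.resolve()}")
    key = key_file.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{key_file} is empty")
    return key
```

Failing early with a clear message when the file is missing or empty avoids a less obvious authentication error later in the run.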
The whole data pipeline is wrapped in mindat_data_processor.py. Running this single .py file walks through every step, from data retrieval to data export.
The retrieved data are saved in ./mindat_data/raw_data, with file names in the format mindat_items_IMA_00000000000000.json.
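To inspect a retrieved file outside the pipeline, something like the following sketch could be used. The helper names are hypothetical, and it assumes the numeric suffix in the file name has a fixed width, so that sorting by name matches sorting by number. It makes no assumption about the structure of the JSON itself.

```python
import json
from pathlib import Path


def last_raw_file_by_name(raw_dir="./mindat_data/raw_data"):
    """Return the raw data file whose name sorts last, or None if there is none.

    Assumes a fixed-width numeric suffix, so name order equals numeric order.
    """
    files = sorted(Path(raw_dir).glob("mindat_items_IMA_*.json"))
    return files[-1] if files else None


def load_raw_json(path):
    """Parse one raw data file and return whatever JSON value it contains."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```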
The exported CSV files are saved in ./mindat_data/csv/.
This directory contains 8 generated datasets derived from OpenMindat IMA-approved mineral species:
- 30_elements.csv
  Element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering the 30 most frequent elements in the Mindat attribute elements.
- 30_sigelements.csv
  Element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering the 30 most frequent elements in the Mindat attribute sigelements.
- normalized_30_elements.csv
  Normalized element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering the 30 most frequent elements in the Mindat attribute elements.
- normalized_30_sigelements.csv
  Normalized element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering the 30 most frequent elements in the Mindat attribute sigelements.
- 73_elements.csv
  Element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering all elements in the Mindat attribute elements.
- 73_sigelements.csv
  Element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering all elements in the Mindat attribute sigelements.
- normalized_73_elements.csv
  Normalized element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering all elements in the Mindat attribute elements.
- normalized_73_sigelements.csv
  Normalized element co-occurrence 3D matrix, stored as concatenated 2D matrices, covering all elements in the Mindat attribute sigelements.
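To illustrate what a 3D co-occurrence matrix stored as concatenated 2D matrices can look like, here is a small sketch in plain Python. It is an illustration under stated assumptions, not the code in mindat_data_processor.py: it assumes that entry [i][j][k] counts the minerals containing elements i, j and k together, and that the 2D slices are stacked vertically in the CSV. The function names and the toy input are hypothetical, and the actual files may use a different counting rule or layout.

```python
from itertools import product


def cooccurrence_3d(minerals, elements):
    """Count, for every element triple (i, j, k), the minerals containing all three.

    `minerals` is an iterable of element-symbol lists (one list per mineral);
    `elements` fixes the axis order. Both are illustrative inputs.
    """
    index = {el: i for i, el in enumerate(elements)}
    n = len(elements)
    counts = [[[0] * n for _ in range(n)] for _ in range(n)]
    for mineral_elements in minerals:
        # Deduplicate and ignore symbols outside the chosen element list.
        present = sorted({index[el] for el in mineral_elements if el in index})
        for i, j, k in product(present, repeat=3):
            counts[i][j][k] += 1
    return counts


def stack_slices(counts):
    """Concatenate the n 2D slices vertically into one table of n*n rows by n columns."""
    return [row for slice_2d in counts for row in slice_2d]


# Toy example with three made-up element lists.
toy_minerals = [["Si", "O"], ["Fe", "O"], ["Fe", "Si", "O"]]
toy_elements = ["O", "Si", "Fe"]
toy_counts = cooccurrence_3d(toy_minerals, toy_elements)
toy_table = stack_slices(toy_counts)
```

With this layout, a table for n elements has n*n rows and n columns, and each consecutive group of n rows is the 2D matrix for one fixed first element.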
mindat_api.py retrieves data from the Mindat API.
csv_normalizer.py generates normalized versions of the cleaned CSV files.