All source code and images are associated with the paper:
"Machine-Learning Predictions of Critical Temperatures from Chemical Compositions of Superconductors"
By S. G. Jung, G. Jung & J. M. Cole
The description of each file is summarized below:
(i) create_material_objects.py
Script to create Pymatgen material objects; it must be run before featurizers.py.
(ii) featurizers.py
Script to create features from Pymatgen Material objects (e.g. composition-based & structure-based).
(iii) featurize_custom.py
Script to create various chemical and symmetry-based features using custom functions
(iv) feature_analyses.py
Script to perform statistical feature analyses
(v) feature_engineering.py
Script to perform feature engineering. By default, a brute-force method is used.
(vi) GBFS.py
Script to perform gradient boosted feature selection, generate feature ranking, and carry out recursive feature selection. See "Gradient Boosted and Statistical Feature Selection" in https://github.com/Songyosk/GBSFS4MPP.
(vii) Multicollinearity_reduction.py
Script to perform multicollinearity reduction, which includes correlation analysis and hierarchical clustering analysis. Correlation and linkage thresholds are defined to eliminate features.
(viii) permutation_importance.py
Script to analyze and identify features that are important according to permutation importance, i.e. features whose values, when permuted, change the model's performance the most.
(ix) recursive_feature_elimination.py
Script to perform recursive feature elimination
(x) optimization.py
Script to perform Bayesian optimization, which determines the architecture of the predictive model based on a defined hyperparameter space.
(xi) utilities.py and read_json.py
Scripts containing helper functions
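
As an illustration of the correlation-threshold step described under (vii), the following is a minimal sketch and not the code in Multicollinearity_reduction.py. It assumes features are stored as plain lists in a dictionary, uses the Pearson correlation coefficient, and greedily keeps a feature only if its absolute correlation with every feature already kept is below the threshold. The function names (pearson, drop_correlated), the feature names, the data values, and the threshold of 0.9 are all hypothetical. The hierarchical clustering step is not covered.

```python
import math


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant feature has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedily keep features whose |correlation| with all kept ones is below threshold.

    `features` maps a feature name to its list of values (hypothetical layout).
    Features are visited in insertion order, so earlier features take priority.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: the first two features are almost perfectly correlated.
features = {
    "mean_atomic_mass": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mean_atomic_number": [2.1, 3.9, 6.2, 8.0, 9.9],
    "electronegativity_range": [0.5, 0.1, 0.4, 0.2, 0.3],
}

selected = drop_correlated(features, threshold=0.9)
print(selected)
```

With this toy data, "mean_atomic_number" is dropped because it is nearly collinear with "mean_atomic_mass", while "electronegativity_range" is retained. Which feature of a correlated pair survives depends on the visiting order, so in practice the order would need to reflect some ranking of the features.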
J.M.C. is grateful for the BASF/Royal Academy of Engineering Research Chair in Data-Driven Molecular Engineering of Functional Materials, which is partly sponsored by the Science and Technology Facilities Council (STFC) via the ISIS Neutron and Muon Source; this Chair is supported by a PhD studentship (for S.G.J.). STFC is also thanked for a PhD studentship that is sponsored by its Scientific Computing Department (for G.J.).