Comparison of Models using NASA Kepler data
'koi_disposition' - The disposition in the literature towards this exoplanet candidate: one of CANDIDATE, FALSE POSITIVE, NOT DISPOSITIONED or CONFIRMED (i.e., how likely a given candidate is to be a true planet).
- Preprocess the dataset prior to fitting the model.
- Perform feature selection and remove unnecessary features.
- Use `MinMaxScaler` to scale the numerical data.
- Separate the data into training and testing sets with `train_test_split`.
- Decision Tree - uses `GridSearchCV` to tune model parameters. [K-Nearest Neighbors, Support Vector Machine, Recurrent Neural Network (Keras/TensorFlow)]
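
The preprocessing and tuning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the data is a synthetic stand-in generated with `make_classification` rather than the Kepler table, and the parameter grid values are hypothetical. Splitting before scaling is a choice made here so the scaler only ever sees training rows.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Kepler feature table (assumption: the real
# data would be loaded from a file with `koi_disposition` as the label).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    n_classes=3, random_state=42,
)

# Split first, then fit the scaler on the training rows only, so that no
# information from the test set leaks into the scaling parameters.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Hypothetical parameter grid; the values are illustrative only.
param_grid = {
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 10],
}
grid = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=5)
grid.fit(X_train_scaled, y_train)

# GridSearchCV refits the best estimator on the full training set,
# so `score` evaluates that tuned model on the held-out rows.
test_accuracy = grid.score(X_test_scaled, y_test)
print("best params:", grid.best_params_)
print("test accuracy:", round(test_accuracy, 3))
```

The same `GridSearchCV` pattern would apply to the K-Nearest Neighbors and Support Vector Machine models by swapping the estimator and the grid.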
1. The decision tree ensemble (random forest) points to the solar mass centroid offset of the exoplanet mass as the strongest predictor of planetary viability.
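
One common way to arrive at a "strongest predictor" ranking of this kind is to read the impurity-based `feature_importances_` of a fitted random forest. The sketch below assumes that approach, which the source does not confirm. The data is synthetic and the feature names are hypothetical placeholders, not Kepler columns.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data standing in for the Kepler features (assumption).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3,
    n_redundant=0, random_state=0,
)
# Hypothetical placeholder names, one per column.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Pair each feature with its importance and sort from most to least important.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can favor high-cardinality features, so a ranking like this is best treated as a starting point, not a definitive result.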