What is Diabetes ?
Diabetes is a chronic disease that occurs when the pancreas is no longer able to make insulin, or when the body cannot make good use of the insulin it produces.
What are we predicting ?
Whether or not a patient has diabetes, based on certain diagnostic measurements (a binary classification problem).
Algorithms Used :
Support Vector Machine (SVM)
k-nearest neighbors (KNN)
XGBoost
Why SVM ?
SVM models generalize well in practice, so the risk of overfitting is lower.
It scales relatively well to high-dimensional data.
Can work well with imbalanced data when the classes are weighted.
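As a rough illustration of the points above, the sketch below computes the soft-margin objective of a linear SVM in plain Python. The penalty on the weight vector is what keeps the model from overfitting, and the optional per-class weights show one common way to handle imbalanced classes. The function name svm_objective, the class_weight argument, and the toy values are assumptions made for this example, not part of the project code.

```python
def svm_objective(w, b, X, y, C=1.0, class_weight=None):
    """Soft-margin linear SVM objective: 0.5 * ||w||^2 + C * weighted hinge loss.

    Labels in y must be +1 or -1. class_weight is an optional dict mapping a
    label to a multiplier on that class's errors (hypothetical helper argument).
    """
    class_weight = class_weight or {}
    # Regularization term: a small ||w|| corresponds to a wide margin
    regularization = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        # Hinge loss is zero once a point is on the correct side of the margin
        loss = max(0.0, 1.0 - yi * score)
        hinge_total += class_weight.get(yi, 1.0) * loss
    return regularization + C * hinge_total


# Toy, made-up points: two are outside the margin, one positive point is inside it
X = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, -1, 1]
w, b = (1.0, 0.0), 0.0

unweighted = svm_objective(w, b, X, y)
# Doubling the weight of the positive class doubles the cost of its margin violation
weighted = svm_objective(w, b, X, y, class_weight={1: 2.0})
print(unweighted, weighted)
```

Raising the weight of the minority class makes mistakes on that class more expensive, which pushes the decision boundary away from it.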
Why KNN ?
Makes no assumptions about the underlying data distribution.
Simple algorithm that is easy to explain and interpret.
Versatile: useful for both classification and regression.
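To show how simple the idea is, here is a minimal KNN classifier in plain Python: a new point gets the majority label of its k closest training points. The function name knn_predict, the two features, and the sample values are invented for illustration and assume the measurements were already scaled to a comparable range.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest points and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up, already-scaled measurements (hypothetical: glucose, BMI); 1 = diabetic
train_X = [(0.2, 0.3), (0.3, 0.2), (0.25, 0.35), (0.8, 0.7), (0.9, 0.8), (0.85, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (0.82, 0.75)))
print(knn_predict(train_X, train_y, (0.22, 0.28)))
```

Because the prediction depends on raw distances, features on very different scales should be normalized first, otherwise the largest-valued feature dominates the vote.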
Why XGBoost ?
Offers several advanced features for model tuning, computing environments and algorithm enhancement.
It reduces computing time and makes efficient use of memory resources.
An important feature of the implementation is its built-in handling of missing values.
TUNING AND WHY ?
TUNED MODEL: XGBOOST
WHY ? : It addresses several of the drawbacks of KNN and SVM.
Works well with large datasets.
Has built-in regularization (L1 and L2 penalties) to reduce overfitting.
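To make the regularization point concrete, the sketch below reproduces in plain Python the leaf-weight and split-gain formulas from XGBoost's documented objective. G and H stand for the sums of first and second derivatives of the loss over the samples in a node. The function names and the numbers are assumptions for this example; reg_lambda and gamma mirror the names of the corresponding XGBoost parameters.

```python
def leaf_weight(G, H, reg_lambda=1.0):
    """Optimal leaf value: -G / (H + lambda). A larger lambda shrinks it toward 0."""
    return -G / (H + reg_lambda)


def split_gain(G_left, H_left, G_right, H_right, reg_lambda=1.0, gamma=0.0):
    """Gain of splitting a node into left/right children, minus the per-leaf cost gamma."""
    def score(G, H):
        return G * G / (H + reg_lambda)

    parent = score(G_left + G_right, H_left + H_right)
    return 0.5 * (score(G_left, H_left) + score(G_right, H_right) - parent) - gamma


# Illustrative gradient statistics (made up for the example)
print(leaf_weight(-4.0, 3.0, reg_lambda=0.0))  # no L2 penalty
print(leaf_weight(-4.0, 3.0, reg_lambda=1.0))  # same leaf, shrunk by the penalty

# A split is only kept when its gain is positive
print(split_gain(-4.0, 3.0, 4.0, 3.0, reg_lambda=1.0, gamma=0.0))
print(split_gain(-4.0, 3.0, 4.0, 3.0, reg_lambda=1.0, gamma=5.0))
```

With gamma set high enough the gain turns negative and the split is pruned, which is how these two parameters keep the trees from growing to fit noise.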