One sees "validation and test accuracy" everywhere as a metric for how well a model performs. However, accuracy alone leaves out a substantial part of the information.
Four Outcomes of Binary Classification (counted in the sketch after this list):
- True positives: data points predicted as positive that are actually positive.
- False positives: data points predicted as positive that are actually negative.
- True negatives: data points predicted as negative that are actually negative.
- False negatives: data points predicted as negative that are actually positive.
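To make the four outcomes concrete, here is a minimal sketch that counts them from a list of actual labels and a list of predicted labels; the example data is made up for illustration.

```python
# Hypothetical actual and predicted labels for eight data points (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each of the four outcomes by comparing actual and predicted labels pairwise.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

print(tp, fp, tn, fn)  # 3 1 3 1
```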
The most important metrics built from these counts are (see the sketch after this list):
- precision = true positives / (true positives + false positives)
- recall/sensitivity = true positives / (true positives + false negatives)
- F1 = 2 * precision * recall / (precision + recall)
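A minimal sketch of these three formulas, reusing the hypothetical counts from the example above:

```python
# Hypothetical outcome counts from a binary classifier.
tp, fp, fn = 3, 1, 1

precision = tp / (tp + fp)  # fraction of predicted positives that are actually positive
recall = tp / (tp + fn)     # fraction of actual positives that the model finds
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")  # all 0.75 here
```

Precision and recall usually trade off against each other, which is why F1 is handy when a single number that balances both is needed.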
Visualizing Recall and Precision
- Confusion matrix: a table of the actual versus predicted labels from a classification problem
- Receiver operating characteristic (ROC) curve: plots the true positive rate (TPR) against the false positive rate (FPR) as the model's threshold for classifying a point as positive is varied
- Area under the curve (AUC): a single-number summary of overall classification performance, computed as the area under the ROC curve (a short plotting sketch follows this list)
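As a minimal sketch, assuming scikit-learn and matplotlib are installed, all three tools can be produced from a vector of actual labels and a vector of predicted scores; the labels and scores below are hypothetical stand-ins for a real model's output.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, roc_curve, roc_auc_score

# Hypothetical actual labels and predicted probabilities of the positive class.
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]
y_pred  = [1 if s >= 0.5 else 0 for s in y_score]  # hard labels at a 0.5 threshold

# Confusion matrix: rows are actual labels, columns are predicted labels.
print(confusion_matrix(y_true, y_pred))

# ROC curve: TPR versus FPR as the classification threshold sweeps over the scores.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC:", roc_auc_score(y_true, y_score))

plt.plot(fpr, tpr, marker="o")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.title("ROC curve")
plt.show()
```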