Semantic Segmentation - Multiclass Classification - Implementation for per Class Accuracy mixed up with per Class Recall? #4861

biggeR-data · 2023-03-16T11:22:36Z

Hey everyone, I am using Detectron2 with a custom dataset for semantic segmentation. My dataset contains multiple classes so it is not a binary classification problem.

I checked the implementation details of the evaluation metrics in the script detectron2/evaluation/sem_seg_evaluation.py. There I stumbled across the per class accuracy implementation, which seems to be mixed up with the per class recall.

Here are the code parts in question:

acc = np.full(self._num_classes, np.nan, dtype=np.float)
# ...
tp = self._conf_matrix.diagonal()[:-1].astype(np.float)
pos_gt = np.sum(self._conf_matrix[:-1, :-1], axis=0).astype(np.float)
# ...
acc_valid = pos_gt > 0
acc[acc_valid] = tp[acc_valid] / pos_gt[acc_valid]

Judging by the variable names, pos_gt represents the Ground Truths / Actuals. Since pos_gt is computed by summing over axis=0, i.e. as column sums, the Actuals are in the columns of the confusion matrix and the Predictions are in the rows.

Note: This notation differs in orientation from the Wikipedia Definition of a Confusion Matrix. If you want to calculate Metrics according to the Wikipedia layout you would need to transpose the Confusion Matrix given in the evaluation script.

I will stick to the orientation provided by detectron2 with my following examples to avoid confusion.
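To make the orientation concrete, here is a minimal sketch in plain Python, assuming the matrix is indexed as [prediction, ground truth], which is what the variable naming suggests. The helper `confusion_matrix` and the toy labels are hypothetical and are not taken from detectron2.

```python
def confusion_matrix(pred, gt, num_classes):
    """Count samples with rows = predicted class, columns = ground-truth class.

    Hypothetical helper for illustration; it only mirrors the orientation
    assumed in this post, not detectron2's actual implementation.
    """
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, g in zip(pred, gt):
        conf[p][g] += 1
    return conf


# Toy labels for 6 pixels and 3 classes (made up for this example).
gt = [0, 0, 0, 1, 1, 2]
pred = [0, 0, 1, 1, 2, 2]

conf = confusion_matrix(pred, gt, 3)

# Column sums (the equivalent of np.sum(conf, axis=0)): pixels per ground-truth class.
pos_gt = [sum(row[c] for row in conf) for c in range(3)]
# Row sums (the equivalent of np.sum(conf, axis=1)): pixels per predicted class.
pos_pred = [sum(row) for row in conf]

print(conf)      # [[2, 0, 0], [1, 1, 0], [0, 1, 1]]
print(pos_gt)    # [3, 2, 1]
print(pos_pred)  # [2, 2, 2]
```

With this layout the column sums reproduce the number of ground-truth pixels per class, which matches how pos_gt is computed in the snippet above.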

Looking at the code, the Accuracy per class is calculated by dividing TP by the Actual Positives (named P in Wikipedia's entry). This does not correspond to the definition of the Accuracy measure. The Accuracy is defined as:

(TP + TN) / n

also known by this Formula:

(TP + TN) / (TP + FP + TN + FN)

The Recall is defined as:

TP / (TP + FN) = TP / P

which is exactly the formula used to calculate the per class 'accuracy' in Line 193.
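The difference between the two formulas can be checked numerically. The following is a minimal sketch in plain Python, assuming a one-vs-rest reading of TP, FP, FN and TN for each class and the detectron2 orientation (rows = predictions, columns = ground truth). The function `per_class_metrics` and the toy matrix are hypothetical, not part of detectron2.

```python
def per_class_metrics(conf):
    """Return (recall, accuracy) per class from a square confusion matrix.

    Assumes rows = predictions and columns = ground truth, and treats each
    class one-vs-rest. Hypothetical helper for illustration only.
    """
    k = len(conf)
    n = sum(sum(row) for row in conf)
    results = []
    for c in range(k):
        tp = conf[c][c]
        fn = sum(conf[r][c] for r in range(k)) - tp  # ground truth c, predicted otherwise
        fp = sum(conf[c]) - tp                       # predicted c, ground truth otherwise
        tn = n - tp - fn - fp
        recall = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
        accuracy = (tp + tn) / n
        results.append((recall, accuracy))
    return results


# Toy 3-class confusion matrix (made up for this example).
conf = [
    [2, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

metrics = per_class_metrics(conf)
for c, (recall, accuracy) in enumerate(metrics):
    print(f"class {c}: recall={recall:.3f} accuracy={accuracy:.3f}")
# class 0: recall=0.667 accuracy=0.833
# class 1: recall=0.500 accuracy=0.667
# class 2: recall=1.000 accuracy=0.833
```

On this toy matrix the TP / P values differ from the (TP + TN) / n values for every class, so under the one-vs-rest reading the two quantities are not interchangeable.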

I have searched for articles on Multiclass Classification that cover both per class accuracy and per class recall, but sources are rather scarce. I did find a comment on Stack Overflow claiming that per class accuracy and per class recall are the same for multiclass classification. On the other hand, I found a worked example, continued it myself, and arrived at the conclusion that per class accuracy is in fact not the same as per class recall.

Please refer to this screenshot of the continuation of the example:

So I guess my question is:
Is the current implementation for per class accuracy in sem_seg_evaluation.py mixed up with per class recall given a multiclass classification problem?
