
Fix slow performance for confusion matrix based metrics #1302

Merged
merged 4 commits into master from bugfix/slow_bincount on Oct 31, 2022

Conversation


@SkafteNicki commented Oct 31, 2022

What does this PR do?

Fixes #1267
Fixes #1277
After the refactor, the underlying _bincount function, which does a lot of the computation for confusion matrix based metrics, was changed. I did some testing at the time and everything seemed to be fine. However, for large inputs the new implementation is really slow.

Here is a direct comparison in colab:
https://colab.research.google.com/drive/18tGZj_dPria6NSwVOIgwPXO8mJBr21kc?usp=sharing

The results:
[screenshot: timing comparison of the old and new _bincount implementations on CPU and GPU]
The TLDR:

  • Regardless of batch size, the new implementation is roughly 2x slower on CPU
  • For batch size 100, the new implementation is roughly 4x FASTER on GPU
  • For batch size 10,000, the new implementation is roughly 30x SLOWER on GPU

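As a rough sketch of how the comparison could be reproduced locally (this is not the notebook code; the helper names, sizes, and repeat count are placeholders I chose for illustration), something like the following times both paths on CPU and, if available, GPU:

import time
import torch

def bincount_old(x, minlength):
    # old implementation: native torch.bincount
    return torch.bincount(x, minlength=minlength)

def bincount_new(x, minlength):
    # new implementation: zeros + index_add_ with a ones tensor
    z = torch.zeros(minlength, device=x.device, dtype=x.dtype)
    return z.index_add_(0, x, torch.ones_like(x))

def time_fn(fn, x, minlength, repeats=20):
    # crude wall-clock timing; synchronize so gpu kernels are included in the measurement
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        fn(x, minlength)
    if x.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

devices = ["cpu", "cuda"] if torch.cuda.is_available() else ["cpu"]
for device in devices:
    for batch_size in (100, 10_000):
        x = torch.randint(0, 4, (batch_size,), device=device)
        print(device, batch_size, time_fn(bincount_old, x, 4), time_fn(bincount_new, x, 4))
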
The simple fix currently in this PR is to change it back to the old implementation. The alternative would be to have something like this in the code:

def _bincount(x, minlength):
    ...
    # N: crossover threshold, to be determined empirically
    # fall back to torch.bincount on cpu or for large inputs,
    # and only use the index_add_ trick for small inputs on gpu
    if len(x) > N or not x.is_cuda:
        return torch.bincount(x, minlength=minlength)
    else:
        z = torch.zeros(minlength, device=x.device, dtype=x.dtype)
        return z.index_add_(0, x, torch.ones_like(x))

where we have to set N based on some experimentation.
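For context, a usage sketch of how such a helper turns flattened (target, prediction) pairs into a confusion matrix (illustrative names only, not the actual torchmetrics call sites; assumes torch is imported and the threshold N above has been chosen):

num_classes = 10
preds = torch.randint(0, num_classes, (1000,))
target = torch.randint(0, num_classes, (1000,))
# encode each (target, pred) pair as a single index in [0, num_classes**2)
unique_mapping = target * num_classes + preds
confmat = _bincount(unique_mapping, minlength=num_classes**2).reshape(num_classes, num_classes)
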
@Borda, @justusschock opinions?

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@SkafteNicki added the bug / fix label Oct 31, 2022
@SkafteNicki added this to the v0.10 milestone Oct 31, 2022

@Borda left a comment


LGTM 🎉

@Borda enabled auto-merge (squash) October 31, 2022 12:13

codecov bot commented Oct 31, 2022

Codecov Report

Merging #1302 (fb3a387) into master (604ed80) will decrease coverage by 0%.
The diff coverage is 100%.

Additional details and impacted files
@@          Coverage Diff           @@
##           master   #1302   +/-   ##
======================================
- Coverage      87%     87%   -0%     
======================================
  Files         190     190           
  Lines       11121   11120    -1     
======================================
- Hits         9621    9620    -1     
  Misses       1500    1500           


@justusschock left a comment


@SkafteNicki For now, I think sticking to the default here makes sense. I would be open to changing this conditionally in the future, but that would require more experimentation :)


Borda commented Oct 31, 2022

I would be open to changing this conditionally in the future, but that would require more experimentation :)

Do you want to open an issue for it so we don't forget? 🦦

@mergify bot added the ready label and removed the has conflicts label Oct 31, 2022
@Borda disabled auto-merge October 31, 2022 19:10
@Borda merged commit 8cc0cd7 into master Oct 31, 2022
@Borda deleted the bugfix/slow_bincount branch October 31, 2022 19:11
Borda pushed a commit that referenced this pull request Oct 31, 2022
* bincount fix
* chlog

(cherry picked from commit 8cc0cd7)
@Callidior (Contributor) commented

@SkafteNicki I stumbled upon this issue and did some investigation.

The new implementation used torch.Tensor.index_add_, which is known to be slow on integer tensors (see pytorch/pytorch#42109). That issue has now been open for over 2 years, so a fix in the near future seems unlikely. However, the issue mentions using a float64 tensor as a workaround.

I ran a quick performance test with different tensor sizes.

Integer implementation:

z = torch.zeros(4, device=unique_mapping.device, dtype=unique_mapping.dtype)
return z.index_add_(0, unique_mapping, torch.ones_like(unique_mapping))

Float implementation:

z = torch.zeros(4, device=unique_mapping.device, dtype=torch.float64)
return z.index_add_(0, unique_mapping, torch.ones_like(unique_mapping, dtype=torch.float64))
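
One caveat worth noting (my addition, not part of the snippets above): the float path accumulates counts in a float64 tensor, so before being used as confusion-matrix counts the result would presumably be cast back to an integer dtype. Continuing the float snippet:

z = torch.zeros(4, device=unique_mapping.device, dtype=torch.float64)
counts = z.index_add_(0, unique_mapping, torch.ones_like(unique_mapping, dtype=torch.float64))
counts = counts.long()  # counts are exact integers well below 2**53, so the cast back is lossless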

Timing results on RTX 3090 (in milliseconds):

Number of Samples    Integer index_add    Float index_add    torch.bincount
1,000                0.3                  0.1                0.2
100,000              144.7                0.1                0.2
10,000,000           18,497.1             9.3                4.2

Timing results on CPU (in milliseconds):

Number of Samples    Integer index_add    Float index_add    torch.bincount
1,000                0.20                 0.03               0.01
100,000              0.48                 0.39               0.19
10,000,000           18.0                 16.3               8.4

Looks like torch.bincount is the best option, aside from other issues with it like #1413.
