MAP metric, fix metric for CUDA execution #673

Merged 8 commits on Dec 15, 2021

Conversation

tkupek
Contributor

@tkupek tkupek commented Dec 10, 2021

What does this PR do?

Fixes an issue where the new MAP implementation cannot be executed on CUDA devices.
The tensors have to be initialized/moved to the correct CUDA device.
Fixes #671
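
As a minimal sketch of what "initialized/moved to the correct device" means in PyTorch: new tensors are either created with an explicit `device=` argument or moved with `.to()`. The tensor names below (`scores`, `matched`, `thresholds`) are illustrative assumptions and are not taken from the MAP implementation.

```python
import torch

# Use the GPU when one is available; otherwise the same code runs on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for a metric state that already lives on the target device.
scores = torch.tensor([0.9, 0.4, 0.7], device=device)

# Option 1: create new tensors directly on the device of an existing tensor.
matched = torch.zeros(scores.shape[0], dtype=torch.bool, device=scores.device)

# Option 2: move a tensor that was created on the CPU.
thresholds = torch.tensor([0.5, 0.75]).to(scores.device)

# Both operands now share a device, so combining them does not raise.
above = scores.unsqueeze(0) >= thresholds.unsqueeze(1)  # shape (2, 3)
```

Deriving the device from an existing tensor (`scores.device`) rather than hard-coding `"cuda"` keeps the same code path working on the CPU and on any GPU index.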

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

🎉

@codecov

codecov bot commented Dec 10, 2021

Codecov Report

Merging #673 (b0a798e) into master (01f88fe) will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master   #673   +/-   ##
=====================================
  Coverage      95%    95%           
=====================================
  Files         166    166           
  Lines        6377   6379    +2     
=====================================
+ Hits         6070   6074    +4     
+ Misses        307    305    -2     

Member

@justusschock justusschock left a comment

Why wasn't this caught in our tests? We do have CUDA tests.

@tkupek
Contributor Author

tkupek commented Dec 10, 2021

@justusschock Good question. I don't have a special setup. Just DDP mode with a single GPU.

@Borda Borda added the Priority Critical task/issue label Dec 10, 2021
@Borda Borda added this to the v0.6 milestone Dec 10, 2021
@justusschock
Member

@tkupek could you try to include the necessary changes to the tests in this PR?

@tkupek
Contributor Author

tkupek commented Dec 10, 2021

We already discussed this in Slack. @twsl, you found the issue with the tests, didn't you?

@Bunoviske

Bunoviske commented Dec 12, 2021

When executing MAP.compute() on a CUDA device, my error was different from #671. I got the following with torchmetrics==0.6.1 and torch==1.10.0:

/usr/local/lib/python3.7/dist-packages/torchmetrics/metric.py in wrapped_func(*args, **kwargs)
    370                 dist_sync_fn=self.dist_sync_fn, should_sync=self._to_sync, should_unsync=self._should_unsync
    371             ):
--> 372                 self._computed = compute(*args, **kwargs)
    373 
    374             return self._computed

/usr/local/lib/python3.7/dist-packages/torchmetrics/detection/map.py in compute(self)
    666             - mar_100_per_class: ``torch.Tensor`` (-1 if class metrics are disabled)
    667         """
--> 668         overall, map, mar = self._calculate(self._get_classes())
    669 
    670         map_per_class_values: Tensor = Tensor([-1])

/usr/local/lib/python3.7/dist-packages/torchmetrics/detection/map.py in _calculate(self, class_ids)
    513         eval_imgs = [
    514             self._evaluate_image(img_id, class_id, area, max_detections, ious)
--> 515             for class_id in class_ids
    516             for area in area_ranges
    517             for img_id in img_ids

/usr/local/lib/python3.7/dist-packages/torchmetrics/detection/map.py in <listcomp>(.0)
    515             for class_id in class_ids
    516             for area in area_ranges
--> 517             for img_id in img_ids
    518         ]
    519 

/usr/local/lib/python3.7/dist-packages/torchmetrics/detection/map.py in _evaluate_image(self, id, class_id, area_range, max_det, ious)
    372 
    373         # sort dt highest score first, sort gt ignore last
--> 374         ignore_area_sorted, gtind = torch.sort(ignore_area)
    375         gt = gt[gtind]
    376         scores = self.detection_scores[id]

RuntimeError: Sort currently does not support bool dtype on CUDA.
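
For readers who hit the same error, a common workaround is to cast the boolean tensor to an integer dtype before sorting. The sketch below reuses the variable names from the traceback, but the input values are made up, and it is an assumption that a cast like this resembles the fix applied in the library.

```python
import torch

# Made-up stand-in for the boolean mask from the traceback.
ignore_area = torch.tensor([True, False, True, False])

# Cast to uint8 so the sort does not depend on bool support on the device;
# False (0) sorts before True (1), so ignored entries end up last.
sorted_as_int, gtind = torch.sort(ignore_area.to(torch.uint8))

# Convert the sorted values back to bool for later masking.
ignore_area_sorted = sorted_as_int.to(torch.bool)
```

The returned indices (`gtind`) can be used to reorder other tensors exactly as in the original code. Note that `torch.sort` does not guarantee a stable order among equal elements unless `stable=True` is passed.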

@tkupek
Contributor Author

tkupek commented Dec 12, 2021

@Bunoviske pretty sure this one was fixed in a previous PR. If you install the lib from main branch the error above should show up.

@Borda
Member

Borda commented Dec 12, 2021

> pretty sure this one was fixed in a previous PR. If you install the lib from main branch the error above should show up.

You can simply install the upcoming bugfix release from its release branch:

pip install https://github.com/PyTorchLightning/metrics/archive/refs/heads/release/0.6.x.zip
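
Since the error you see depends on the installed version, it can help to confirm which build is actually active after installing. A small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None when torchmetrics is not installed.
print(installed_version("torchmetrics"))
```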

@Borda
Member

Borda commented Dec 13, 2021

@tkupek could you please add a test for this case so we can make a quick bug-fix release 🐰
cc: @SkafteNicki

@tkupek
Contributor Author

tkupek commented Dec 13, 2021

Sorry, I don't know why the CUDA test is not working.

@mergify mergify bot added the ready label Dec 13, 2021
@twsl
Contributor

twsl commented Dec 13, 2021

I think the issue is that in https://github.com/PyTorchLightning/metrics/blob/d071eb2f1245112c599b7fe0d165b8ac55663083/tests/helpers/testers.py#L168 the passed predictions don't satisfy the condition, because a list of lists is passed rather than a tensor. Therefore, the tensors are never moved to the device.

@SkafteNicki
Member

I think the tests should be fixed now. I had to change the tester function to move not only tensors but also collections of tensors to the right device.
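
To illustrate the idea (this is not the actual tester code), a recursive helper along these lines moves tensors nested inside lists, tuples, and dicts. `move_to_device` and `StubTensor` are hypothetical names; `StubTensor` only stands in for `torch.Tensor` so the sketch runs without PyTorch or a GPU.

```python
from typing import Any


def move_to_device(obj: Any, device: str) -> Any:
    """Recursively move tensors nested in lists, tuples, or dicts to `device`."""
    if hasattr(obj, "to"):  # tensor-like: torch.Tensor exposes .to(device)
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(item, device) for item in obj)
    return obj  # ints, strings, None, ... are returned unchanged


class StubTensor:
    """Hypothetical stand-in for torch.Tensor, used only for this demo."""

    def __init__(self, device: str = "cpu") -> None:
        self.device = device

    def to(self, device: str) -> "StubTensor":
        return StubTensor(device)


# A list of lists: the kind of input that a plain tensor type check skips.
preds = [[StubTensor(), StubTensor()], [{"boxes": StubTensor(), "label": 3}]]
moved = move_to_device(preds, "cuda:0")
```

A check such as `isinstance(preds, Tensor)` is false for the outer list, which is why the nested tensors stayed on the CPU before the tester was changed.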

@Borda Borda merged commit 07b5dc5 into Lightning-AI:master Dec 15, 2021
Borda pushed a commit that referenced this pull request Dec 15, 2021
* move and initialize tensors on the correct device (fix cuda)
* remove condition for moving tensors, its done in .to()
* fix gpu test
* docs

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: SkafteNicki <skaftenicki@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
(cherry picked from commit 07b5dc5)
Labels
bug / fix (Something isn't working), Priority (Critical task/issue), ready, topic: Image
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RuntimeErrors when attempting use of MAP-metric
7 participants