Fix CI after torchmetrics update #564

Closed
wants to merge 1 commit into from

danthe3rd
Contributor

@danthe3rd danthe3rd commented Dec 8, 2022

Stack from ghstack (oldest at bottom):

The torchmetrics Accuracy metric now takes an argument: https://torchmetrics.readthedocs.io/en/stable/classification/accuracy.html
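One way to keep CI green across torchmetrics versions is to pass the new argument only when the constructor accepts it. The sketch below is not the change in this PR. It assumes the new argument is named `task` and that `"multiclass"` is the right value for the tests (both should be checked against the linked docs). `build_accuracy` and the two stand-in classes are hypothetical names used only for illustration.

```python
import inspect


def build_accuracy(metric_cls, num_classes):
    """Instantiate an accuracy-like metric, passing `task` only if accepted.

    Assumption: the newer constructor takes a `task` argument and
    "multiclass" is the appropriate value for the caller's use case.
    """
    params = inspect.signature(metric_cls).parameters
    kwargs = {"num_classes": num_classes}
    if "task" in params:
        kwargs["task"] = "multiclass"
    return metric_cls(**kwargs)


# Hypothetical stand-ins for the old and new constructor shapes, so the
# sketch runs without torchmetrics installed.
class OldStyleAccuracy:
    def __init__(self, num_classes=None):
        self.num_classes = num_classes
        self.task = None


class NewStyleAccuracy:
    def __init__(self, task, num_classes=None):
        self.task = task
        self.num_classes = num_classes


old_metric = build_accuracy(OldStyleAccuracy, num_classes=10)
new_metric = build_accuracy(NewStyleAccuracy, num_classes=10)
```

In the test suite the call would presumably look like `build_accuracy(torchmetrics.Accuracy, num_classes=...)`. If only recent torchmetrics versions need to be supported, passing the argument directly is simpler.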
Somehow this is failing with a SEGFAULT on my A100 (in a triton kernel):

#0  0x00007fffc0f62e10 in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#1  0x00007fffc0f9303c in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#2  0x00007fffc0f2ea13 in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#3  0x00007fffc0f94603 in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#4  0x00007fffc119e4a0 in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#5  0x00007fffc0f3728f in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#6  0x00007fffc0f3999f in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#7  0x00007fffc0fdb1c2 in ?? () from /lib/x86_64-linux-gnu/libcuda.so
#8  0x00007fff502234c0 in _launch ()
   from /data/home/XXXXX/.triton/cache/704a3e6949e60326bc68d18a620bee50/layer_norm_fw.so
#9  0x00007fff3c0eea25 in launch ()
   from /data/home/XXXXX/.triton/cache/2cebb5590a024a2e06fe9de08c6b7079/k_dropout_bw.so
#10 0x0000555555698422 in cfunction_call (func=0x7fff3c6e5760, args=<optimized out>, kwargs=<optimized out>)
    at /usr/local/src/conda/python-3.10.6/Objects/methodobject.c:552
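The backtrace above only shows native frames up to `cfunction_call`, so it does not say which Python call launched the kernel. A minimal sketch for capturing the Python-level stack on the next crash, using the standard-library `faulthandler` module. The helper name `enable_crash_tracebacks` and the log path are hypothetical.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Dump Python tracebacks for all threads to `log_path` on a fatal signal.

    The returned file object must stay open for as long as the handler
    is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this at the top of the failing test (or running Python with `-X faulthandler`) should print the Python frames leading to the crashing call when the SIGSEGV is raised.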

@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 8, 2022
@danthe3rd danthe3rd requested a review from blefaudeux December 8, 2022 08:50
@danthe3rd danthe3rd closed this Dec 8, 2022
@facebook-github-bot facebook-github-bot deleted the gh/danthe3rd/61/head branch January 7, 2023 15:21