Check cuDF lazily #8084

Merged
merged 1 commit into from
Jul 17, 2022

@grafail grafail commented Jul 15, 2022

In some cases, "import cudf" can raise a RuntimeError or other errors while loading. Checking for cuDF lazily is safer in these cases. Similar to #7752.

Example of a RuntimeError that the import can raise:

Traceback (most recent call last):
  File "cuda/_cuda/ccuda.pyx", line 3671, in cuda._cuda.ccuda._cuInit
  File "cuda/_cuda/ccuda.pyx", line 435, in cuda._cuda.ccuda.cuPythonInit
RuntimeError: Failed to dlopen libcuda.so
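
As a rough illustration of what a lazy check can look like, here is a minimal sketch that identifies a cuDF DataFrame from its type's module and class name, so cudf itself is never imported. The helper name `is_cudf_dataframe` is hypothetical and is not taken from this PR's diff; the actual change may differ.

```python
def is_cudf_dataframe(obj) -> bool:
    """Return True if obj looks like a cudf DataFrame, without importing cudf.

    Hypothetical helper for illustration only: it compares the module and
    class name of the object's type as strings, so importing cudf (and any
    error raised while it loads) is avoided entirely.
    """
    cls = type(obj)
    top_level_package = cls.__module__.split(".")[0]
    return top_level_package == "cudf" and cls.__name__ == "DataFrame"
```

Because only string attributes of the type are inspected, the check works the same whether or not cudf is installed or loadable.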

bdice commented Jul 15, 2022

Hi @grafail, would you be able to share a full traceback, cudf version, and also the version of cuda-python? I’d like to dig into this further. This is being raised by cuda-python and we try to guard against this RuntimeError in cudf but might be missing a case. Thanks!

@trivialfis trivialfis left a comment

Thank you for the PR! It would be great if you could share a reproducible example with the cuDF developers.

@trivialfis trivialfis merged commit 579ab23 into dmlc:master Jul 17, 2022

grafail commented Jul 18, 2022

Hi @bdice. This issue can be reproduced just by importing cudf on rapidsai/rapidsai-core:22.06-cuda11.2-base-ubuntu18.04-py3.8 without using the nvidia runtime in docker.

docker run --rm -it rapidsai/rapidsai-core:22.06-cuda11.2-base-ubuntu18.04-py3.8 python3 -c "import cudf"

cuDF version: 22.6.0
cuda-python version: 11.7.0

This is the entire traceback:

Traceback (most recent call last):
  File "cuda/_cuda/ccuda.pyx", line 3671, in cuda._cuda.ccuda._cuInit
  File "cuda/_cuda/ccuda.pyx", line 435, in cuda._cuda.ccuda.cuPythonInit
RuntimeError: Failed to dlopen libcuda.so
Exception ignored in: 'cuda._lib.ccudart.utils.cudaPythonGlobal.lazyInitGlobal'
Traceback (most recent call last):
  File "cuda/_cuda/ccuda.pyx", line 3671, in cuda._cuda.ccuda._cuInit
  File "cuda/_cuda/ccuda.pyx", line 435, in cuda._cuda.ccuda.cuPythonInit
RuntimeError: Failed to dlopen libcuda.so
Segmentation fault (core dumped)

The issue does not seem to occur on rapidsai/rapidsai-core:21.10-cuda11.2-base-ubuntu18.04-py3.8, though.
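
Since the traceback above ends in a segmentation fault, a try/except around the import would not be enough to detect the failure in-process. One way to probe this safely, sketched here under the assumption that spawning a child interpreter is acceptable, is to attempt the import in a subprocess and inspect the exit code. The function name `can_import` and the timeout value are hypothetical choices for illustration.

```python
import subprocess
import sys


def can_import(module_name: str, timeout: float = 60.0) -> bool:
    """Try `import <module_name>` in a child interpreter.

    Returns True only if the child exits cleanly. An exception, a crash
    such as a segmentation fault, or a timeout in the child all yield
    False, and none of them can take down the calling process.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module_name}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

For example, `can_import("cudf")` could be run inside both images to compare their behaviour without risking a crash in the parent process.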

bdice commented Jul 18, 2022

@grafail Thanks for the additional information!

@trivialfis trivialfis mentioned this pull request Aug 11, 2022