CUDA runtime error - open3D v0.14.1 #4679

BBO-repo · 2022-02-01T14:57:23Z

Checklist

I have searched for similar issues.
For Python issues, I have tested with the latest development wheel.
I have checked the release documentation and the latest documentation (for master branch).

Describe the issue

With the following configuration:

Ubuntu "18.04.6 LTS (Bionic Beaver)"
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
nvcc version: Cuda compilation tools, release 11.6, V11.6.55
cmake version 3.22.2

I get a runtime error when running DenseSlam.cpp with "--device CUDA:0", but everything works fine with "--device CPU:0".

The error is the following:
[Open3D INFO] Using device: CUDA:0
terminate called after throwing an instance of 'std::runtime_error'
  what():  [Open3D Error] (void open3d::core::__OPEN3D_CUDA_CHECK(cudaError_t, const char*, int)) /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/CUDAUtils.cpp:301: /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp:43 CUDA runtime error: operation not supported

I do not understand what the issue is, because I tested my CUDA installation with the deviceQuery sample application, which outputs the following:
bin/x86_64/linux/release/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Quadro M1000M"
CUDA Driver Version / Runtime Version 11.6 / 11.6
....
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1 Result = PASS

Could you please provide any support to solve my issue?

I am also attaching the CMake file I use to build Open3D:

# Option 1: Use ExternalProject_Add, as shown in this CMake example.
# Option 2: Install Open3D first and use find_package, see
#           http://www.open3d.org/docs/release/cpp_project.html for details.
include(ExternalProject)
ExternalProject_Add(
    external_open3d
    PREFIX open3d
    GIT_REPOSITORY https://github.com/intel-isl/Open3D.git
    GIT_TAG v0.14.1
    GIT_SHALLOW ON
    UPDATE_COMMAND ""
    # Check out https://github.com/intel-isl/Open3D/blob/master/CMakeLists.txt
    # For the full list of available options.
    CMAKE_ARGS
        -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>
        -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
        -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}
        -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}
        -DGLIBCXX_USE_CXX11_ABI=${GLIBCXX_USE_CXX11_ABI}
        -DSTATIC_WINDOWS_RUNTIME=${STATIC_WINDOWS_RUNTIME}
        -DBUILD_SHARED_LIBS=ON
        -DBUILD_PYTHON_MODULE=OFF
        -DBUILD_EXAMPLES=OFF
        -DBUILD_WEBRTC=OFF
        -DBUILD_CUDA_MODULE=ON
)

Steps to reproduce the bug

On an Ubuntu 18.04.6 machine with a CUDA-capable GPU:
Install CUDA 11.6.
Build Open3D v0.14.1 through ExternalProject_Add, using the CMake file above.
Run the DenseSLAM example with the flag "--device CUDA:0".

Error message

[Open3D INFO] Using device: CUDA:0
terminate called after throwing an instance of 'std::runtime_error'
  what():  [Open3D Error] (void open3d::core::__OPEN3D_CUDA_CHECK(cudaError_t, const char*, int)) /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/CUDAUtils.cpp:301: /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp:43 CUDA runtime error: operation not supported

Expected behavior

DenseSLAM runs without crashing, as it does with the flag "--device CPU:0".

Open3D, Python and System information

- Operating system: Ubuntu 18.04.6
- Open3D version: 0.14.1
- System type: 64 bit machine
- Is this remote workstation?: no
- How did you install Open3D?: build from source
- Compiler version (if built from source): gcc 7.5

Additional information


theNded · 2022-02-01T15:53:11Z

Quadro M1000M is an old card, and I suspect cudaMallocAsync is not supported on these machines; see JuliaGPU/CUDA.jl#637
(@yxlao we may want to add this check in addition to the CUDART version macro).

One potential fix is to replace all the functions that have the Async suffix with their non-async versions.

BBO-repo · 2022-02-01T20:51:45Z

Hi @theNded
Thank you for your fast answer.
You were right: the Async calls were causing the issue.
In the file open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp, I commented out lines 41 to 44 to disable cudaMallocAsync and lines 58 to 62 to disable cudaFreeAsync.

DenseSLAM now runs up to a point where I hit another error, an out-of-memory error:

[Open3D INFO] Processing 925/2407...
[Open3D INFO] Processing 926/2407...
terminate called after throwing an instance of 'std::runtime_error'
  what():  [Open3D Error] (void open3d::core::__OPEN3D_CUDA_CHECK(cudaError_t, const char*, int)) /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/CUDAUtils.cpp:301: /home/ubuntu/Work/Projects/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp:45 CUDA runtime error: out of memory

It always fails at this 926th RGB-D image.
Do you have any idea how to solve this?

theNded · 2022-02-01T22:40:37Z

Quadro M1000M only has 4 GB of GPU memory, so there is not much we can do with it. One potential change is to increase the voxel size by a factor of, say, 2, but that will sacrifice tracking and reconstruction quality.
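
To put a rough number on that trade-off: for a dense grid covering a fixed volume, the voxel count scales with the inverse cube of the voxel size, so doubling the voxel size cuts the count by about 8x. The sketch below is a back-of-the-envelope estimate only. The helper name and the example extents are illustrative assumptions, not Open3D API, and a sparse voxel grid that only allocates near surfaces will save less than this dense bound suggests.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of Open3D): number of voxels needed to
// densely cover a cube of side `extent_m` metres with cubic voxels of side
// `voxel_size_m` metres.
std::uint64_t DenseVoxelCount(double extent_m, double voxel_size_m) {
    assert(extent_m > 0.0 && voxel_size_m > 0.0);
    // Round up so that a partially covered row still gets a voxel.
    const auto per_axis =
            static_cast<std::uint64_t>(std::ceil(extent_m / voxel_size_m));
    return per_axis * per_axis * per_axis;
}
```

For example, DenseVoxelCount(4.0, 0.5) gives 512 while DenseVoxelCount(4.0, 1.0) gives 64, an 8x reduction for a 2x larger voxel.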

BBO-repo · 2022-02-02T09:42:31Z

Ok then it is all solved!
Thank you.

ao2 · 2023-08-22T07:41:23Z

Hi,

@theNded I would like to re-open this issue as there are new findings about it.

To recap: the "CUDA runtime error: operation not supported" error comes from the fact that the cudaMallocAsync() function does not work on some GPUs, namely the Quadro M1000M and the Quadro M3000M (the one I have).

Digging into the CUDA documentation, we find that the Stream Ordered Memory Allocator is not available on all NVIDIA GPUs and that support should be verified at runtime; see https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__POOLS.html

In practical terms, this means that the compile-time check on the CUDA runtime version (CUDART_VERSION) performed in

Open3D/cpp/open3d/core/MemoryManagerCUDA.cpp

Lines 22 to 27 in 59792c2

    
    #if CUDART_VERSION >= 11020
        OPEN3D_CUDA_CHECK(cudaMallocAsync(static_cast<void**>(&ptr), byte_size,
                                          cuda::GetStream()));
    #else
        OPEN3D_CUDA_CHECK(cudaMalloc(static_cast<void**>(&ptr), byte_size));
    #endif

should be replaced with something like:

    if (device.cudaSupportsMemoryPools()) {
        OPEN3D_CUDA_CHECK(cudaMallocAsync(static_cast<void**>(&ptr), byte_size,
                                          cuda::GetStream()));
    } else {
        OPEN3D_CUDA_CHECK(cudaMalloc(static_cast<void**>(&ptr), byte_size));
    }

and the implementation of device.cudaSupportsMemoryPools() could be something like this:

    int driverVersion = 0;
    int deviceSupportsMemoryPools = 0;
    OPEN3D_CUDA_CHECK(cudaDriverGetVersion(&driverVersion));
    if (driverVersion >= 11020) { // avoid invalid value error in cudaDeviceGetAttribute
        OPEN3D_CUDA_CHECK(cudaDeviceGetAttribute(&deviceSupportsMemoryPools, cudaDevAttrMemoryPoolsSupported, device));
    }

    return !!deviceSupportsMemoryPools;
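
Since a check like this would sit on the allocation path, it may be worth caching the answer per device rather than querying the driver on every malloc/free. The following is a minimal sketch of that idea, not a description of how Open3D does it: the class name is hypothetical, and the `query` callback stands in for the cudaDeviceGetAttribute call shown above so that the sketch compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical helper, not part of Open3D. `query` is an injected stand-in
// for the runtime attribute lookup (cudaDevAttrMemoryPoolsSupported).
class MemoryPoolSupportCache {
public:
    using Query = std::function<bool(int device_id)>;

    explicit MemoryPoolSupportCache(Query query) : query_(std::move(query)) {
        assert(query_ && "query must be callable");
    }

    // Returns the cached answer for `device_id`; the query runs only on the
    // first request for that device.
    bool Supports(int device_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = cache_.find(device_id);
        if (it == cache_.end()) {
            it = cache_.emplace(device_id, query_(device_id)).first;
        }
        return it->second;
    }

private:
    Query query_;
    std::mutex mutex_;
    std::unordered_map<int, bool> cache_;
};
```

The allocator would then branch on Supports(device_id) to pick cudaMallocAsync or cudaMalloc. The mutex keeps the first lookup safe if several threads allocate at once, at the cost of a lock on every call.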

I'll try to propose a patch for this, but if someone more familiar with the Open3D codebase wants to get there first, please go ahead.

Thank you, Antonio

… time (isl-org#4679) Some CUDA GPUs, like the Quadro M3000M don't support Memory Pools operations like cudaMallocAsync/cudaFreeAsync even on driver versions newer than 11020, and this can result in errors like: CUDA runtime error: operation not supported So check for support at runtime instead of compile time.

ao2 · 2023-10-21T09:27:36Z

Pushed a tentative fix to #6440

Some CUDA GPUs, like the Quadro M3000M don't support Memory Pools operations like cudaMallocAsync/cudaFreeAsync even on driver versions newer than 11.2, and this can result in errors like: CUDA runtime error: operation not supported So check for support at runtime instead of compile time. Still keep the compile time check to support building with CUDA versions older than 11.2.
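
The commit messages write the same threshold as both 11020 and 11.2. CUDA encodes versions as 1000 * major + 10 * minor in CUDART_VERSION and in the values returned by cudaDriverGetVersion / cudaRuntimeGetVersion. A small sketch of the decoding, with a hypothetical struct and function name, for anyone comparing these numbers against deviceQuery output:

```cpp
#include <cassert>

// Hypothetical names for illustration; not part of CUDA or Open3D.
struct CudaVersion {
    int major;
    int minor;
};

// Decodes CUDA's integer version encoding (1000 * major + 10 * minor).
CudaVersion DecodeCudaVersion(int encoded) {
    assert(encoded >= 0);
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}
```

So 11020 decodes to 11.2, and the 11.6 driver/runtime reported earlier in this thread corresponds to 11060.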

BBO-repo added the bug label Feb 1, 2022

theNded added the build/install and cuda labels and removed the bug label Feb 1, 2022

theNded closed this as completed Feb 1, 2022

ao2 mentioned this issue Oct 21, 2023

Check for support of CUDA Memory Pools at runtime (#4679) #6440

Merged
