Runtime error before my code returns #2804
Comments
Given this, it seems your program is exiting while you still have an instance of your MNIST class, which holds an onnxruntime::InferenceSession. While cleaning up that InferenceSession, the runtime attempts to clean up the CUDA side of things, but the error indicates that the CUDA driver is already shutting down (possibly due to some other call in __run_exit_handlers). Can you try explicitly freeing your MNIST instance before main() returns?
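The suggestion above can be sketched as follows. This is a minimal illustration, not the reporter's actual code: `FakeSession`, `g_live_sessions`, `load_model`, and `release_model` are hypothetical names, and `FakeSession` stands in for the real `Ort::Session` so the sketch compiles without onnxruntime or CUDA. It assumes, as frames #27 and #28 of the backtrace suggest, that the `MNIST` instance is owned by a `std::unique_ptr` with static storage duration.

```cpp
#include <memory>

// Hypothetical stand-in for Ort::Session: it only counts live instances so the
// release point is observable without onnxruntime or CUDA installed.
static int g_live_sessions = 0;

struct FakeSession {
    FakeSession() { ++g_live_sessions; }
    ~FakeSession() { --g_live_sessions; }
};

// Hypothetical stand-in for the reporter's MNIST class, which owns a session.
struct MNIST {
    FakeSession session_;
    int Run() { return 0; }  // placeholder for the real inference call
};

// Owner with static storage duration (assumed from the backtrace). If it is
// never reset, its destructor runs from the exit handlers after main() returns.
static std::unique_ptr<MNIST> mnist_;

// Create the model once, before the inference loop.
void load_model() { mnist_.reset(new MNIST()); }

// Destroy the model, and with it the session, while the process is still
// fully alive.
void release_model() { mnist_.reset(); }

// Intended call order inside main():
//   load_model();
//   /* inference loop using mnist_->Run() */
//   release_model();   // must happen before `return 0;`
//   return 0;
```

With the real class, calling `mnist_.reset()` as the last statement before `return` in `main()` should release the session while the CUDA runtime is still loaded. Holding the `MNIST` instance as a local variable inside `main()` would likely have the same effect, since locals are destroyed before the exit handlers run.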
@yueyihua Were you able to resolve this issue?
Closing due to inactivity. Please reopen as needed.
Describe the bug
Runtime error between inference and the program's return:
terminate called after throwing an instance of 'onnxruntime::OnnxRuntimeException'
what(): /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/providers/cuda/cuda_call.cc:91 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 4: driver shutting down ; GPU=32767 ; hostname=greenet ; expr=cudaEventSynchronize(e);
Stacktrace:
Program received signal SIGABRT, Aborted.
0x00007fffec35b2c7 in raise () from /usr/lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 cairo-1.15.12-3.el7.x86_64 expat-2.1.0-8.el7.x86_64 ffmpeg-libs-3.4.6-1.el7.x86_64 fontconfig-2.13.0-4.3.el7.x86_64 freetype-2.8-12.el7_6.1.x86_64 fribidi-1.0.2-1.el7.x86_64 gdk-pixbuf2-2.36.12-3.el7.x86_64 gflags-2.1.1-6.el7.x86_64 glib2-2.56.1-4.el7_6.x86_64 glibc-2.17-260.el7_6.6.x86_64 glog-0.3.3-8.el7.x86_64 gmp-6.0.0-11.el7.x86_64 gnutls-3.3.8-12.el7.x86_64 graphite2-1.3.10-1.el7_3.x86_64 gsm-1.0.13-11.el7.x86_64 harfbuzz-1.7.5-2.el7.x86_64 hdf5-1.8.12-11.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_6.x86_64 lame-libs-3.100-1.el7.x86_64 leveldb-1.12.0-11.el7.x86_64 libX11-1.6.5-2.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXext-1.3.3-3.el7.x86_64 libXfixes-5.0.3-1.el7.x86_64 libXrender-0.9.10-1.el7.x86_64 libaec-1.0.4-1.el7.x86_64 libblkid-2.23.2-21.el7.x86_64 libbluray-0.2.3-5.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libcroco-0.6.8-5.el7.x86_64 libdrm-2.4.91-3.el7.x86_64 libffi-3.0.13-11.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libgcrypt-1.5.3-12.el7.x86_64 libgfortran-4.8.5-39.el7.x86_64 libgomp-4.8.5-39.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libmfx-1.21-2.el7.x86_64 libmount-2.23.2-21.el7.x86_64 libogg-1.3.0-7.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 libquadmath-4.8.5-39.el7.x86_64 librsvg2-2.40.20-1.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libtasn1-3.8-2.el7.x86_64 libthai-0.1.14-9.el7.x86_64 libtheora-1.1.1-8.el7.x86_64 libuuid-2.23.2-21.el7.x86_64 libva-1.8.3-1.el7.x86_64 libvdpau-1.1.1-3.el7.x86_64 libvorbis-1.3.3-8.el7.1.x86_64 libxcb-1.13-1.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 lmdb-libs-0.9.22-2.el7.x86_64 nettle-2.7.1-4.el7.x86_64 numactl-libs-2.0.12-3.el7_7.1.x86_64 openblas-serial-0.3.3-2.el7.x86_64 opencore-amr-0.1.5-6.el7.x86_64 openjpeg2-2.3.1-1.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 opus-1.0.2-6.el7.x86_64 p11-kit-0.20.7-3.el7.x86_64 pango-1.42.4-2.el7_6.x86_64 pcre-8.32-14.el7.x86_64 pixman-0.34.0-1.el7.x86_64 
snappy-1.1.0-3.el7.x86_64 soxr-0.1.2-1.el7.x86_64 speex-1.2-0.19.rc1.el7.x86_64 trousers-0.3.11.2-3.el7.x86_64 vo-amrwbenc-0.1.3-1.el7.x86_64 x264-libs-0.148-23.20170521gitaaa9aa8.el7.x86_64 x265-libs-2.9-3.el7.x86_64 xvidcore-1.3.4-2.el7.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-18.el7.x86_64 zvbi-0.2.35-1.el7.x86_64
(gdb) bt
#0 0x00007fffec35b2c7 in raise () from /usr/lib64/libc.so.6
#1 0x00007fffec35c9b8 in abort () from /usr/lib64/libc.so.6
#2 0x00007fffecc9a2cd in __gnu_cxx::__verbose_terminate_handler () at /home/eaverin/Downloads/gcc-build-dev/gcc-6.2.0/libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007fffecc982a6 in __cxxabiv1::__terminate (handler=) at /home/eaverin/Downloads/gcc-build-dev/gcc-6.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00007fffecc972d9 in __cxa_call_terminate (ue_header=ue_header@entry=0xbd9ad0) at /home/eaverin/Downloads/gcc-build-dev/gcc-6.2.0/libstdc++-v3/libsupc++/eh_call.cc:54
#5 0x00007fffecc97c2d in __cxxabiv1::__gxx_personality_v0 (version=, actions=, exception_class=5138137972254386944, ue_header=, context=0x7fffffffd3a0)
at /home/eaverin/Downloads/gcc-build-dev/gcc-6.2.0/libstdc++-v3/libsupc++/eh_personality.cc:676
#6 0x00007fffec7018a3 in ?? () from /usr/lib64/libgcc_s.so.1
#7 0x00007fffec701dd7 in _Unwind_Resume () from /usr/lib64/libgcc_s.so.1
#8 0x00007fffed4e79a3 in onnxruntime::CudaCall<cudaError, true> (retCode=cudaErrorCudartUnloading, exprString=0x7fffedc6f089 "cudaEventSynchronize(e)", libName=0x7fffedc6efc7 "CUDA",
successCode=cudaSuccess, msg=0x7fffedc6efa3 "") at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/providers/cuda/cuda_call.cc:95
#9 0x00007fffed082e26 in onnxruntime::CUDAExecutionProvider::~CUDAExecutionProvider (this=0xc69ab0, __in_chrg=)
at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/providers/cuda/cuda_execution_provider.cc:100
#10 0x00007fffed082fd8 in onnxruntime::CUDAExecutionProvider::~CUDAExecutionProvider (this=0xc69ab0, __in_chrg=)
at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/providers/cuda/cuda_execution_provider.cc:107
#11 0x00007fffecffcf80 in std::default_delete<onnxruntime::IExecutionProvider>::operator() (this=0x12d8600, __ptr=0xc69ab0) at /usr/include/c++/4.8.2/bits/unique_ptr.h:67
#12 0x00007fffecff5f09 in std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >::~unique_ptr (this=0x12d8600, __in_chrg=)
at /usr/include/c++/4.8.2/bits/unique_ptr.h:184
#13 0x00007fffed00fa5a in std::_Destroy<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> > > (__pointer=0x12d8600)
at /usr/include/c++/4.8.2/bits/stl_construct.h:93
#14 0x00007fffed00a552 in std::_Destroy_aux::__destroy<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >> (__first=0x12d8600,
__last=0x12d8610) at /usr/include/c++/4.8.2/bits/stl_construct.h:103
#15 0x00007fffed005035 in std::_Destroy<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >> (__first=0x12d8600, __last=0x12d8610)
at /usr/include/c++/4.8.2/bits/stl_construct.h:126
#16 0x00007fffecffceaf in std::_Destroy<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >, std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> > > (__first=0x12d8600, __last=0x12d8610) at /usr/include/c++/4.8.2/bits/stl_construct.h:151
#17 0x00007fffecff5d75 in std::vector<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >, std::allocator<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> > > >::~vector (this=0xc4fad0, __in_chrg=) at /usr/include/c++/4.8.2/bits/stl_vector.h:415
#18 0x00007fffed03c20a in onnxruntime::ExecutionProviders::~ExecutionProviders (this=0xc4fad0, __in_chrg=)
at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/framework/execution_providers.h:21
#19 0x00007fffed03d79e in onnxruntime::InferenceSession::~InferenceSession (this=0xc4f780, __in_chrg=)
at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/session/inference_session.cc:268
#20 0x00007fffed03da4e in onnxruntime::InferenceSession::~InferenceSession (this=0xc4f780, __in_chrg=)
at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/session/inference_session.cc:286
#21 0x00007fffecff2e00 in OrtApis::ReleaseSession (value=0xc4f780) at /home/yyh/3rdparty/onnxruntime-bak/onnxruntime/core/session/onnxruntime_c_api.cc:1476
#22 0x00000000004051c4 in Ort::OrtRelease(OrtSession) ()
#23 0x0000000000405d3d in Ort::Base::~Base() ()
#24 0x000000000040578e in Ort::Session::~Session() ()
#25 0x00000000004067b0 in MNIST::~MNIST() ()
#26 0x00000000004067e6 in std::default_delete<MNIST>::operator()(MNIST*) const ()
#27 0x00000000004062bd in std::unique_ptr<MNIST, std::default_delete<MNIST> >::~unique_ptr() ()
#28 0x00007fffec35ec29 in __run_exit_handlers () from /usr/lib64/libc.so.6
#29 0x00007fffec35ec77 in exit () from /usr/lib64/libc.so.6
#30 0x00007fffec34749c in __libc_start_main () from /usr/lib64/libc.so.6
---Type <return> to continue, or q <return> to quit---
#31 0x00000000004021d9 in _start ()
My code:
"""
// 3. Fill the input buffer from each cv::Mat row and run inference on it
float* input = mnist_->input_image_.data();
for (int i = 0; i < pred_total; ++i)
{
    // Zero the input buffer, then copy row i of imgs (one flattened image) into it
    std::fill(mnist_->input_image_.begin(), mnist_->input_image_.end(), 0.f);
    float* one_imgs_addr = &imgs.at<float>(i, 0);
    memcpy(input, one_imgs_addr, imgs.cols * sizeof(float));
    // Compare the predicted digit with the ground-truth label
    int64_t y_pred = mnist_->Run();
    int64_t y_real = labs.at<uint8_t>(i, 0);
    if (y_pred == y_real)
        acce_total++;
}
"""