add tensorrt #9891

Merged
merged 15 commits into PaddlePaddle:develop from fea/add_tensorrt
Apr 16, 2018

Conversation

Contributor

@Superjomn Superjomn commented Apr 13, 2018

fixes: #9921

This is a naive test for TensorRT library integration with Paddle.

CMakeLists.txt Outdated
@@ -39,6 +39,7 @@ option(WITH_GPU "Compile PaddlePaddle with NVIDIA GPU" ${CUDA_F
option(WITH_AMD_GPU "Compile PaddlePaddle with AMD GPU" OFF)
option(WITH_AVX "Compile PaddlePaddle with AVX intrinsics" ${AVX_FOUND})
option(WITH_MKL "Compile PaddlePaddle with MKL support." ${AVX_FOUND})
option(WITH_TENSORRT "Compile PaddlePaddle with TensorRT support." ON)
Contributor Author

Will turn this off later. Changes to the TeamCity config should be made more cautiously.

@Superjomn Superjomn requested review from Xreki and luotao1 April 15, 2018 11:23
@Xreki Xreki added the Inference (预测; originally named Inference, covers C-API inference issues, etc.) label Apr 16, 2018
Dockerfile Outdated
@@ -45,6 +45,12 @@ ENV PATH=${PATH}:${GOROOT}/bin:${GOPATH}/bin
# install glide
RUN curl -s -q https://glide.sh/get | sh

# Install TensorRT
RUN wget -qO- http://paddlepaddledeps.bj.bcebos.com/TensorRT-4.0.0.3.Ubuntu-16.04.4.x86_64-gnu.cuda-8.0.cudnn7.0.tar.gz | \
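Contributor

  1. The tar.gz here differs from the one downloaded from the official site: it contains only the include and lib directories, which cuts the package size by about two thirds and saves download time. Please add a comment explaining this.
  2. The package downloaded from this link still contains a targets directory; that directory can be dropped when packaging.
TensorRT
├── include
├── lib
└── targets
  3. The NvInfer.h used here has been slightly modified from the original version; otherwise it produces an error. Consider opening an issue that describes the error, then add a comment here.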

Dockerfile Outdated
@@ -57,8 +63,7 @@ RUN localedef -i en_US -f UTF-8 en_US.UTF-8
# specify sphinx version as 1.5.6 and remove -U option for [pip install -U
# sphinx-rtd-theme] since -U option will cause sphinx being updated to newest
# version(1.7.1 for now), which causes building documentation failed.
RUN pip install --upgrade pip && \
pip install -U wheel && \
RUN pip install -U wheel && \
Contributor

This needs to be updated once #9926 is merged.

@@ -1,6 +1,6 @@
cc_library(dynamic_loader SRCS dynamic_loader.cc DEPS glog gflags enforce)

list(APPEND CUDA_SRCS cublas.cc cudnn.cc curand.cc nccl.cc)
list(APPEND CUDA_SRCS cublas.cc cudnn.cc curand.cc nccl.cc tensorrt.cc)
Contributor

A build option is needed here to choose whether tensorrt.cc is added.

Contributor Author

done


// Fix the dynload issue, the following two API are implemented in TensorRT's
// header file, cannot load from the dynamic library. So create our own
// implementation and directly trigger the method from the dynamic library.
Contributor

The comment on lines 58-60 needs updating:

  • fix the dynload issue: where is this issue?
  • API -> APIs
  • but cannot be loaded from

Contributor Author

Updated.
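
The code comment under review says two TensorRT entry points are defined inline in the header, so they cannot be loaded from the shared library, and the call stack later in this thread shows createInferBuilder(nvinfer1::ILogger&) forwarding to a dynloaded createInferBuilder_INTERNAL. The pattern can be sketched as follows. This is only an illustration under stated assumptions: Logger, Builder, createBuilder_INTERNAL, ResolveFactory and kApiVersion are hypothetical stand-ins, and a lookup table replaces dlopen/dlsym so that the sketch runs without TensorRT installed.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the vendor types (not the real nvinfer1 classes).
struct Logger {};
struct Builder { int api_version; };

// The only symbol the "shared library" exports: a C-linkage factory.
extern "C" void* createBuilder_INTERNAL(void* logger, int version) {
  assert(logger != nullptr);
  static Builder builder{0};
  builder.api_version = version;
  return &builder;
}

using FactoryFn = void* (*)(void*, int);

// Stand-in for dlopen/dlsym: resolve an exported symbol by name at run time.
inline FactoryFn ResolveFactory(const std::string& name) {
  static const std::unordered_map<std::string, FactoryFn> exported = {
      {"createBuilder_INTERNAL", &createBuilder_INTERNAL},
  };
  const auto it = exported.find(name);
  return it == exported.end() ? nullptr : it->second;
}

constexpr int kApiVersion = 4003;  // hypothetical version constant

// Our own replacement for the header-inline wrapper: same signature, but it
// calls through a pointer that is resolved once, on first use. Function-local
// statics are initialised thread-safely since C++11.
inline Builder* createBuilder(Logger& logger) {
  static const FactoryFn factory = ResolveFactory("createBuilder_INTERNAL");
  if (factory == nullptr) return nullptr;  // symbol missing from the library
  return static_cast<Builder*>(factory(&logger, kApiVersion));
}
```

Callers keep writing createBuilder(logger) as if the header version were in use; only the wrapper knows that the real work happens behind a pointer resolved at run time.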

CMakeLists.txt Outdated
@@ -179,6 +180,7 @@ set(EXTERNAL_LIBS

if(WITH_GPU)
include(cuda)
set(WITH_TENSORRT ON)
Contributor

This line can be removed.

Contributor Author

done

Dockerfile Outdated
RUN wget -qO- http://paddlepaddledeps.bj.bcebos.com/TensorRT-4.0.0.3.Ubuntu-16.04.4.x86_64-gnu.cuda-8.0.cudnn7.0.tar.gz | \
tar -xz -C /usr/local && \
cp -rf /usr/local/TensorRT/include /usr/local && \
cp -rf /usr/local/TensorRT/lib /usr/local
Contributor

  1. There is no tensorrt.cmake analogous to cudnn.cmake, so users cannot install TensorRT to a custom path. This can be added in a later PR.
  2. It is currently installed directly into /usr/local/include and /usr/local/lib. Like cuda and go, it should have its own directory. This can be improved in a later PR together with item 1.
    [image]

Contributor Author

@Superjomn Superjomn Apr 16, 2018

For now it is copied directly under /usr; this can be changed in a follow-up PR.

Contributor

luotao1 commented Apr 16, 2018

The build succeeds, but running the unit test produces the following failure:

75: unknown file: Failure
75: C++ exception with description "Failed to find dynamic library: libnvinfer.so ( libnvinfer.so: cannot open shared object file: No such file or directory ) 
75:  Please specify its path correctly using following ways: 
75:  Method. set environment variable LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH on Mac OS. 
75:  For instance, issue command: export LD_LIBRARY_PATH=... 
75:  Note: After Mac OS 10.11, using the DYLD_LIBRARY_PATH is impossible unless System Integrity Protection (SIP) is disabled. at [/Paddle/paddle/fluid/platform/dynload/dynamic_loader.cc:133]
75: PaddlePaddle Call Stacks: 
75: 0             0x432bf9p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 761
75: 1             0x4b2367p
75: 2             0x4b3032p paddle::platform::dynload::GetTensorRtDsoHandle() + 98
75: 3             0x435371p void std::__once_call_impl<std::_Bind_simple<decltype (createInferBuilder_INTERNAL({parm#1}...)) paddle::platform::dynload::DynLoad__createInferBuilder_INTERNAL::operator()<nvinfer1::ILogger*, int>(nvinfer1::ILogger*, int)::{lambda()#1} ()> >() + 33
75: 4       0x7f0d873a1a99p
75: 5             0x42ce5fp createInferBuilder(nvinfer1::ILogger&) + 111
75: 6             0x42d028p CreateNetwork() + 72
75: 7             0x42f138p TensorrtTest_BasicFunction_Test::TestBody() + 40
75: 8             0x4d7da3p void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 67
75: 9             0x4cdb8ap testing::Test::Run() + 186
75: 10            0x4cdcd8p testing::TestInfo::Run() + 280
75: 11            0x4cdde5p testing::TestCase::Run() + 229
75: 12            0x4d0287p testing::internal::UnitTestImpl::RunAllTests() + 583
75: 13            0x4d05b9p testing::UnitTest::Run() + 89
75: 14            0x42c349p main + 329
75: 15      0x7f0d86645830p __libc_start_main + 240
75: 16            0x42c989p _start + 41
75: " thrown in the test body.
75: [  FAILED  ] TensorrtTest.BasicFunction (1 ms)
75: [----------] 1 test from TensorrtTest (1 ms total)
75: 
75: [----------] Global test environment tear-down
75: [==========] 1 test from 1 test case ran. (1 ms total)
75: [  PASSED  ] 0 tests.
75: [  FAILED  ] 1 test, listed below:
75: [  FAILED  ] TensorrtTest.BasicFunction
75: 
75:  1 FAILED TEST
1/1 Test #75: test_tensorrt ....................***Failed    5.52 sec

0% tests passed, 1 tests failed out of 1

Copying the relevant .so files from /usr/local/lib to /usr/lib fixes this.
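
The error message points at LD_LIBRARY_PATH, and the fix above moves the libraries into /usr/lib. As a rough model of why both work, the sketch below lists the paths a loader would try under a deliberately simplified rule: LD_LIBRARY_PATH entries first, then the default directories /lib and /usr/lib. CandidatePaths is a hypothetical helper, not part of Paddle, and the real loader also consults rpath/runpath entries and the ld.so cache, which this sketch ignores. Empty entries are skipped here for brevity.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: paths to try for `lib_name`, in order, under the
// simplified rule "LD_LIBRARY_PATH entries first, then default directories".
inline std::vector<std::string> CandidatePaths(
    const std::string& lib_name, const std::string& ld_library_path,
    const std::vector<std::string>& default_dirs = {"/lib", "/usr/lib"}) {
  assert(!lib_name.empty());
  std::vector<std::string> candidates;

  // LD_LIBRARY_PATH is a colon-separated list of directories.
  std::istringstream entries(ld_library_path);
  std::string dir;
  while (std::getline(entries, dir, ':')) {
    if (!dir.empty()) candidates.push_back(dir + "/" + lib_name);
  }

  // Fall back to the default directories.
  for (const auto& d : default_dirs) {
    candidates.push_back(d + "/" + lib_name);
  }
  return candidates;
}
```

For example, CandidatePaths("libnvinfer.so", "") yields only the /lib and /usr/lib candidates, while passing "/usr/local/lib" as the second argument puts /usr/local/lib/libnvinfer.so first. Under this simplified model, either exporting LD_LIBRARY_PATH or copying the files into /usr/lib makes the library reachable.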

@@ -58,3 +58,11 @@ void GetWarpCTCDsoHandle(void** dso_handle);
*
*/
void GetLapackDsoHandle(void** dso_handle);
Contributor

DynamicLoader.h does not need to be modified; it is only used by the old Paddle.

Contributor

@luotao1 luotao1 left a comment

lgtm

@Superjomn Superjomn merged commit 1866597 into PaddlePaddle:develop Apr 16, 2018
@Superjomn Superjomn deleted the fea/add_tensorrt branch April 16, 2018 12:33