Conv cudnn 3d #5783

Merged
merged 12 commits into from
Nov 27, 2017

typhoonzero
Contributor

@typhoonzero typhoonzero commented Nov 20, 2017

Fix #5784

  int group_offset_out =
-     output_channels / groups * output_height * output_width;
+     output_channels / groups * output_height * output_width * output_depth;
  int group_offset_filter = filter->numel() / groups;
Contributor

Do you think it's simpler to write this?

Contributor Author

According to http://www.cplusplus.com/reference/vector/vector/erase/

Because vectors use an array as their underlying storage, erasing elements in positions other than the vector end causes the container to relocate all the elements after the segment erased to their new positions.

Erasing the first two elements forces every remaining element to be moved to a new position, which is not efficient.
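The suggested code is not preserved in this thread, so the following is only a sketch of the trade-off being discussed. It assumes a shape vector laid out as {N, C, D, H, W} from which the spatial dimensions are needed; the helper names SpatialDims and DropBatchAndChannel are hypothetical and do not come from the PR.

```cpp
#include <vector>

// Hypothetical helper: copy the spatial dims (everything after N and C)
// into a new vector. The input is left untouched.
// Precondition: shape.size() >= 2.
std::vector<int> SpatialDims(const std::vector<int>& shape) {
  return std::vector<int>(shape.begin() + 2, shape.end());
}

// Hypothetical alternative: erase the first two entries in place.
// Every remaining element is shifted toward the front of the vector.
// Precondition: shape->size() >= 2.
void DropBatchAndChannel(std::vector<int>* shape) {
  shape->erase(shape->begin(), shape->begin() + 2);
}
```

Both produce the same contents. For a five-element shape vector the cost difference is negligible, so the choice is mostly about readability and whether the original shape is still needed afterwards.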

  int group_offset_out =
-     output_channels / groups * output_height * output_width;
+     output_channels / groups * output_height * output_width * output_depth;
  int group_offset_filter = filter->numel() / groups;
Contributor

Group convolution is supported in cuDNN 7.0.
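To illustrate why the diff adds output_depth to the offset, here is a minimal sketch of the per-group arithmetic. It assumes an NCDHW output in which each group's channels are contiguous, and that the kernel steps through groups by pointer offset when the library does not handle groups itself. The function names are hypothetical, and int64_t is used here where the diff uses int.

```cpp
#include <cstdint>

// Number of output elements belonging to one group within a single batch
// item. Parameter names follow the diff above. Omitting output_depth would
// undercount the slice for 3D convolution.
int64_t GroupOffsetOut(int64_t output_channels, int64_t groups,
                       int64_t output_depth, int64_t output_height,
                       int64_t output_width) {
  return output_channels / groups * output_height * output_width *
         output_depth;
}

// Hypothetical helper: start of group g's slice within one batch item.
const float* GroupOutputPtr(const float* output, int64_t g,
                            int64_t group_offset_out) {
  return output + g * group_offset_out;
}
```

With 8 output channels, 2 groups, and a 3 x 4 x 5 output volume, each group covers 4 * 4 * 5 * 3 = 240 elements, so the second group starts 240 elements into the buffer.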

cudnnConvolutionDescriptor_t cudnn_conv_desc =
conv_desc.descriptor<T>(paddings, strides, dilations);

#if CUDNN_VERSION > 6000
Contributor

#if CUDNN_VERSION > 6000 should be #if CUDNN_VERSION >= 7000 or #if CUDNN_VERSION_MIN(7,0,0)

This place needs to be changed too.
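The reason for this change can be sketched as follows. In the cuDNN 6 and 7 headers, CUDNN_VERSION is encoded as major * 1000 + minor * 100 + patchlevel, so a cuDNN 6 patch release such as 6021 also satisfies "> 6000" even though it lacks the 7.0 feature. The macro definition below is an assumption about how CUDNN_VERSION_MIN is written, and the fallback CUDNN_VERSION value and the name kHasNativeGroups are hypothetical, included only so the sketch compiles without cudnn.h.

```cpp
// Fallback so this sketch compiles without cudnn.h. Real builds take
// CUDNN_VERSION from the cuDNN header. 7005 is a hypothetical value.
#ifndef CUDNN_VERSION
#define CUDNN_VERSION 7005
#endif

// Assumed definition, matching the major*1000 + minor*100 + patch encoding.
#define CUDNN_VERSION_MIN(major, minor, patch) \
  (CUDNN_VERSION >= ((major) * 1000 + (minor) * 100 + (patch)))

// True only when compiling against cuDNN 7.0.0 or newer.
constexpr bool kHasNativeGroups =
#if CUDNN_VERSION_MIN(7, 0, 0)
    true;
#else
    false;
#endif
```

The macro form states the intended minimum version directly, which makes it harder to repeat the off-by-a-patch-release mistake.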

@@ -155,19 +200,34 @@ class CudnnConvGradOpKernel : public framework::OpKernel<T> {
cudnnTensorDescriptor_t cudnn_input_grad_desc = nullptr;
Contributor

cudnn_input_grad_desc and cudnn_input_desc are the same, so you can replace cudnn_input_grad_desc with cudnn_input_desc. Just like this.

Contributor

@chengduoZH chengduoZH left a comment

LGTM++

@typhoonzero typhoonzero merged commit a06bec1 into PaddlePaddle:develop Nov 27, 2017
@typhoonzero typhoonzero deleted the conv_cudnn_3d branch December 22, 2017 05:43
2 participants