
Adapt warpctc grad op for gradient checking #7414

Merged 7 commits into PaddlePaddle:develop on Jan 15, 2018

Conversation

wanghaoshuang (Contributor)

  1. Fix warpctc grad op
  2. Add check grad test
  3. Add sequence scale functor

@wanghaoshuang changed the title from "Adapt warpctc grad op for gradient check" to "Adapt warpctc grad op for gradient checking" on Jan 10, 2018
 public:
  void operator()(const platform::CPUDeviceContext& context,
                  framework::LoDTensor& seq, const T* scales,
                  const size_t num_seq) {
Contributor:

It seems there is no need to pass num_seq; it can be obtained from framework::LoDTensor& seq.
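
For context, a minimal sketch of what the reviewer is pointing at, assuming the usual LoD convention that one level stores num_seq + 1 boundary offsets. NumSequences and the plain std::vector are illustrative stand-ins for the framework's LoD type, not the PR's actual code:

#include <cstddef>
#include <vector>

// A LoD level holds offsets into the flattened batch, e.g. {0, 4, 7, 12}
// describes 3 sequences of lengths 4, 3 and 5. The sequence count is
// therefore lod_level.size() - 1, so it need not be passed separately.
size_t NumSequences(const std::vector<size_t>& lod_level) {
  return lod_level.empty() ? 0 : lod_level.size() - 1;
}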

Contributor Author:

Thx. Fixed.

T* seq_data = seq.mutable_data<T>(context.GetPlace());

int threads = 1024;
int grid = (seq.numel() * seq_width + threads - 1) / threads;
Contributor:

For the CUDA kernel, we can use one block to process one sample:

int threads = 1024;
int grid = abs_offset_lod[level].size();
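
A hedged sketch of the layout the reviewer suggests: one CUDA block per sequence, with the block's threads striding over that sequence's slice of the flattened LoD tensor. The kernel name, the offsets array, and the small driver in main are illustrative only, not the code actually committed to ScaleLoDTensorFunctor:

#include <cstdio>
#include <cuda_runtime.h>

// One block per sequence: block b scales the slice
// [offsets[b], offsets[b+1]) * width of the flattened tensor by scales[b].
template <typename T>
__global__ void SequenceScaleKernel(T* data, const size_t* offsets,
                                    const T* scales, int width) {
  int seq = blockIdx.x;  // one block handles one sequence
  T scale = scales[seq];
  size_t begin = offsets[seq] * width;
  size_t end = offsets[seq + 1] * width;
  // Threads of the block stride over the sequence's elements.
  for (size_t i = begin + threadIdx.x; i < end; i += blockDim.x) {
    data[i] *= scale;
  }
}

int main() {
  // Two sequences of lengths 2 and 3, width 1, scales 2 and 10.
  const size_t h_offsets[] = {0, 2, 5};
  const float h_scales[] = {2.f, 10.f};
  float h_data[] = {1.f, 1.f, 1.f, 1.f, 1.f};

  float *d_data, *d_scales;
  size_t* d_offsets;
  cudaMalloc(&d_data, sizeof(h_data));
  cudaMalloc(&d_scales, sizeof(h_scales));
  cudaMalloc(&d_offsets, sizeof(h_offsets));
  cudaMemcpy(d_data, h_data, sizeof(h_data), cudaMemcpyHostToDevice);
  cudaMemcpy(d_scales, h_scales, sizeof(h_scales), cudaMemcpyHostToDevice);
  cudaMemcpy(d_offsets, h_offsets, sizeof(h_offsets), cudaMemcpyHostToDevice);

  const int threads = 1024;
  const int grid = 2;  // number of sequences
  SequenceScaleKernel<float><<<grid, threads>>>(d_data, d_offsets, d_scales, 1);

  cudaMemcpy(h_data, d_data, sizeof(h_data), cudaMemcpyDeviceToHost);
  for (float v : h_data) std::printf("%.1f ", v);  // 2.0 2.0 10.0 10.0 10.0
  std::printf("\n");
  cudaFree(d_data);
  cudaFree(d_scales);
  cudaFree(d_offsets);
  return 0;
}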

Contributor Author:

Thx. Fixed.

}
};

template class ScaleLoDTensorFunctor<platform::CUDADeviceContext, float>;
Contributor:

The double type is also needed.

Contributor:

Also, the double type needs to be registered in paddle/operators/warpctc_op.cc; please help to enhance it.

Contributor Author (@wanghaoshuang, Jan 12, 2018):

It seems that the warp-CTC API doesn't support double.

@@ -191,10 +191,10 @@ def setUp(self):
    def test_check_output(self):
        self.check_output()

    def test_check_grad(self):
        self.outputs['WarpCTCGrad'] = self.gradient
        self.check_grad(["Logits"], "Loss", max_relative_error=0.01)
Contributor:

Can the gradient check pass with a lower max_relative_error?

Contributor Author:

It fails when max_relative_error = 0.005.

Contributor Author (@wanghaoshuang, Jan 13, 2018):

Changed max_relative_error from 0.01 to 0.007.
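
For readers following the tolerance discussion, a small sketch of a finite-difference gradient check criterion: the largest elementwise relative discrepancy between the numeric and analytic gradients must stay below max_relative_error. The formula used here, |numeric - analytic| / max(|numeric|, |analytic|, eps), is an illustration of the idea, not necessarily the exact metric check_grad computes:

#include <algorithm>
#include <cmath>
#include <cstdio>

// Elementwise maximum relative error between a numeric (finite-difference)
// gradient and an analytic gradient. A check in the spirit of check_grad
// passes when this value stays below the configured tolerance (0.007 here).
double MaxRelativeError(const double* numeric, const double* analytic,
                        size_t n, double eps = 1e-8) {
  double worst = 0.0;
  for (size_t i = 0; i < n; ++i) {
    double denom =
        std::max({std::fabs(numeric[i]), std::fabs(analytic[i]), eps});
    worst = std::max(worst, std::fabs(numeric[i] - analytic[i]) / denom);
  }
  return worst;
}

int main() {
  const double numeric[] = {0.501, -1.203, 2.000};
  const double analytic[] = {0.500, -1.200, 2.005};
  double err = MaxRelativeError(numeric, analytic, 3);
  std::printf("max relative error: %.4f -> %s\n", err,
              err < 0.007 ? "pass" : "fail");
  return 0;
}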

2. Remove num_seq arguments.
3. Refine CUDA kernel of ScaleLoDTensorFunctor.
4. Change max_relative_error of gradient unit test to 0.007.

1. Fix kernel.
2. Add more test cases.
@wanghaoshuang merged commit 448fee3 into PaddlePaddle:develop on Jan 15, 2018
@wanghaoshuang deleted the warpctc branch on Jan 15, 2018
@wanghaoshuang self-assigned this on Jan 23, 2018