Adapt warpctc grad op for gradient checking #7414
Conversation
wanghaoshuang commented on Jan 10, 2018
- Fix warpctc grad op
- Add check grad test
- Add sequence scale functor
2. Add check grad test
 public:
  void operator()(const platform::CPUDeviceContext& context,
                  framework::LoDTensor& seq, const T* scales,
                  const size_t num_seq) {
It seems there is no need to pass num_seq; it can be obtained from framework::LoDTensor& seq.
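A minimal sketch of the suggested change, assuming the functor reads the level-0 LoD (the level choice and the elided body are illustrative, not the PR's actual code):

    // Hypothetical sketch: derive the sequence count from the tensor's LoD
    // instead of passing num_seq explicitly (assumes level-0 LoD).
    void operator()(const platform::CPUDeviceContext& context,
                    framework::LoDTensor& seq, const T* scales) {
      const size_t level = 0;
      auto lod = seq.lod();
      // The offset vector has num_seq + 1 entries.
      const size_t num_seq = lod[level].size() - 1;
      // ... scale each sequence's data by scales[i] as before ...
    }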
Thx. Fixed.
    T* seq_data = seq.mutable_data<T>(context.GetPlace());

    int threads = 1024;
    int grid = (seq.numel() * seq_width + threads - 1) / threads;
For the CUDA kernel, we can use one block to process one sample.
int threads = 1024;
int grid = abs_offset_lod[level].size();
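A rough sketch of such a one-block-per-sample kernel; the kernel name, the device-side absolute-offset array, and the seq_width parameter are assumptions for illustration, not the PR's actual code:

    // Hypothetical CUDA kernel: block i scales sequence i; the block's threads
    // stride over that sequence's elements. offsets holds absolute sequence
    // offsets (num_seq + 1 entries), expanded here by the per-timestep width.
    template <typename T>
    __global__ void SequenceScaleKernel(T* seq_data, const size_t* offsets,
                                        const T* scales, const size_t seq_width) {
      const size_t begin = offsets[blockIdx.x] * seq_width;
      const size_t end = offsets[blockIdx.x + 1] * seq_width;
      for (size_t i = begin + threadIdx.x; i < end; i += blockDim.x) {
        seq_data[i] *= scales[blockIdx.x];
      }
    }

    // Launch with one block per sequence, e.g.:
    //   int threads = 1024;
    //   int grid = num_seq;  // number of sequences at the chosen LoD level
    //   SequenceScaleKernel<T><<<grid, threads, 0, context.stream()>>>(
    //       seq_data, device_offsets, scales, seq_width);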
Thx. Fixed.
  }
};

template class ScaleLoDTensorFunctor<platform::CUDADeviceContext, float>;
The double type is also needed.
Also, the double type needs to be registered in paddle/operators/warpctc_op.cc; please help to enhance it.
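For illustration, a hedged sketch of what the double support could look like, following the instantiation and registration pattern already visible above (hypothetical; as the reply below notes, it was not adopted because warp-CTC lacks double support):

    // Hypothetical additions, not part of the final PR:
    // 1) Instantiate the functor for double alongside float:
    template class ScaleLoDTensorFunctor<platform::CUDADeviceContext, double>;

    // 2) Register a double kernel in paddle/operators/warpctc_op.cc:
    REGISTER_OP_CPU_KERNEL(
        warpctc,
        ops::WarpCTCKernel<paddle::platform::CPUDeviceContext, float>,
        ops::WarpCTCKernel<paddle::platform::CPUDeviceContext, double>);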
It seems that the API in warp-CTC doesn't support double.
@@ -191,10 +191,10 @@ def setUp(self):
    def test_check_output(self):
        self.check_output()

    def test_check_grad(self):
        self.outputs['WarpCTCGrad'] = self.gradient
        self.check_grad(["Logits"], "Loss", max_relative_error=0.01)
Can the gradient check pass when max_relative_error is lower?
It will fail when max_relative_error = 0.005.
Changed max_relative_error from 0.01 to 0.007.
2. Remove num_seq arguments. 3. Refine CUDA kernel of ScaleLoDTensorFunctor. 4. Change max_relative_error of gradient unit test to 0.007
1. Fix kernel 2. Add more test cases