-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stride pooling for seqlastin and seqfirstin #1701
Conversation
考虑到average和max的计算单元接口如下,都使用IVectorPtr来传递第二个参数:
因此,将 |
@@ -67,6 +66,14 @@ void SequencePoolLayer::forward(PassType passType) { | |||
<< "when trans_type = seq, input must hasSubseq"; | |||
output_.degradeSequence(input); | |||
} | |||
if (stride_ > 0) { | |||
CHECK_EQ(input.hasSubseq(), 0UL) | |||
<< "sequence stride pooling is not suitable for hasSubseq now"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is not suitable --> is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -2495,10 +2496,11 @@ def __init__(self, | |||
**xargs) | |||
config_assert( | |||
len(inputs) == 1, 'SequenceLastInstanceLayer must have 1 input') | |||
if trans_type == 'seq': | |||
config_assert(stride == -1, 'subseq do not support stride window') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not --> does not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
paddle/parameter/Argument.cpp
Outdated
// empty sequence | ||
tgtBuf[seqId + 1] = tgtBuf[seqId]; | ||
} else if (seqLength < stride) { | ||
tgtBuf[seqId + 1] = tgtBuf[seqId] + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
可以不需要 else if (seqLength < stride) 这个条件分枝,计算逻辑只在 seqLength % stride 时,有差别。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已去掉该分支。
paddle/parameter/Argument.cpp
Outdated
tgtBuf[seqId + 1] = tgtBuf[seqId] + 1; | ||
} else { | ||
tgtBuf[seqId + 1] = tgtBuf[seqId] + ceil((float)seqLength / stride); | ||
int size = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- 590 ~ 592 行代码可以简化,先计算size,tgtBuf[seqId + 1] = tgtBuf[seqId] + size
size_t size = (seqLength % stride) ? seqLength / stride : seqLength / stride + 1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已简化
paddle/parameter/Argument.cpp
Outdated
* If input.sequenceStartPositions = [0, 9, 14, 17, 30] and stride = 5, | ||
* then sequenceStartPositions = [0, 2, 3, 4, 7], | ||
* and stridePostions = [0, 5, 9, 14, 17, 22, 27, 30] | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- stridePostions 的计算少了一个 “输入序列是逆序”这个分枝。
- stridePostions 在序列正序和逆序时的计算结果不一样
上面的例子中:
stridePostions = [0, 5, 9, 14, 17, 22, 27, 30] # 正序
stridePostions = [0, 4, 9, 14, 17, 20, 25, 30] # 逆序
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@lcy-seso 已根据要求进行修改 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
almost LGTM. only need some small changes.
paddle/parameter/Argument.cpp
Outdated
*/ | ||
CHECK(input.sequenceStartPositions); | ||
CHECK_EQ(input.hasSubseq(), 0UL); | ||
CHECK_GT(stride, 0) << "stride must larger than 0"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- 调用 poolSequenceWithStride 这个函数时,外围的逻辑分枝已经是 stride > 0,没有subSeq,所以不再需要 573 ~ 574 这两行的检查。
- sequenceStartPositions 在调用这个函数之前已经使用(访问)过,572 行已经不再需要检查。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
虽然这次外围的逻辑分支都已经检查过了,但以后调用时可能会出现外围没有做check的情况,因此想保留这些check。
paddle/parameter/Argument.cpp
Outdated
} else { | ||
int size = ceil((float)seqLength / stride); | ||
tgtBuf[seqId + 1] = tgtBuf[seqId] + size; | ||
for (int i = 0; i < size - 1; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for 循环 中的循环变量i 用 ++i
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
paddle/parameter/Argument.h
Outdated
After pooling with stride n (n is smaller than sequence length), | ||
a long sequence will be shorten. | ||
This function is not suitable for sequence with sub-sequence now. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not suitable for sequence with sub-sequence now. --> This function is invalid for sequence having sub-sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layer_attr=None): | ||
""" | ||
Get Last Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get last timestamp upon a stride window of sequence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里的注释可以修改一下,第一句话变成下面这样。
If stride > 0, this layer slides a window whose size is determined by stride, and return the first value of the window as the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layer_attr=None): | ||
""" | ||
Get Last Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get last timestamp upon a stride window of sequence. | ||
And a long sequence will be shorten. Note that for sequence with | ||
sub-sequence, stride is default -1 now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stride is default -1 now. --> the default value of stride is -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layer_attr=None): | ||
""" | ||
Get First Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get first timestamp upon a stride window of sequence, | ||
and a long sequence will be shorten. Note that for sequence with | ||
sub-sequence, stride is default -1 now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stride is default -1 now. --> the default value of stride is -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
int stride_; | ||
// store the start position of each stride window | ||
IVectorPtr stridePositions_; | ||
// Whether it is reversed sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whether the input sequence is reversed or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -42,6 +46,11 @@ class SequencePoolLayer : public Layer { | |||
enum SequenceLevel { kNonSeq = 0, kSeq = 1 }; | |||
size_t newBatchSize_; | |||
ICpuGpuVectorPtr startPositions_; | |||
int stride_; | |||
// store the start position of each stride window |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store the start position of each window
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -26,6 +26,10 @@ namespace paddle { | |||
* Output: output size is the number of input sequences (NOT input instances) | |||
* output[i] = seqlastin/average/max_{for each instance in this | |||
* sequence}{input[i]} | |||
* If stride_ > 0: | |||
* Check input sequence must don't have sub-sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check input sequence must not have sub-sequence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -25,6 +25,9 @@ namespace paddle { | |||
* Input: a sequence | |||
* If SequenceLevel = kNonseq: | |||
* Output: a sequence containing only the last instance of the input sequence | |||
* If stride_ > 0: | |||
* Output: a shorten sequence containing several last instances of the | |||
* input sequence with stride window. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Output: a shorten sequence. The operation of getting last instance of a sequence is independently performed on every slice of the input sequence, that is obtained by sliding a window with the window size set to stride_.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已按要求修改
@@ -25,6 +25,9 @@ namespace paddle { | |||
* Input: a sequence | |||
* If SequenceLevel = kNonseq: | |||
* Output: a sequence containing only the last instance of the input sequence | |||
* If stride_ > 0: | |||
* Output: a shorten sequence containing several last instances of the | |||
* input sequence with stride window. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -26,6 +26,10 @@ namespace paddle { | |||
* Output: output size is the number of input sequences (NOT input instances) | |||
* output[i] = seqlastin/average/max_{for each instance in this | |||
* sequence}{input[i]} | |||
* If stride_ > 0: | |||
* Check input sequence must don't have sub-sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -42,6 +46,11 @@ class SequencePoolLayer : public Layer { | |||
enum SequenceLevel { kNonSeq = 0, kSeq = 1 }; | |||
size_t newBatchSize_; | |||
ICpuGpuVectorPtr startPositions_; | |||
int stride_; | |||
// store the start position of each stride window |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
int stride_; | ||
// store the start position of each stride window | ||
IVectorPtr stridePositions_; | ||
// Whether it is reversed sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
paddle/parameter/Argument.cpp
Outdated
*/ | ||
CHECK(input.sequenceStartPositions); | ||
CHECK_EQ(input.hasSubseq(), 0UL); | ||
CHECK_GT(stride, 0) << "stride must larger than 0"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
虽然这次外围的逻辑分支都已经检查过了,但以后调用时可能会出现外围没有做check的情况,因此想保留这些check。
layer_attr=None): | ||
""" | ||
Get Last Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get last timestamp upon a stride window of sequence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layer_attr=None): | ||
""" | ||
Get Last Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get last timestamp upon a stride window of sequence. | ||
And a long sequence will be shorten. Note that for sequence with | ||
sub-sequence, stride is default -1 now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -1357,6 +1362,8 @@ def last_seq(input, | |||
:type name: basestring | |||
:param input: Input layer name. | |||
:type input: LayerOutput | |||
:param stride: parameter of stride window. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layer_attr=None): | ||
""" | ||
Get First Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get first timestamp upon a stride window of sequence, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layer_attr=None): | ||
""" | ||
Get First Timestamp Activation of a sequence. | ||
|
||
If stride > 0, get first timestamp upon a stride window of sequence, | ||
and a long sequence will be shorten. Note that for sequence with | ||
sub-sequence, stride is default -1 now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -37,6 +42,7 @@ class SequenceLastInstanceLayer : public SequencePoolLayer { | |||
protected: | |||
MatrixPtr tmpSrc_; | |||
MatrixPtr tmpDest_; | |||
std::vector<int> insId_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
insId_
==> instanceIds_
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -42,6 +46,11 @@ class SequencePoolLayer : public Layer { | |||
enum SequenceLevel { kNonSeq = 0, kSeq = 1 }; | |||
size_t newBatchSize_; | |||
ICpuGpuVectorPtr startPositions_; | |||
int stride_; | |||
// store the start position of each window |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store ... window
==> Store ... window.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
int stride_; | ||
// store the start position of each window | ||
IVectorPtr stridePositions_; | ||
// Whether the input sequence is reversed or not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not
==> not .
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
void testDegradeLayer(bool hasSubseq, | ||
string layer_type, | ||
string trans_type, | ||
int stride = -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default argument value 是可以用的,但是一般不用,原因在这里: https://google.github.io/styleguide/cppguide.html#Default_Arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
paddle/parameter/Argument.cpp
Outdated
size_t stride, | ||
IVectorPtr* stridePostions, | ||
bool reversed) { | ||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
在.cpp文件里,用C++ style comment //
更能统一风格。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
part one of #1608:
Average和Max的计算过程比SequenceLastInstance略复杂,会在下一个PR中进行完善。