enable MKL Packed Recurrent Layer #6719
Conversation
Looking at the implementation, why can't MKLPackedRecurrentLayer directly inherit from RecurrentLayer?
real* weightPacked_;
real* weightTPacked_;
size_t weightHeight_;
size_t weightWidth_;
Please add comments to these variables. What is the difference between weightTPacked_ and weightPacked_? Is the former the transposed version?
OK, we will add the necessary comments.
weightWidth_,
1,
batch2->getData(),
weightWidth_);
Can this be simplified to:
cblas_sgemm_compute(CblasRowMajor,
CblasNoTrans,
CblasPacked,
batch2->getHeight(),
weightWidth_,
weightHeight_,
batch1->getData(),
weightHeight_,
transW ? weightTPacked_ : weightPacked_,
weightWidth_,
1,
batch2->getData(),
weightWidth_);
Yes, thx.
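For readers unfamiliar with the packed-GEMM API under discussion, here is a plain-C++ sketch (no MKL; all names here are ours, not Paddle's or MKL's) of the idea behind it: the weight matrix is rearranged ("packed") once up front and then reused for every batch, which is what `cblas_sgemm_alloc`/`cblas_sgemm_pack`/`cblas_sgemm_compute` amortize. Here packing is just a copy; MKL's real packed layout is opaque.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using real = float;

// Stand-in for cblas_sgemm_alloc + cblas_sgemm_pack: done once per weight.
std::vector<real> pack_b(const std::vector<real>& b) {
  return b;  // MKL would rearrange into a cache-friendly layout here
}

// Stand-in for cblas_sgemm_compute: C += A(MxK) * packedB(KxN), row-major.
// Called once per batch, reusing the packed weight each time.
void gemm_compute(std::size_t M, std::size_t N, std::size_t K,
                  const std::vector<real>& a,
                  const std::vector<real>& packedB,
                  std::vector<real>& c) {
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t j = 0; j < N; ++j)
      for (std::size_t k = 0; k < K; ++k)
        c[i * N + j] += a[i * K + k] * packedB[k * N + j];
}
```

The payoff is that the per-batch call skips the layout work entirely, which matters for RNNs where the same weight multiplies every time step.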
1.0,
weight->getData(),
weightWidth_,
weightTPacked_);
Please simplify lines 33-47:
auto pack = [&](real*& packed) {
  packed = cblas_sgemm_alloc(CblasBMatrix, 1, weightWidth_, weightHeight_);
  cblas_sgemm_pack();
};
pack(weightTPacked_);
pack(weightPacked_);
When initializing here, shouldn't only one of them be initialized depending on the trans value? Do both have to be initialized?
OK, we will optimize this further.
Also, the optimized logic is clearer; only one of them needs to be transposed.
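The two points above (deduplicate the pack calls with a lambda, and pack only the buffer the trans flag selects) can be sketched as follows. This is a hedged illustration: `PackedPair` and its members are stand-ins we invented, not Paddle code, and MKL's allocation is replaced by `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using real = float;

struct PackedPair {
  // Stand-ins for weightPacked_ / weightTPacked_ in the PR.
  std::vector<real> packed, packedT;
  bool hasPacked = false, hasPackedT = false;

  // w is an h x wdt row-major weight matrix.
  void init(const std::vector<real>& w, std::size_t h, std::size_t wdt,
            bool transW) {
    // One lambda covers both the plain and transposed layouts.
    auto pack = [&](std::vector<real>& dst, bool trans) {
      dst.assign(h * wdt, 0);
      for (std::size_t i = 0; i < h; ++i)
        for (std::size_t j = 0; j < wdt; ++j)
          dst[trans ? j * h + i : i * wdt + j] = w[i * wdt + j];
    };
    // Pack only the variant that will actually be used.
    if (transW) { pack(packedT, true);  hasPackedT = true; }
    else        { pack(packed, false);  hasPacked  = true; }
  }
};
```

The conditional mirrors the reviewer's question: there is no need to pay for both buffers when the trans flag already determines which one the forward pass will read.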
REGISTER_LAYER(mkl_packed_recurrent, MKLPackedRecurrentLayer);

bool MKLPackedRecurrentLayer::init(const LayerMap& layerMap,
Can MKLPackedRecurrentLayer inherit from RecurrentLayer? Much of the code in this .cpp is similar to RecurrentLayer.
Yes, it can. We will first extract Paddle's original recurrent layer into a header file.
sgemm_packed_.reset(new MKLPackedGemm(weight_->getW()));

return true;
bool MKLPackedRecurrentLayer::init(const LayerMap& layerMap, const ParameterMap& parameterMap) {
  RecurrentLayer::init(layerMap, parameterMap);
  sgemm_packed_.reset(new MKLPackedGemm(weight_->getW()));
  return true;
}
Got it.
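The agreed shape of the refactor can be sketched minimally. Class and member names below are placeholders, not Paddle's real types: the point is only that the derived `init()` chains to the base and then does the MKL-specific packing, and that it must return a value.

```cpp
#include <cassert>

struct RecurrentBase {
  bool initialized = false;
  virtual bool init() { initialized = true; return true; }
  virtual ~RecurrentBase() = default;
};

struct MKLPackedRecurrent : RecurrentBase {
  bool packed = false;
  bool init() override {
    if (!RecurrentBase::init()) return false;  // reuse all base setup
    packed = true;                             // then pack the weight once
    return true;                               // don't forget the return value
  }
};
```

This keeps the MKL layer's init down to the two lines the reviewer suggested, with everything shared living in the base class.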
CpuMatrix outputGrad(inputGrad->getHeight(), inputGrad->getWidth());
outputGrad.randomizeUniform();

for (int i = 0; i < 2; i++) {
Lines 468-539 contain too much duplicated code; please simplify.
OK, will fix. Thx.
@@ -420,12 +420,167 @@ TEST(Layer, LstmLayer) {
  }
}

#ifdef PADDLE_WITH_MKLML

LayerPtr initMKLPackedLayer(LayerConfig layerConfig,
Why can't this use initRecurrentLayer?
OK, this can be optimized further.
LayerPtr dataLayer =
    creatDataLayer("layer_0", batchSize, layerSize, false);
ParameterPtr para =
    creatParameter("para_0", 0, layerSize * layerSize, false);
Shouldn't the initialization of dataLayer and para be moved into checkMKLPackedLayer?
OK, no problem.
Updated from 7e18122 to 290edd8 (compare)
- Comments are missing in many places.
- Although MKLPackedRecurrentLayer.cpp is modeled on RecurrentLayer.cpp, RecurrentLayer.cpp has many arbitrarily chosen names and a lot of redundant code; please do not let those reappear in MKLPackedRecurrentLayer.cpp.
- The design document uses the cblas_?gemm API, but the code only uses cblas_sgemm. Please explain why, or update the design document.
MatrixPtr batch1 =
    batchValue_->getBatchValue(n - 1, batch2->getHeight());

// batch2->mul(*batch1, *weight_->getW(), 1, 1);
The dead code on line 62 can be deleted.
OK, thanks.
REGISTER_TIMER_INFO("RecurrentFwBatch", getName().c_str());
/* forward one batch */
for (size_t n = 0; n < batchValue_->getNumBatch(); n++) {
  MatrixPtr batch2 = batchValue_->getBatchValue(n);
The names batch2 and batch1 are poor:
batch2 -> batch
batch1 -> pre_batch
Same in backward.
OK, thanks.
for (size_t j = 0; j < batch2->getWidth(); j++) {
  *(batch2->getData() + i * batch2->getWidth() + j) =
      *(batch2->getData() + i * batch2->getWidth() + j) > 0
          ? *(batch2->getData() + i * batch2->getWidth() + j)
- getHeight(), getWidth(), and getData() can each be cached in a temporary variable defined outside the loop, to avoid repeatedly dereferencing pointers inside the loop.
- getWidth() should be the same for every batch, so it can be defined outside the outer loop.
OK, thanks.
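The hoisting being asked for can be sketched with a toy `Matrix` stand-in (not Paddle's real Matrix class): the accessors are cached in locals before the loops, and the clamp mirrors the `> 0 ? x : 0` pattern in the quoted code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using real = float;

// Toy stand-in with the same accessor names as the code under review.
struct Matrix {
  std::vector<real> buf;
  std::size_t h, w;
  real* getData() { return buf.data(); }
  std::size_t getHeight() const { return h; }
  std::size_t getWidth() const { return w; }
};

void reluInPlace(Matrix& m) {
  real* data = m.getData();                // hoisted: one call, not per element
  const std::size_t height = m.getHeight();
  const std::size_t width = m.getWidth();  // same for every batch, so hoist it
  for (std::size_t i = 0; i < height; ++i)
    for (std::size_t j = 0; j < width; ++j) {
      real& v = data[i * width + j];
      v = v > 0 ? v : 0;                   // the `> 0 ? x : 0` clamp
    }
}
```

Besides readability, this removes three method calls per element from the inner loop, which is exactly the repeated-pointer-access cost the reviewer flagged.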
for (size_t n = 0; n < batchValue_->getNumBatch(); n++) {
  MatrixPtr batch2 = batchValue_->getBatchValue(n);

  if (n != 0) {
Is this condition related to the one on line 70?
They are not related.
for (size_t i = 0; i < batch2->getHeight(); i++) {
  for (size_t j = 0; j < batch2->getWidth(); j++) {
    *(batch2->getData() + i * batch2->getWidth() + j) =
        *(batch2->getData() + i * batch2->getWidth() + j) > 0
Why is the value set directly to 0 when it is less than 0?
What is the purpose of lines 66-71?
That code has been deleted.
    dst->getWidth());
}

void compute(size_t M, real *A, size_t lda, real *C, size_t ldc) {
Is the function on line 60 ever used? If not, please delete it.
OK, thanks.
void pack() { pack_(weight_); }

void compute(MatrixPtr dst, MatrixPtr src) {
  cblas_sgemm_compute(CblasRowMajor,
The earlier design document, https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/mkl/mkl_packed.md#background, uses cblas_?gemm. Why is only cblas_sgemm used here? What about the other three data types? Same below.
We considered three data types when writing the design document, but currently MKLPacked*Layer only supports the float data type (sgemm).
On this point, the document deliberately leaves room; otherwise, if double support is added later, the document would have to be changed again.
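If the other cblas_?gemm variants are supported later, one possible shape (purely illustrative, not part of this PR) is compile-time dispatch on the element type, so the layer code stays type-generic while the kernel choice happens in one place. The functions below are stubs that only report which kernel would be selected.

```cpp
#include <cassert>
#include <string>

// Primary template: declared only; each supported element type gets a
// specialization naming the BLAS kernel it would dispatch to.
template <typename T>
std::string packedGemmKernel();

template <>
std::string packedGemmKernel<float>() { return "sgemm"; }  // current PR scope

template <>
std::string packedGemmKernel<double>() { return "dgemm"; }  // hypothetical later
```

With this pattern, adding double support would mean adding one specialization (calling cblas_dgemm_pack/compute) without touching the callers, which is the flexibility the design document keeps open.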
class MKLPackedWeight {
protected:
  real *weight_;
  real *packedWeight_;
Please add a comment on line 26.
OK, thanks.
@@ -12,6 +12,7 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "RecurrentLayer.h"
#include <gflags/gflags.h>
#include "Layer.h"
#include "SequenceToBatch.h"
Please simplify the headers: the includes on lines 16-18 are already in RecurrentLayer.h and do not need to be repeated here.
OK, thanks.
#include "MKLPackedWeight.h"
#include "RecurrentLayer.h"
#include "SequenceToBatch.h"
#include "paddle/utils/Stat.h"
Please simplify the headers: lines 17, 18, and 21 are already included by RecurrentLayer.h and do not need to be repeated.
OK, thanks.
@luotao1 Please review the new comments and code.
arg.grad = batch2;
activation_->backward(arg).check();

if (n != 0) {
This check cannot be moved out; it determines whether the loop starts from state = 0. When state = 0, only the activation is applied; otherwise both the matrix multiplication and the activation are applied.
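The control flow being defended here can be sketched with scalars standing in for batches (illustrative only; the real code multiplies matrices): batch n == 0 has no previous state, so only the activation runs; every later batch first accumulates the recurrent product, then activates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using real = float;

// x[n] stands in for batch n's pre-activation; w for the recurrent weight.
std::vector<real> forwardBatches(std::vector<real> x, real w) {
  real prev = 0;
  for (std::size_t n = 0; n < x.size(); ++n) {
    if (n != 0) x[n] += prev * w;  // recurrent multiply, skipped at n == 0
    x[n] = x[n] > 0 ? x[n] : 0;    // activation (ReLU here) always runs
    prev = x[n];
  }
  return x;
}
```

Hoisting the `n != 0` check out of the loop would either drop the multiply for all batches or apply it to batch 0, which has no predecessor; hence the check stays inside.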
if (n != 0) {
  batch1 = batchGrad_->getBatchValue(n - 1, batch2->getHeight());
  // batch1->mul(*batch2, *weightT, 1, 1);
OK, thanks.
namespace paddle {

/**
 * @brief MKLPackedRecurrentLayer takes 1 input layer. The output size is the
OK, thanks.
 * them by rnn_use_batch flag.
 */

class MKLPackedRecurrentLayer : public RecurrentLayer {
Do override functions also need comments?
protected:
  std::unique_ptr<MKLPackedWeight> packed_weight_;
  std::unique_ptr<MKLPackedWeight> packed_weightT_;
OK, thanks.
void pack() { pack_(weight_); }

void compute(MatrixPtr dst, MatrixPtr src) {
OK, thanks.
    *output_.grad->subMatrix(starts[seq], len - 1),
    1,
    1);
}
The simplified code had a functional problem, which has been fixed.
Could the simplified code make later maintenance and debugging harder?
Thanks for updating, some minor comments.
 * sequence by one sequence. The other way is to reorganize the input
 * into batches, then compute rnn one batch by one batch. Users can select
 * them by rnn_use_batch flag.
 * @brief MKLPackedRecurrentLayer is same with RecurrentLayer but is optimized
is the same with
OK, thanks.
@@ -66,7 +48,10 @@ class MKLPackedRecurrentLayer : public RecurrentLayer {
                      const int* starts) override;

protected:
  /// packed_weight_ is contains same data with
packed_weight_ contains the same data with
OK, thanks.
@@ -22,7 +22,9 @@ namespace paddle {

class MKLPackedWeight {
protected:
  /// The pointor of weight
pointer; same on line 27.
OK, thanks.
@@ -41,7 +43,7 @@ class MKLPackedWeight {

  void pack() { pack_(weight_); }

-  void compute(MatrixPtr dst, MatrixPtr src) {
+  void compute(MatrixPtr dst, const MatrixPtr src) {
- The name compute is too generic; consider renaming it to gemm_compute or something similar.
- The usual parameter order is: const MatrixPtr src, MatrixPtr dst.
OK, thanks.
@luotao1 code updated
LGTM
fix #6512