add flattern weight of lstm #27192

Merged: 12 commits, Oct 12, 2020

@GaoWei8 GaoWei8 commented Sep 8, 2020

PR types

New features

PR changes

OPs

Describe

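  • Change the single large weight block of the cuDNN LSTM into a weight list input. If the Python side uses adjacent (contiguous) memory, the C++ side calls cuDNN directly with only the first pointer and the total size; otherwise the weights have to be copied into one large block of memory on the C++ side.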

  • In test mode, if W is provided and already initialized, W is used first; otherwise WeightList is used, and the converted parameters are saved in W.
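
The contiguity strategy described above can be sketched in plain C++. This is a minimal illustration, not the PR's implementation: `WeightView`, `IsContiguous`, `TotalNumel`, and `FlattenWeights` are hypothetical names, and raw `float` pointers on the host stand in for Paddle tensors on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a tensor: a raw pointer plus an element count.
struct WeightView {
  float* data;
  std::size_t numel;
};

// True when every view starts exactly where the previous one ends.
bool IsContiguous(const std::vector<WeightView>& weights) {
  for (std::size_t i = 0; i + 1 < weights.size(); ++i) {
    if (weights[i].data + weights[i].numel != weights[i + 1].data) {
      return false;
    }
  }
  return true;
}

// Total number of elements across all views.
std::size_t TotalNumel(const std::vector<WeightView>& weights) {
  std::size_t total = 0;
  for (const WeightView& w : weights) total += w.numel;
  return total;
}

// Returns a pointer to one flat block holding all weights in list order.
// The weights are copied into `scratch` only when the list is not contiguous.
const float* FlattenWeights(const std::vector<WeightView>& weights,
                            std::vector<float>* scratch) {
  assert(scratch != nullptr);
  if (weights.empty()) return nullptr;
  if (IsContiguous(weights)) return weights.front().data;  // no copy needed
  scratch->resize(TotalNumel(weights));
  std::size_t offset = 0;
  for (const WeightView& w : weights) {
    std::memcpy(scratch->data() + offset, w.data, w.numel * sizeof(float));
    offset += w.numel;
  }
  return scratch->data();
}
```

In the contiguous case the caller gets the first pointer back unchanged, which mirrors the "first pointer and size" call path; the copy branch is the fallback for scattered weights.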

Items requiring approval

  1. You must have one RD (cyj1986, Superjomn) approval for the changes of Inputs/Output/Attrs of OPs. The changes of OPs will cause that the new version inference fails to load model trained by the old version. Please modify your code.

  2. You must have one RD (zhiqiu (Recommend) or phlrain) approval for the api change for the operator-related api without 'core.ops'.
    The fluid.layers.lstm interface will be deprecated next; the new interface defined in PR27217 will be used instead.

  3. You must have one RD (XiaoguangHu01,Xreki,luotao1) approval for the usage (either add or delete) of const_cast.

  4. Using ShareDataWith or ShareBufferWith is not recommended. You must have one RD's (zhhsplendid (Recommend), zhiqiu or luotao1 or lanxianghit) approval to use these methods. For more information, please refer to https://github.com/PaddlePaddle/Paddle/wiki/ShareDataWith-is-prohibited-in-OP. The error lines are as follows:

  5. It is an Op accuracy problem, please take care of it. You must have one RD (zhangting2020 (Recommend), luotao1 or phlrain) approval for the usage (either add or delete) of @skip_check_grad_ci. For more information, please refer to: https://github.com/PaddlePaddle/Paddle/wiki/Gradient-Check-Is-Required-for-Op-Test.

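Justification for the interface compatibility issues

  • The added Reserve and StateOut outputs are there to support using the cudnn lstm C++ kernel in dynamic graph mode.

  • The bidirectional mode of the original lstm interface produces results with incorrect dimensions.

  • The multi-layer results of the original lstm interface are problematic. The original interface has always taken padded data as input, but the cuDNN API it calls is the one that handles unpadded data. The original API can still be called to run the computation, but both its results and its accuracy are problematic. Our own model zoo currently contains only one model that calls the original API, and it is a multi-layer computation; it should be changed to use the new interface.

  • External users cannot be relying on the API, because it computes incorrect results. Two issues raised by external users are attached: "lstm error" #24300 and "The is_bidirec parameter of the fluid.layers.lstm interface does not enable bidirectional behavior" #22979.

  • The lstm op is therefore planned to undergo major changes in 2.0. The old API will no longer be recommended; the newly added API should be used instead.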

paddle-bot-old bot commented Sep 8, 2020

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

paddle/fluid/operators/cudnn_lstm_op.cu.cc
@@ -271,6 +363,8 @@ class CudnnLSTMGPUGradKernel : public framework::OpKernel<T> {
"of cudnn is larger than 7.2.1"));
#endif
}
weight_to_tensor_list<T>(place, stream, &weight_grad_list, weight_list,

Contributor

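Is there a better way to handle this? It seems the grad only needs to be copied when the weights are not contiguous; if the weights are contiguous, the grads should ideally be contiguous as well, so ShareDataWith may still be the better option.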

Contributor Author

Now set to the same strategy as the weights: copy only when they are not contiguous.

guoshengCS commented Sep 16, 2020

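Do we need to use WeightList and W together? There are two considerations here:

  1. Compatibility: removing W outright could make previously saved C++ inference models unusable.

  2. Inference performance: in C++ inference, users cannot call flatten_parameters as they can in Python to make the parameters in WeightList contiguous in memory, so the op would copy them on every run. If W is kept and holds the converted parameters, then in test mode the copy happens only when W is uninitialized, which avoids copying every time.

In test mode, if W is provided and already initialized, W is used first; otherwise WeightList is used, and the converted parameters are saved in W.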
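
The test-mode rule above amounts to a copy-once cache. The following is a minimal sketch under stated assumptions: `FlatWeight` and `GetInferenceWeights` are hypothetical names, host-side `std::vector<float>` replaces GPU tensors, and `copy_count` exists only to make the copy observable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the flat W tensor plus its "initialized" state.
struct FlatWeight {
  std::vector<float> data;
  bool initialized = false;
};

// Returns the flat weights to use for inference.
// If W is already initialized it is used as-is; otherwise the weight list is
// concatenated into W once and W is reused on later calls.
const std::vector<float>& GetInferenceWeights(
    const std::vector<std::vector<float>>& weight_list, FlatWeight* w,
    int* copy_count) {
  assert(w != nullptr && copy_count != nullptr);
  if (!w->initialized) {
    w->data.clear();
    for (const std::vector<float>& part : weight_list) {
      w->data.insert(w->data.end(), part.begin(), part.end());
    }
    w->initialized = true;
    ++*copy_count;  // counts how many times the conversion actually ran
  }
  return w->data;
}
```

Calling the function repeatedly with the same `FlatWeight` performs the concatenation only on the first call, which is the behavior the comment asks for in C++ inference.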

}

bool grad_continuous =
is_continuous<T, std::vector<Tensor *>>(weight_grad_list);

Contributor

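Because each weight grad in weight_grad_list is allocated by weight_grad_list[i]->mutable_data<T>(place), they will very likely be non-contiguous here, so the gradients may still be copied every time. Could we use one large weight grad directly and have each small weight grad ShareDataWith it?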

Contributor Author

@GaoWei8 GaoWei8 Sep 25, 2020

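I tried that. Since the input is a weight list, the C++ side can only compute the grads in weight-list form as well. On the Python side, however, it should be possible to allocate the grads in contiguous memory by some means.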

Contributor Author

Changed so that the input weight list shares memory with the large W.
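
The shared-memory idea discussed in this thread can be illustrated as one flat buffer with per-weight views carved out of it. This is only a sketch of the general technique, assuming hypothetical names (`GradView`, `CarveViews`) and host memory; it does not use Paddle's ShareDataWith and is not the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a small tensor that aliases part of a big buffer.
struct GradView {
  float* data;
  std::size_t numel;
};

// Sizes `flat` to hold all gradients (zero-filled) and returns one view per
// requested size. The views are laid out back to back, so the whole list is
// contiguous by construction and never needs to be copied into a flat block.
std::vector<GradView> CarveViews(std::vector<float>* flat,
                                 const std::vector<std::size_t>& sizes) {
  assert(flat != nullptr);
  std::size_t total = 0;
  for (std::size_t n : sizes) total += n;
  flat->assign(total, 0.0f);
  std::vector<GradView> views;
  views.reserve(sizes.size());
  std::size_t offset = 0;
  for (std::size_t n : sizes) {
    views.push_back({flat->data() + offset, n});
    offset += n;
  }
  return views;
}
```

Writes through any view land directly in the flat buffer, so code that consumes the large block sees the per-weight values without a gather step. The views stay valid only while `flat` is not resized.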

auto weight_list = ctx.MultiInput<framework::Tensor>("WeightList");
W->mutable_data<T>({weight_numel}, place);
weight_to_tensor<T>(place, stream, weight_list, W);
}

Contributor

Will this copy every time when is_test==true? Could the copy be done only when W has not been initialized?

Contributor Author

For Python-side inference, W is initialized, so W is used and no data is copied.
For C++ inference, W is not initialized, so weight_list is used, but weight_list is copied into W.

Contributor

@luotao1 luotao1 left a comment

LGTM

Contributor

@Superjomn Superjomn left a comment

LGTM

@zhiqiu zhiqiu self-requested a review October 12, 2020 05:17
@GaoWei8 GaoWei8 merged commit 36bb056 into PaddlePaddle:develop Oct 12, 2020
chen-zhiyu pushed a commit to chen-zhiyu/Paddle that referenced this pull request Oct 15, 2020
@GaoWei8 GaoWei8 deleted the flattern_weight branch January 7, 2021 09:27