[Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv #46116

haohongxiang · 2022-09-16T07:03:03Z

PR types

Bug fixes

PR changes

Others

Describe

[Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv

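Done:
1. The backward op of c_identity_op, c_allreduce_sum, is replaced by AllReduce under ProcessGroupStream, which improves performance by about 5%.
2. Added the allgather_partial operator under ProcessGroupStream.
3. Under the PP strategy, the send/recv/allgather_partial_on_calc_stream operators replace the APIs under collective, avoiding the performance loss caused by stream switching. This improves performance by about 1%.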
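To make the stream-switching cost in item 3 concrete, here is a back-of-the-envelope sketch. It is a toy cost model, not Paddle code and not a measurement: the `Costs` class, both function names, and every constant are hypothetical placeholders. It assumes each transfer depends on the kernel before it, as in pipeline-parallel point-to-point communication.

```python
# Toy cost model of stream switching. All constants are arbitrary
# placeholders chosen for illustration, not measurements from this PR.
from dataclasses import dataclass


@dataclass
class Costs:
    compute: float = 10.0  # one compute kernel on the calculation stream
    comm: float = 4.0      # one point-to-point transfer
    sync: float = 1.0      # one cross-stream event record + wait


def time_with_comm_stream(num_transfers: int, c: Costs) -> float:
    """Transfers run on a separate communication stream.

    Each transfer is bracketed by two cross-stream syncs: the comm stream
    waits for the producer kernel, then the calc stream waits for the
    transfer before the consumer kernel may start.
    """
    return num_transfers * (c.compute + c.sync + c.comm + c.sync)


def time_on_calc_stream(num_transfers: int, c: Costs) -> float:
    """Transfers are enqueued on the calculation stream itself, so stream
    order already guarantees the dependency and no extra sync is needed."""
    return num_transfers * (c.compute + c.comm)


costs = Costs()
separate = time_with_comm_stream(8, costs)
inline = time_on_calc_stream(8, costs)
print(f"separate stream: {separate}, calc stream: {inline}")
```

In this model the only difference between the two paths is the per-transfer sync term, so the saving grows with the number of micro-batch transfers. The real gain depends on hardware and workload.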

Accuracy alignment between the new and old dygraph modes:

paddle-bot · 2022-09-16T07:03:08Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

LiYuRio · 2022-09-16T07:14:07Z

python/paddle/distributed/collective.py

        if use_calc_stream:
-            task.wait()
-            return None
+            return group.process_group.recv_on_calc_stream(tensor, src)


The use_calc_stream semantics cannot be used in this interface. This parameter is going to be renamed to sync_op, so the semantics would then be wrong.
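One way to read this concern, as a minimal sketch: a flag named sync_op should only decide whether the call blocks, while running on the calculation stream is a separate choice with its own entry point. `FakeTask` and both functions below are hypothetical stand-ins written for illustration. They do not call Paddle and are not its actual API.

```python
class FakeTask:
    """Stand-in for the handle returned by an asynchronous receive."""

    def __init__(self):
        self.waited = False

    def wait(self):
        self.waited = True


def recv(tensor, src, sync_op=True):
    """Hypothetical public API: sync_op only controls blocking."""
    task = FakeTask()  # pretend an async receive was launched
    if sync_op:
        task.wait()    # block until the receive completes
        return None
    return task        # caller is responsible for calling wait()


def recv_on_calc_stream(tensor, src):
    """Hypothetical separate entry point: the receive is ordered by the
    calculation stream itself, so there is no task to hand back."""
    return None


pending = recv(tensor=None, src=0, sync_op=False)
print(pending.waited)  # the caller has not waited yet
```

Keeping the two choices apart means a later rename of the flag does not silently change which stream the operation runs on.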

* [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)
* [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)
* update

fix perf of pp

d4b6d1a

LiYuRio reviewed Sep 16, 2022


haohongxiang added 3 commits September 21, 2022 09:30

Merge branch 'develop' into fix_perf_of_pp

c2ce98c

update

3ba33e2

update

34566e4

LiYuRio previously approved these changes Sep 22, 2022


FeixLiu and others added 9 commits September 22, 2022 15:50

sync recv for 1f1b

edf5c2f

update

1398183

for general pp

98229c6

add assertion

194b8ba

Merge commit 'refs/pull/46399/head' of https://github.com/PaddlePaddle/Paddle into develop

54f1d1b

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop

cc52bea

Merge branch 'develop' into fix_perf_of_pp

0b5de91

Merge branch 'develop' into fix_perf_of_pp

e94606e

update

17a9161

haohongxiang dismissed LiYuRio’s stale review via 17a9161 September 28, 2022 09:37

haohongxiang added 7 commits September 29, 2022 08:40

update

a61850e

update c_identity

06b4e10

update

57f001a

update

8404c72

update

0e41620

update

719652b

Merge branch 'develop' into fix_perf_of_pp

f12419e

FeixLiu approved these changes Oct 8, 2022


FeixLiu merged commit 8c0529f into PaddlePaddle:develop Oct 8, 2022

haohongxiang added a commit to haohongxiang/Paddle that referenced this pull request Oct 17, 2022

[Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (PaddlePaddle#46116)

c59ee6d

haohongxiang mentioned this pull request Oct 18, 2022

[cherry-pick] Fix perf issues of mp/pp/fuse in eager mode #47071

Merged
