[fleet_executor] Parse pipeline config #37319

FeixLiu · 2021-11-18T03:28:43Z

PR types

Others

PR changes

Others

Describe

Parse the pipeline config.

paddle-bot-old · 2021-11-18T03:28:47Z

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle/fluid/distributed/fleet_executor/runtime_graph.cc

python/paddle/fluid/executor.py

wangxicoding · 2021-11-18T11:26:57Z

paddle/fluid/distributed/fleet_executor/fleet_executor_desc.proto

@@ -27,4 +27,6 @@ message FleetExecutorDesc {
  optional int32 dp_degree = 4 [ default = 1 ];
  optional int32 mp_degree = 5 [ default = 1 ];
  optional int32 pp_degree = 6 [ default = 1 ];
+  optional int64 num_micro_batches = 7 [ default = 1 ];


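Isn't this really num_micro_steps?

On the Python side this is global batch size / micro batch size, so shall we call it num_micro_batches? It is the total number of micro batches, which is effectively the same thing as num_micro_steps.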
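
The relationship described in the reply can be sketched in a few lines of Python. This is only an illustration and not code from this PR: the helper name `compute_num_micro_batches` and the divisibility check are assumptions; only the division itself comes from the discussion.

```python
def compute_num_micro_batches(global_batch_size: int, micro_batch_size: int) -> int:
    """Return how many micro batches one global batch is split into.

    Hypothetical helper: the name and the validation are assumptions,
    only the division comes from the review discussion.
    """
    if global_batch_size <= 0 or micro_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    if global_batch_size % micro_batch_size != 0:
        # Assumed constraint: the global batch splits evenly into micro batches.
        raise ValueError("global batch size must be divisible by micro batch size")
    return global_batch_size // micro_batch_size


if __name__ == "__main__":
    # A global batch of 32 samples with micro batches of 4 samples.
    print(compute_num_micro_batches(32, 4))
```

A value computed this way would be what ends up in the `num_micro_batches` field, whose default of 1 corresponds to no micro-batching.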

wangxicoding · 2021-11-18T11:27:06Z

paddle/fluid/distributed/fleet_executor/fleet_executor_desc.proto

@@ -27,4 +27,6 @@ message FleetExecutorDesc {
  optional int32 dp_degree = 4 [ default = 1 ];
  optional int32 mp_degree = 5 [ default = 1 ];
  optional int32 pp_degree = 6 [ default = 1 ];
+  optional int64 num_micro_batches = 7 [ default = 1 ];
+  optional int64 num_slots = 8 [ default = 1 ];


What is this?

It is the maximum number of steps that can run at one time.
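
One possible reading of that answer is that `num_slots` caps how many micro-batch steps may be in flight at once. The sketch below illustrates that reading only; the function `schedule_in_waves` and the wave-by-wave grouping are assumptions and do not describe how the fleet executor actually schedules work.

```python
from typing import List


def schedule_in_waves(num_micro_batches: int, num_slots: int) -> List[List[int]]:
    """Group micro-batch step ids so that no group exceeds num_slots.

    Hypothetical illustration: the real scheduling policy is not shown
    in this PR; this only demonstrates "at most num_slots at a time".
    """
    if num_micro_batches < 0:
        raise ValueError("num_micro_batches must be non-negative")
    if num_slots <= 0:
        raise ValueError("num_slots must be positive")
    steps = list(range(num_micro_batches))
    # Slice the step ids into consecutive chunks of at most num_slots.
    return [steps[i:i + num_slots] for i in range(0, num_micro_batches, num_slots)]


if __name__ == "__main__":
    # Five micro batches with two slots: steps run at most two at a time.
    print(schedule_in_waves(5, 2))
```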

wangxicoding · 2021-11-18T11:30:10Z

paddle/fluid/distributed/fleet_executor/interceptor.h

@@ -96,6 +96,9 @@ class Interceptor {
  // local mailbox, written by FetchRemoteMailbox()
  // read by PoolTheMailbox()
  std::queue<InterceptorMessage> local_mailbox_;
+
+  int64_t already_run_times_{0};


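I suggest adding this to the compute_interceptor that comes later.

This is here for the fake run, so let's keep it. Later subclasses can simply not use it, right?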
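
The trade-off in this exchange (a counter on the base class that subclasses are free to ignore) can be sketched as follows. This is a simplified Python stand-in for the C++ classes, not the PR's implementation: `max_run_times`, `fake_run`, and `run` are hypothetical names; only `already_run_times` mirrors the member added in the diff.

```python
class Interceptor:
    """Simplified stand-in for the C++ Interceptor base class."""

    def __init__(self, max_run_times: int) -> None:
        self.max_run_times = max_run_times  # hypothetical parameter
        self.already_run_times = 0  # mirrors already_run_times_{0} in the diff

    def fake_run(self) -> bool:
        """Count one run without doing real work; True once the limit is reached."""
        self.already_run_times += 1
        return self.already_run_times >= self.max_run_times


class ComputeInterceptor(Interceptor):
    """Hypothetical subclass that does real work and never touches the counter."""

    def run(self, value: int) -> int:
        # Placeholder for real computation.
        return value * 2


if __name__ == "__main__":
    base = Interceptor(max_run_times=3)
    print([base.fake_run() for _ in range(3)])

    compute = ComputeInterceptor(max_run_times=3)
    print(compute.run(2), compute.already_run_times)
```

Keeping the counter on the base class costs each subclass one unused integer, which is the price of letting the fake run live in the base class.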

wangxicoding · 2021-11-18T11:30:30Z

paddle/fluid/distributed/fleet_executor/runtime_graph.cc

    if (role_to_ops.find(role_id) == role_to_ops.end()) {
-      task_nodes_.emplace_back(
-          TaskNode::CreateEmptyTaskNode(role_id, cur_rank, task_id));
+      task_nodes_.emplace_back(TaskNode::CreateEmptyTaskNode(


We may need a ComputeTaskNode later on.

I don't follow. Why would we need a separate new task node?

wangxicoding

LGTM

FeixLiu added 2 commits November 18, 2021 11:27

pass pipeline config

a046fde

update vlog

9a550f8

update

b35fe4d

FeixLiu marked this pull request as draft November 18, 2021 03:57

FeixLiu marked this pull request as ready for review November 18, 2021 03:57

modify

e2ea46e

PaddlePaddle locked and limited conversation to collaborators Nov 18, 2021

PaddlePaddle unlocked this conversation Nov 18, 2021

FeixLiu requested a review from wangxicoding November 18, 2021 07:45

LiYuRio reviewed Nov 18, 2021

paddle/fluid/distributed/fleet_executor/runtime_graph.cc (outdated, resolved)

LiYuRio reviewed Nov 18, 2021

python/paddle/fluid/executor.py (resolved)

resolve comments

15a093f

wangxicoding reviewed Nov 18, 2021

wangxicoding approved these changes Nov 19, 2021

wangxicoding merged commit ca088f9 into PaddlePaddle:develop Nov 19, 2021

FeixLiu deleted the parse_pipeline_config branch November 19, 2021 04:08
