
Design doc: operator based parameter server. #3747

Merged: 6 commits into PaddlePaddle:develop, Sep 9, 2017

Conversation

@helinwang (Contributor) commented Aug 29, 2017:

It may be easier to review here.


## Abstract

We propose an approach to implment the parameter server. In this
A reviewer (Contributor) commented:

implement

@helinwang (Contributor, Author) replied:

Fixed, thanks!

Below is an example of converting the user defined graph to the
sub-graphs for the trainer and the parameter server:

<img src="src/local-graph.png" width="300"/>
@jacquesqiao (Member) commented Aug 29, 2017:

W is also an input of the parameter update op.

@putcn (Contributor) commented Aug 30, 2017:

If I understand correctly, the SEND op only sends the graph and the RECEIVE op only receives gradients. Any given worker sees only part of the whole graph, so how would the training data travel through the whole graph? Would some part of the graph sit idle until data reaches its parent?


1. The parameter variable W and its optimizer subgraph are placed on the parameter server.
1. Operators are added to the sub-graphs.
- *send* operator sends data and sender's address to the destination.
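
As a rough illustration of the *send* operator's contract in the bullet above (the payload carries both the data and the sender's address, so the destination knows where to reply), here is a minimal sketch. The function name, the pickle-over-socket transport, and the example addresses are all assumptions made for illustration; none of this is PaddlePaddle code.

```python
# Hypothetical sketch, not PaddlePaddle code: the message carries both the
# tensor payload and the sender's listen address, so the receiving side
# knows where to push results (e.g. an updated parameter) back.
import pickle
import socket


def send(tensor_bytes, dest_addr, listen_addr):
    """Send `tensor_bytes` to `dest_addr`, tagged with our own address."""
    message = pickle.dumps({"from": listen_addr, "data": tensor_bytes})
    with socket.create_connection(dest_addr) as conn:
        conn.sendall(message)


# Usage (hypothetical): a trainer pushes a serialized gradient to a
# parameter server and tells it where to send the updated parameter.
# send(grad_bytes, dest_addr=("ps0", 8000), listen_addr=("trainer0", 8001))
```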
@typhoonzero (Contributor) commented:

Question: if there are multiple parameters (or variables) to send to the parameter server, do we:

  1. create multiple Send operators, one for each variable, or
  2. create one Hash operator to divide the parameters equally and one Send operator to do the sending?

The same reviewer added:

Also, maybe add some descriptions of the send and recv operators, for example:

  • Send:
    • Inputs:
    • Outputs:
    • Description:

@Yancey1989 (Contributor) commented:

I have the same confusion as @typhoonzero; maybe we need an operator to shard the parameters.

On the first option from @typhoonzero ("create multiple Send operators, one for each variable"): maybe we only need one Send operator. If we have too many parameters, too many Send operators will cause too many connections to the parameter server.

@helinwang (Author) replied Aug 30, 2017:

@typhoonzero @Yancey1989 Sorry, the PR could be clearer:
In short, the answer is "create multiple Send operators, one for each variable".

From the graph's perspective, there is one Send and one Recv OP per variable (but not one per replica: different trainer replicas share the same Send and Recv for each variable). As an implementation detail, we could group the send implementations behind a single port handler.

In this design, variable placement is done by the graph converter before the graph is sent to the workers, so it is not a runtime concept like a Hash operator. I think the "Hash" solution only covers the simplest element-wise optimization case. If we want the parameter server to do more than element-wise operations, we need to decide the placement of the parameter variables and OPs before the graph is sent to the workers.
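
To make the placement argument above concrete, here is a minimal, self-contained sketch of a converter that keeps each parameter's optimizer op on the parameter server and inserts exactly one send/recv pair per parameter variable. The data structures, the `split_graph` name, and the `@GRAD` suffix are assumptions made for illustration, not the actual converter.

```python
# Hypothetical sketch, not PaddlePaddle code: variable placement is decided
# here, before the sub-graphs are dispatched, and each parameter variable
# gets exactly one send/recv pair that all trainer replicas share.

def split_graph(ops, params):
    """ops: list of (op_type, var_name) tuples from the user-defined graph.
    params: set of parameter variable names, e.g. {"W"}."""
    trainer_ops, ps_ops = [], []
    for op_type, var in ops:
        # The optimizer op that updates a parameter stays on the server.
        if op_type == "sgd" and var in params:
            ps_ops.append((op_type, var))
        else:
            trainer_ops.append((op_type, var))

    for w in params:
        # The trainer pushes the gradient; the server applies the update
        # and pushes the fresh parameter value back.
        trainer_ops.append(("send", w + "@GRAD"))
        ps_ops.append(("recv", w + "@GRAD"))
        ps_ops.append(("send", w))
        trainer_ops.append(("recv", w))
    return trainer_ops, ps_ops


# Usage: a tiny graph with a single parameter W.
trainer, ps = split_graph(
    ops=[("mul", "W"), ("mul_grad", "W@GRAD"), ("sgd", "W")],
    params={"W"},
)
print(trainer)  # forward/backward ops plus W's send/recv
print(ps)       # W's sgd update plus its recv/send
```

Grouping the per-variable sends behind a single port handler, as suggested above, would then be an implementation detail inside the send OP rather than a change to the converted graph.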

@typhoonzero (Contributor) replied:

Understood. You mean that people who develop graphs do not need to look into the implementation details of how we actually send and recv variables; the graph just describes how the computation flows logically. But when we build and optimize the graph, we can make the actual send operation one per trainer.

Will you add some implementation thoughts in this PR or in another one?

@helinwang (Author) replied Aug 31, 2017:

@typhoonzero Thanks for the reminder about the implementation thoughts! That's a good idea. I will probably not cover implementation details in this PR, but will create a separate issue to discuss them. After receiving your comments, I have some points to re-think; I will update this PR and create the implementation-detail issue at that time.

@helinwang (Author) commented Aug 30, 2017:

@putcn Sorry, my PR could be clearer: the Send OP is for sending any tensor (not the graph), and the Recv OP is for receiving any tensor. This is how data (tensors) travels through the graph.
Yes, the Recv OP is blocked until it receives the data, so the OPs that depend on Recv will be idle until then.
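
As a toy illustration of the blocking behaviour described above, the sketch below uses an in-process queue to stand in for the network transport; none of the names or mechanics here are the actual operator implementation.

```python
# Hypothetical sketch: Recv blocks until a tensor arrives, so any op that
# depends on its output cannot run until then.
import queue
import threading
import time

channel = queue.Queue()  # stands in for the network transport


def send_op(tensor, delay):
    time.sleep(delay)       # simulate network latency
    channel.put(tensor)


def recv_op():
    return channel.get()    # blocks until a Send delivers the tensor


threading.Thread(target=send_op, args=([1.0, 2.0], 0.5)).start()
print("recv got:", recv_op())  # downstream OPs would sit idle for ~0.5 s
```

The real transport is a network channel rather than an in-process queue, but the dependency behaviour downstream of Recv is the same: nothing that consumes its output can run until the tensor arrives.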

@helinwang (Author) commented:

From @Superjom: after the sub-graphs are stitched back together, the result should still be a usable graph.

@typhoonzero (Contributor) previously approved these changes on Aug 31, 2017, leaving a comment:

LGTM!

@dzhwinter (Contributor) commented:

According to the current design, there are more concepts that need to be clarified in this design doc:
1. How should the scope be implemented in a distributed environment? Every sub-graph must run with a scope to get variable values from it. How do we partition the global scope? There is some related discussion here.
2. The distributed graph needs to be compatible with the Block design.

@dzhwinter (Contributor) left a review comment:

LGTM

@helinwang merged commit 99e3d1e into PaddlePaddle:develop on Sep 9, 2017.
@helinwang deleted the dist_op branch on September 9, 2017 at 00:43.