Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialization design doc #4852

Closed

Conversation

tonyyang-svail
Copy link

@tonyyang-svail tonyyang-svail commented Oct 17, 2017

Conclusion: two separate programDesc.

@helinwang
Copy link
Contributor

helinwang commented Oct 17, 2017

  • Option 1 (two separate ProgramDesc): we need to somehow separate the ProgramDesc into two ProgramDescs, the separation process is already similar to prune, maybe we can just do prune (option 3).
  • Option 2 (run once): In my opinion we should not do this, it's possible that the user want to initialize the network again during training. Besides, should we run the OP that depends on the init OP, but not necessary to run?
  • Option 3 (Remove Init operator during Prune):
    1. we should not remove init OP if the targets contains the init OP.
    2. In my opinion we need to analyze the dependency correctly to resolve what needs to be run (this implies our dependency graph should not have cycles, but currently it does), not just do regular expression.
      If we just do regular expression, should we run the OP that depends on the init OP, but not necessary to run?

@wangkuiyi
Copy link
Collaborator

wangkuiyi commented Oct 17, 2017

From https://travis-ci.org/PaddlePaddle/Paddle/jobs/288826972#L993 , it seems that the updated Markdown file ends with none or more than one empty lines. This has to be fixed; otherwise, pre-commit will fail.

@wangkuiyi
Copy link
Collaborator

I vote for Proposal 1. -- make it run once.

@tonyyang-svail
Copy link
Author

tonyyang-svail commented Oct 17, 2017

Update: the current implementation is based on proposal 1.


## Proposals

### Proposal 1: Two seperate `ProgramDesc`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems simple and workable.

The resource which survived across multiple block/program is the Variables in Scopes.

A program is the container of multiple Blocks, more than one program seems not a problem.

@jacquesqiao
Copy link
Member

if we have a conclusion that we will use multiple programs, one is for initializing, should we update this design?

## Proposals

### Proposal 1: Two seperate `ProgramDesc`.
## Solution: Two seperate `ProgramDesc`.
Copy link
Contributor

@helinwang helinwang Oct 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who will output the two separate ProgramDescs?

I my mind:

Python ---(entire ProgramDesc) ---> Pruner ---(ProgramDesc1, ProgramDesc2)---> Executor.

We need to make sure Pruner have the global view. For example, in cluster training, the Pruner need to have the global view in order to do things like deciding parameter placement.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python generates two separate ProgramDesc, one for initialization, and one for training. Both ProgramDescs contain all the parameter description. Do you think it is enough to decide parameter placement? If not, we can always send both ProgramDescs to Pruner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Python can do the separation, pruner should be able to do it as well. Is it possible to let pruner do the job?

Parameter placement is just an example supporting the argument that the pruner needs to have the global view.

## Proposals

### Proposal 1: Two seperate `ProgramDesc`.
## Solution: Two seperate `ProgramDesc`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seperate -> separate

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants