Add design doc of inference API for fluid. #7315
# Design Doc: Inferencer

In Fluid, a neural network is described by a program, which is represented as a protobuf message [ProgramDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/program.md); its Python wrapper is `Program`.

Given a `ProgramDesc`, it can be run inside any execution environment.

In Fluid, we call the execution environment `Runtime`, which includes `Place`, `Scope` and `Executor`.

## Representation of the Inference Program

In Python, an inference program is defined as:

```python
image = fluid.layers.data(name='x', shape=[784], dtype='float32')
predict = fluid.layers.fc(input=image,
                          size=10,
                          act='softmax')
```

After training for several epochs, you can save the parameters using `fluid.io.save_inference_model`, which saves the binary proto string of the program at the same time.

```python
fluid.io.save_inference_model(
    "./inference_model/", ["x"], [predict],
    exe)
```

The saved model contains everything required by the inference program, including all the operators and variables. Thus, the `inference_program` should be initialized from the model file or from a pre-loaded buffer.

Given an `inference_program`, it is easy to derive a `load_program`, which is composed of `load_op`s and is responsible for initializing all the parameter variables in the `inference_program`. The `load_program` needs to be executed only once, while the `inference_program` can be executed as many times as you need.

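To make this concrete, the following is a conceptual sketch, not Fluid's actual implementation, of how such a `load_program` could be derived: iterate over the persistable variables (i.e., the parameters) of the `inference_program` and append one load op per parameter. The op name `load`, the attribute `file_path`, and the helper name `DeriveLoadProgram` are assumptions for illustration.

```c++
#include "paddle/framework/program_desc.h"

// Hypothetical helper: derive a load_program from an inference_program.
// Each persistable variable gets one "load" op that reads its tensor
// from a file under `dirname`.
framework::ProgramDesc DeriveLoadProgram(
    const framework::ProgramDesc& inference_program,
    const std::string& dirname) {
  framework::ProgramDesc load_program;
  framework::BlockDesc* block = load_program.MutableBlock(0);
  for (auto* var : inference_program.Block(0).AllVars()) {
    if (var->Persistable()) {
      framework::OpDesc* op = block->AppendOp();
      op->SetType("load");  // assumed op name
      op->SetOutput("Out", {var->Name()});
      op->SetAttr("file_path", dirname + "/" + var->Name());
    }
  }
  return load_program;
}
```
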
To summarize, an `Inferencer` should:

- be initialized from files or from buffers
- be composed of two `ProgramDesc`s, namely the `inference_program` and the `load_program`

All the initialization should be done in the constructor.

## Support for Switching Runtime

In Fluid, the execution environment is composed of three key concepts: `Place`, `Scope` and `Executor`.

The current framework provides two types of `Place`: `CPUPlace` for CPU and `CUDAPlace` for CUDA GPU. `Scope` is independent of `Place`. Given a place, you need to define an `Executor`, and run the `Executor` within the `Scope`.

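As a minimal sketch of how these three concepts fit together, assuming Fluid's C++ API (the exact `Executor::Run` signature and the helper function name are assumptions):

```c++
#include "paddle/framework/executor.h"
#include "paddle/framework/program_desc.h"
#include "paddle/framework/scope.h"
#include "paddle/platform/place.h"

void RunProgramOnCPU(const framework::ProgramDesc& program) {
  platform::CPUPlace place;             // or platform::CUDAPlace(0) for GPU 0
  framework::Scope scope;               // holds all variables by name
  framework::Executor executor(place);  // bound to the chosen place
  // Run block 0 of the program; variables are created in `scope` on demand.
  executor.Run(program, &scope, 0 /*block_id*/);
}
```
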
In the `Inferencer`, the `Runtime` is declared as follows:

```c++
class Runtime {
  platform::Place* place;
  framework::Scope* scope;
  framework::Executor* executor;
};
```

With the definition of `Runtime`, the `Inferencer` will have the following features:

- **Switch runtime**. Different `Runtime`s can have either different types of `Place` or the same type, each with its own `Scope` and `Executor`. An `Inferencer` can run on different `Runtime`s at the same time, independently (see the sketch after this list).
- **Share parameters among different programs**. Users can run different `Inferencer`s, that is, different programs, on the same `Runtime`; parameters with the same name will be shared.
- **Share parameters among different threads**. Multiple threads can be launched to run an `Inferencer` in parallel on the same `Runtime`.

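A hypothetical usage sketch of the first and third features, written against the `Inferencer` interface defined in the next section (the function name and the runtime set-up are illustrative):

```c++
#include <thread>
#include <vector>

void SwitchAndShare(Inferencer* inferencer,
                    Runtime* cpu_runtime, Runtime* gpu_runtime,
                    const std::vector<framework::Tensor>& feeds) {
  std::vector<framework::Tensor> fetchs;

  // Switch runtime: the same Inferencer runs first on CPU, then on GPU.
  inferencer->SetRuntime(cpu_runtime);  // load_program runs once here
  inferencer->Run(feeds, fetchs);
  inferencer->SetRuntime(gpu_runtime);  // parameters are loaded again, on GPU
  inferencer->Run(feeds, fetchs);

  // Share parameters among threads: both threads reuse the parameters
  // already loaded into gpu_runtime's scope.
  std::thread t1([&] { std::vector<framework::Tensor> out; inferencer->Run(feeds, out); });
  std::thread t2([&] { std::vector<framework::Tensor> out; inferencer->Run(feeds, out); });
  t1.join();
  t2.join();
}
```
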
## Overview of the Inference API

In a simple design, users can use the core data structures, `Tensor` and `LoDTensor`, to feed the input data and fetch the output data. An `Inferencer` should support the following members and public interfaces:

- Members:
  - the pointer to the `inference_program`
  - the pointer to the `load_program`
  - vectors of strings to store the `feed_var_names` and `fetch_var_names`
  - the pointer to the current `Runtime`
- Important interfaces:
  - the constructor, to initialize the `inference_program` and `load_program`. Once initialized, they cannot be changed.
  - `Run`, to run the inference based on the current runtime.
  - `SetRuntime`, to set the current runtime. When the runtime is set, the `load_program` will be run once to load parameters from files.
- Utility interfaces:
  - `GetFeed/FetchVarNames`, to help users debug.
  - `GetFeed/FetchVarShape`, to help users verify the size of the input and output data.

```c++
class Inferencer {
 public:
  // Initialize from file
  Inferencer(const std::string& filename);
  // Initialize from buffer
  Inferencer(const char* buffer, const size_t num_bytes);

  void SetRuntime(Runtime* runtime);

  void Run(const std::vector<framework::Tensor>& feeds,
           std::vector<framework::Tensor>& fetchs);

  // Utility interfaces
  const std::vector<std::string>& GetFeedVarNames() const;
  const std::vector<std::string>& GetFetchVarNames() const;
  std::vector<int64_t> GetFeedVarShape(const size_t index);
  std::vector<int64_t> GetFetchVarShape(const size_t index);

 private:
  framework::ProgramDesc* inference_program_;
  framework::ProgramDesc* load_program_;
  std::vector<std::string> feed_var_names_;
  std::vector<std::string> fetch_var_names_;

  Runtime* runtime_;
};
```

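For illustration, here is a hedged end-to-end sketch of how a user might drive this API. The model path, the input shape, the brace-initialization of `Runtime` (which assumes its members are publicly accessible), and the use of `mutable_data` to fill a `Tensor` are all assumptions, not part of the design above.

```c++
#include <algorithm>
#include <vector>

int main() {
  // Build the Inferencer from the saved model. The inference_program and
  // load_program are parsed in the constructor and fixed afterwards.
  Inferencer inferencer("./inference_model/");

  // Set up a CPU runtime. SetRuntime triggers one run of load_program.
  platform::Place place = platform::CPUPlace();
  framework::Scope scope;
  framework::Executor executor(place);
  Runtime runtime{&place, &scope, &executor};  // assumes public members
  inferencer.SetRuntime(&runtime);

  // Prepare one input tensor matching the feed variable, e.g. shape [1, 784].
  std::vector<framework::Tensor> feeds(1);
  float* input =
      feeds[0].mutable_data<float>(framework::make_ddim({1, 784}), place);
  std::fill(input, input + 784, 0.0f);

  // Run inference and read the predictions back from `fetchs`.
  std::vector<framework::Tensor> fetchs;
  inferencer.Run(feeds, fetchs);
  return 0;
}
```
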
### Issues

- Normally, all the fetch variables' names should be written in the `ProgramDesc` and read from the file. If users want to add some extra fetch variables, for debugging or for some other use, they need to regenerate the file. Do we need to allow users to append extra fetch variables?
- How to support multiple devices?