
Implementation documentation of the dynamic RNN #7135

Closed
wants to merge 8 commits

Conversation

reyoung (Collaborator) commented Jan 2, 2018

No description provided.


## A glance of Dynamic RNN

A common neural network structure called recurrent neural network(`RNN` for short), which there is a directed circle in the neural network model. RNN can use a internal memory to process arbitrary sequences of inputs.
Contributor:

a internal memory --> an internal memory
process arbitrary sequences of inputs is not clear enough. Maybe you mean:
RNN can use an internal memory to process sequences with variable lengths.


PaddlePaddle Fluid directly represents the `directed circle` in the `ProgramDesc`, since we do not use directed acyclic graph to represent our model. The `ProgramDesc` just like the AST of a programming language, which describes the computation instructions for training a neural network. We use arrays and a while loop to describe the training process of an RNN. The C++ code below demonstrates the forward logic of RNN which PaddlePaddle Fluid generates in `ProgramDesc`.
Contributor:

training process --> training/inference process
The C++ code below demonstrates the forward logic of RNN which PaddlePaddle Fluid generates in ProgramDesc --> The C++ code below demonstrates the forward logic of RNN generated in ProgramDesc of PaddlePaddle Fluid.

Collaborator Author:

Done
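
The C++ snippet referenced above is not shown in this excerpt. A minimal sketch of the loop it describes, using hypothetical helper names (`SplitByTimestep`, `StepNet`, `InitialState`, `MergeTimesteps`), might look like this:

```cpp
// Sketch only: the real ProgramDesc is a list of operator descriptions, but its
// control flow corresponds to a C++ loop of roughly this shape.
std::vector<Tensor> xs = SplitByTimestep(input);  // one tensor per timestep
std::vector<Tensor> outputs;                      // one output per step, appended
Tensor memory = InitialState();                   // the internal memory of the RNN
size_t i = 0;                                     // loop counter (a control-flow variable)
while (i < xs.size()) {                           // the condition of the WhileOp
  Tensor out = StepNet(xs[i], memory);            // operators inside the sub-block
  memory = out;                                   // update the recurrent memory
  outputs.push_back(out);                         // keep every step for the backward pass
  ++i;                                            // increment operator, discussed later
}
Tensor result = MergeTimesteps(outputs);          // merge per-step outputs into one tensor
```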


1. Control flow operators
1. Data manipulation operators of RNN.
2. Backward of RNN.
Contributor:

1. Data manipulation operators of RNN.
2. Backward of RNN.

-->

2. Data manipulation operators of RNN
3. Backward of RNN


### WhileOp

The primary control flow operator to implement dynamic RNN is `WhileOp`. The `WhileOp` takes a sub-block. The operators in the sub-block will be executed again and again while the condition is true.
Contributor:

The WhileOp takes a sub-block. --> The WhileOp holds a sub-block.

Collaborator Author:

Done

The while operator has two kinds of inputs. They are

* Condition: A bool scalar. When it's False, the While Op will be terminated. Note that this scalar should always be in CPU memory.
* The condition variable is in the external block. However, it should be updated inside the sub-block of while op unless it is an endless loop. The condition variable will be an output variable of the while operator, too.
Contributor:

in the external block you mean in the parent block ?
unless it is an endless loop. --> otherwise it would result to an endless loop.

Collaborator Author:

Done

* X: The external inputs variables, which are required by operators inside the block of While Op.
* For example, if there is a hidden fully-connected layer in while operator. The input of the fully-connected layer is calculated by another operator inside the while operator. The input of this fully-connected layer is not the `external` inputs of the while operator. However, weight tensors of this fully-connected layer are external outputs of the while operator.
Contributor:

The input of the fully-connected layer is calculated by another operator inside the while operator. --> The input of the fully-connected layer is output of another operator inside the while operator.

Collaborator Author:

Done
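
A rough sketch of the two kinds of inputs described above, again with hypothetical helper names (`FillConstant`, `LessThan`, `Parameter`, `FC`, `Increment`, `ReadCurrentStep`, `sequence_length` are placeholders):

```cpp
// Sketch only: the condition is a bool scalar created in the parent block and
// kept in CPU memory; it must be refreshed inside the sub-block, otherwise the
// loop never terminates. The FC weight W is an `external` input of the While Op.
Tensor i = FillConstant({1}, 0);             // loop counter (CPU)
Tensor cond = LessThan(i, sequence_length);  // condition, computed in the parent block
Tensor W = Parameter("fc.w");                // hypothetical FC weight from the parent block

while (GetBool(cond)) {
  Tensor x_t = ReadCurrentStep(i);           // produced by another operator inside the loop
  Tensor h = FC(x_t, W);                     // the FC input is not external, but W is
  i = Increment(i);                          // update the counter inside the sub-block
  cond = LessThan(i, sequence_length);       // update the condition (also an output of While Op)
}
```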

@guoshengCS (Contributor):

Something bothers me when I am reading DynamicRNN. I guess one advantage of using a with block is being able to use outer variables. However, the inserted operators like lod_tensor_to_array in DynamicRNN reorder the samples, while the variables outside the with block mostly keep the raw sample order. Although we can reorder the outside variables explicitly with reorder_lod_tensor_by_rank in the DynamicRNN block, from this point of view we can hardly use any outer variable directly in the DynamicRNN block, which seems to deviate from the purpose of the with block. I am not sure whether there is a way to reorder implicitly in DynamicRNN so we can use outer variables directly, or whether the variables to be used should be indicated when initializing DynamicRNN.

@reyoung (Collaborator, Author) commented Jan 2, 2018:

@guoshengCS
reorder_lod_tensor_by_rank should be wrapped in the DynamicRNN implicitly. DynamicRNN is just syntactic sugar for end users.

```cpp
auto input = LoDTensor(...); // LoDTensor is the data structure for time series

std::vector<LoDTensor> inputs_for_each_timestep = LoDTensorToTimesteps(LoDTensor());

// (the loop header is not shown in this excerpt; a plausible reconstruction is
//  a per-timestep loop that combines each input with the recurrent memory)
std::vector<LoDTensor> outputs_for_each_timestep(inputs_for_each_timestep.size());
LoDTensor memory; // initial recurrent state, initialization elided
for (size_t i = 0; i < inputs_for_each_timestep.size(); ++i) {
  auto sum = Add(inputs_for_each_timestep[i], memory);
  memory = sum;
  outputs_for_each_timestep[i] = sum;
}

LoDTensor outputs = TimestepsToLoDTensor(outputs_for_each_timestep);
```

Contributor:

LoDTensor() is input? If so, please use input instead.

Contributor:

I think memories is also an output, however the type is vector, need convert it to LoDTensor ?


* Output: The output variables. They are `assigned` or `push_back` by the operators inside the block of While Op.
* It is reasonable for `while operator` to `push_back` its output to an array because 1) the while operator is a loop. 2) the output in every timestep should not be overwritten since they will be used in backward.
* The condition and other control flow related operator, like `++i` or `i=0`, could be overwritten since they do not need when backwards. The backward control flow operator of `++i` is `--i`.
Contributor:

they do not need when backwards --> they are not required in backward stage.
The backward control flow operator of ++i is --i. --> The corresponding control flow operator of ++i in backward stage is --i.

* The step-scopes. A vector of local scope, which size equals the step number of While Op. The i'th scope storages temporary variables generated in the i'th step.
Contributor:

which --> whose
equals --> equals to

* A potential optimization of `while operator` when inference is just maintaining one step of scope in while operator since there is no backward stage when inference.
Contributor:

of --> for
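
A rough sketch of how the outputs and step-scopes described above could be managed, assuming a hypothetical `Scope`/sub-block executor API:

```cpp
// Sketch only: every iteration of the While Op gets its own step scope, and
// outputs are appended rather than overwritten so the backward pass can use them.
std::vector<Scope*> step_scopes;   // one local scope per executed step
std::vector<Tensor> step_outputs;  // push_back, never overwrite

while (GetBool(cond)) {
  Scope& step_scope = parent_scope.NewScope();      // temporaries of this step live here
  step_scopes.push_back(&step_scope);               // kept for the backward stage
  step_outputs.push_back(RunSubBlock(step_scope));  // hypothetical sub-block executor
  cond = UpdateCondition(step_scope);               // control-flow variables may be overwritten
}
// During inference there is no backward stage, so one reusable step scope would be enough.
```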



There are several corner cases of gradient implementation:
Contributor:

The followings seem not corner cases. I think it's better to call tips.

The `++i` is the increment operator as a control flow operator. There are several differences between the computational `a = a + 1` and the control flow operator `++i`.

1. `IncrementOp` can only be run on CPU. And it should only be run on CPU.
2. The corresponding operator in the backward stage of `++i` is `--i`, because for the for loop, the data access should be reverse. The gradient of `++i` is not needed.
Contributor:

reverse --> reversed
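
As a small, self-contained illustration of the reversed data access (not PaddlePaddle code): the forward pass walks the timesteps with `++i`, and the backward pass revisits the same indices with `--i`.

```cpp
#include <cstdio>

int main() {
  const int n = 4;                    // number of timesteps (arbitrary for the example)
  for (int i = 0; i < n; ++i) {       // forward: counter incremented each step
    std::printf("forward  step %d\n", i);
  }
  for (int i = n - 1; i >= 0; --i) {  // backward: same steps, visited in reverse order
    std::printf("backward step %d\n", i);
  }
  return 0;
}
```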

paddle-bot-old bot closed this May 22, 2020 and commented:
Since you haven't replied for a long time, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.
