Feature/evaluator #5331
Conversation
doc/design/evaluator.md
Outdated
### Evaluator Design
Currently, every operation is expressed in the graph. we divide the evaluator process into three steps.
1. Initialize the metric state necessary and add it into the block.
metric state necessary -> metric state
Done
doc/design/evaluator.md
Outdated
1. Initialize the metric state necessary and add it into the block.
2. Calculate the statistic of the metric state in every mini-batch. The single operator is only responsible for calculating necessary statistics for one mini-batch. For example, accuracy operator only calculate a minibatch data if run once.\
remove ""
Done.
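Step 2 above says the accuracy operator only computes statistics for one mini-batch per run. A minimal NumPy sketch of that per-batch statistic (the function name is illustrative, not the actual Paddle operator):

```python
import numpy as np

def batch_accuracy_stat(predictions, labels):
    """Per-mini-batch statistic: (correct count, sample count).

    Returning raw counts rather than a ratio lets the counts from many
    mini-batches be merged later (step 3 of the design).
    """
    pred_ids = np.argmax(predictions, axis=1)  # predicted class per sample
    correct = int(np.sum(pred_ids == labels))
    total = len(labels)
    return correct, total
```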
doc/design/evaluator.md
Outdated
"""
pass

def _clear_state(self):
Maybe this should not be a part of Python.
doc/design/evaluator.md
Outdated
"""
pass

def _append_evalutor_op(self):
Perhaps we just need to specify what the interface looks like to the user.
I think a very important general rule is: don't make a decision unless we absolutely have to. Deferring the decision about which private methods to use for as long as possible gives us more information to make a good one.
2. Calculate the statistic of the metric state in every mini-batch. The single operator is only responsible for calculating necessary statistics for one mini-batch. For example, accuracy operator only calculate a minibatch data if run once.

3. Merge the mini-batch statistics to form the evaluation result for multiple mini-batches. When it comes to distributed training/Multi-GPU training, aggregate the value from different devices.
Looking at the code below, it seems we need a detailed description of how to implement evaluator operators in C++, how to save state in C++, and how to update that state on the Python side.
Sure, I will add the details.
Currently, we have two options.
Option 1: like TensorFlow, compose low-level operators to compute the metric. If performance turns out to be a real bottleneck, rewrite them as a new C++ operator.
Option 2: use a C++ operator to calculate each mini-batch metric and maintain some state on the Python side.
I'm not sure which is better yet. I implemented option 2; what do you think?
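Option 2 can be sketched roughly as follows: a per-batch computation (standing in for the C++ operator) plus Python-side state that accumulates across batches. All names here are illustrative, not the actual Paddle API:

```python
class AccuracyEvaluator:
    """Option 2 sketch: the per-batch kernel would live in C++;
    the accumulated metric state lives on the Python side."""

    def __init__(self):
        self.correct = 0   # Python-side metric state
        self.total = 0

    def update(self, batch_correct, batch_total):
        # In the real design these counts would come from running the
        # C++ accuracy operator on one mini-batch.
        self.correct += batch_correct
        self.total += batch_total

    def eval(self):
        return self.correct / self.total if self.total else 0.0
```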
Done.
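For step 3, merging per-device (or per-mini-batch) statistics reduces to summing the raw counts before dividing. A hedged sketch, where the (correct, total) tuples are illustrative stand-ins for actual device results:

```python
def merge_accuracy(device_stats):
    """Aggregate (correct, total) pairs from several devices or
    mini-batches into a single accuracy value."""
    correct = sum(c for c, _ in device_stats)
    total = sum(t for _, t in device_stats)
    return correct / total if total else 0.0
```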
else:
    reset_program = program
for k, var in self._states.iteritems():
    zeros = helper.create_tmp_variable(dtype=var.data_type)
`helper` can only create vars and ops in `main_program` and `startup_program`. However, here we need to create them in the `reset_program`, so it's not correct to use `helper` here.
Done.
# """
# raise NotImplementedError()

def reset(self, executor, program=None):
Do not expose `reset_program` to the upper level. It should be created, run, and destroyed inside the `reset` function.
Done.
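The point of the comment above is that `reset()` should own the temporary program's whole lifetime. A framework-agnostic sketch of that encapsulation, where `reset_program` is a plain dict standing in for the real fluid program object:

```python
class Evaluator:
    """Sketch of the reviewed interface: reset() builds and runs its
    own temporary reset_program instead of leaking one to the caller."""

    def __init__(self):
        self._states = {"correct": 0, "total": 0}  # metric state

    def reset(self, executor=None, reset_program=None):
        # If no program is supplied, create a throwaway one here, run
        # it, and let it be garbage-collected afterwards; the caller
        # never has to manage it.
        if reset_program is None:
            reset_program = {name: 0 for name in self._states}
        self._states.update(reset_program)
        return dict(self._states)
```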
"""
raise NotImplementedError()

def reset(self, executor, program=None):
reset_program=None
Done
return acc_out

def eval(self, executor, program=None):
eval_program=None
Done.
@@ -31,6 +32,8 @@
main_program=main_program,
startup_program=startup_program)

accuracy = evaluator.Accuracy(input=y_predict, label=y)
evaluator.accuracy()
removed
Approved just because @Canpio approved.
Fixes #3808.
The previous evaluator didn't separate compile time and runtime, and we also need to create some metric states for each evaluator.
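The compile-time/runtime separation the PR description mentions can be sketched as: state variables are declared once when the graph is built, and a fresh mutable copy is materialized for each run. All names below are hypothetical illustrations, not the Paddle API:

```python
class CompiledMetric:
    """Compile time: declare the metric state once (analogous to adding
    state variables to a startup block). Runtime: each run gets its own
    fresh, mutable copy of that declared state."""

    def __init__(self):
        # compile-time: the state is only declared, not yet computed
        self.state_names = ["correct", "total"]

    def make_runtime_state(self):
        # runtime: materialize a fresh state, independent per run
        return {name: 0 for name in self.state_names}
```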