Log training metrics to visualize them via tensorboard #5422

tabergma · 2020-03-13T14:04:39Z

Proposed changes:
Related to https://github.com/RasaHQ/research/issues/56

Add option tensorboard_log_directory to DIETClassifier, ResponseSelector and
TEDPolicy.

By default tensorboard_log_directory is None. If a valid directory is provided,
metrics are written to it during training. After the model is trained, you can inspect
the training metrics in TensorBoard by running tensorboard --logdir <path-to-given-directory>.

We also write a model summary (layers, input size, type) to the provided directory.
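
As a rough illustration of how the option behaves, the sketch below models a component configuration as a plain Python dict. Only `tensorboard_log_directory` and the `tensorboard --logdir` command come from this PR; the dict layout, the `epochs` entry, the `.tensorboard` path and the helper `tensorboard_command` are hypothetical and not part of Rasa.

```python
from typing import Any, Dict, Optional

# Hypothetical component configuration; only `tensorboard_log_directory`
# is an option introduced by this PR.
diet_config: Dict[str, Any] = {
    "name": "DIETClassifier",
    "epochs": 100,
    "tensorboard_log_directory": ".tensorboard",  # assumed example path
}


def tensorboard_command(config: Dict[str, Any]) -> Optional[str]:
    """Return the command to inspect the logs, or None if logging is off."""
    log_dir = config.get("tensorboard_log_directory")
    if log_dir is None:
        # Default behaviour: no directory configured, nothing is written.
        return None
    return f"tensorboard --logdir {log_dir}"


print(tensorboard_command(diet_config))
print(tensorboard_command({"name": "TEDPolicy"}))
```

Leaving the option unset keeps the default of None, so existing configurations are unaffected.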

Status (please check what you already did):

added some tests for the functionality
updated the documentation
updated the changelog (please check changelog for instructions)
reformat files using black (please check Readme for instructions)

dakshvar22

Left some comments. One meta comment: Right now we are logging at the epoch level, and I think that should remain the default behaviour. Can we add an optional parameter that switches to logging at the minibatch level? In my experience, minibatch-level logging can sometimes help, so if it is an easy addition we should implement it.

changelog/5422.feature.rst

rasa/utils/tensorflow/models.py

tabergma · 2020-03-16T09:24:15Z

Just to be sure: How would you log on minibatch level? Would you have a counter that increases over time so that you see all steps in one plot? For example, if the first epoch has 64 steps, the second epoch would start with step 65 in the plot.
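
One way to picture the running counter asked about here is the sketch below. It is not the code from this PR: the function `collect_points`, the `"epoch"` and `"minibatch"` mode strings, and the made-up loss values are assumptions for illustration. In minibatch mode a single counter keeps increasing across epochs, so with 64 minibatches per epoch the second epoch starts at step 65.

```python
from typing import List, Tuple


def collect_points(
    losses_per_epoch: List[List[float]], log_level: str = "epoch"
) -> List[Tuple[int, float]]:
    """Return (step, loss) pairs as they would be plotted on one curve."""
    points: List[Tuple[int, float]] = []
    step = 0  # running counter, never reset between epochs
    for epoch, batch_losses in enumerate(losses_per_epoch, start=1):
        if log_level == "minibatch":
            # Minibatch level: one point per minibatch on a shared axis.
            for loss in batch_losses:
                step += 1
                points.append((step, loss))
        else:
            # Epoch level: one point per epoch, averaged over its minibatches.
            points.append((epoch, sum(batch_losses) / len(batch_losses)))
    return points


# Two epochs of 64 minibatches each, with made-up loss values.
losses = [[1.0] * 64, [0.5] * 64]
minibatch_points = collect_points(losses, log_level="minibatch")
print(minibatch_points[64])    # first minibatch of the second epoch: (65, 0.5)
print(collect_points(losses))  # epoch-level points: [(1, 1.0), (2, 0.5)]
```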

dakshvar22 · 2020-03-17T12:13:07Z

@tabergma Just tried on the scaffold project dataset: the steps for the test logger do not correspond to the steps for the train logger. For example, if evaluate_after_every_epochs is set to 20, then the steps for the test logger should be 20, 40, 60 rather than 1, 2, 3. Sharing the step values would directly indicate which epoch a validation loss corresponds to; otherwise the user has to calculate that on their own. Attaching a screenshot -

If this is easy enough to change, I would suggest we change it now.
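
The difference between the two step numberings can be shown with a small sketch. The helper names and the total of 60 epochs are assumptions; `evaluate_after_every_epochs` is used as named in this comment.

```python
from typing import List


def local_test_steps(epochs: int, evaluate_after_every_epochs: int) -> List[int]:
    """Steps when the test logger keeps its own counter (1, 2, 3, ...)."""
    evaluations = epochs // evaluate_after_every_epochs
    return list(range(1, evaluations + 1))


def shared_test_steps(epochs: int, evaluate_after_every_epochs: int) -> List[int]:
    """Steps when the test logger reuses the training epoch as its step."""
    return list(
        range(evaluate_after_every_epochs, epochs + 1, evaluate_after_every_epochs)
    )


print(local_test_steps(60, 20))   # [1, 2, 3]
print(shared_test_steps(60, 20))  # [20, 40, 60]
```

With shared steps, the train and test curves line up on the same x-axis, so no manual conversion is needed.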

tabergma · 2020-03-17T12:15:42Z

This only happens if you set tensorboard_log_level to minibatch, correct? So in that case, training metrics should be plotted after every step, but evaluation metrics just after every epoch?

dakshvar22 · 2020-03-17T12:32:57Z

Yes, setting it to epochs works fine. Regarding your question about minibatch: IMO they should be plotted after every evaluate_after_every_epochs steps. So if evaluate_after_every_epochs is set to 2, the number of minibatches in the training set is 20, and the number of minibatches in the validation set is 2, then the steps for the test logger would be 40, 41, 80, 81, 120, 121.
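
The numbers in this example can be reproduced with the sketch below. It assumes the test step starts at the number of training minibatches processed so far and advances by one per validation minibatch. The function and parameter names are hypothetical, the total of 6 epochs is chosen only to match the example, and the PR's actual implementation may differ.

```python
from typing import List


def evaluation_steps(
    epochs: int,
    evaluate_after_every_epochs: int,
    train_batches: int,
    validation_batches: int,
) -> List[int]:
    """Steps at which validation minibatches are plotted in minibatch mode."""
    steps: List[int] = []
    for epoch in range(
        evaluate_after_every_epochs, epochs + 1, evaluate_after_every_epochs
    ):
        # Training minibatches seen by the end of this epoch.
        base = epoch * train_batches
        steps.extend(base + i for i in range(validation_batches))
    return steps


print(evaluation_steps(6, 2, 20, 2))  # [40, 41, 80, 81, 120, 121]
```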

tabergma · 2020-03-17T12:33:21Z

Update:
Do you mean something like this?

dakshvar22 · 2020-03-17T12:35:45Z

Yes, there should be some inherent corresponding steps.

rasa/utils/tensorflow/models.py

dakshvar22

Looks good, added two minor comments. I think we should also add a simple test that at least checks whether the folders are created.

rasa/utils/tensorflow/models.py

dakshvar22

Looks great! 💯 Thanks for adding support for this! 🚀

tabergma added 7 commits March 13, 2020 14:39

Write tensorboard log files.

f13e3bc

add changelog

5c2b620

don't log tf.function

10e40f7

update docs

6ea1a34

clean up

e9707b5

make method static

b6f3ccd

update changelog file name.

3b34ade

tabergma requested a review from dakshvar22 March 13, 2020 14:06

tabergma added 3 commits March 13, 2020 15:22

add tensorboard log dir option

49b5d0f

add method to write model summary

6703790

do not change default value

45f7622

dakshvar22 requested changes Mar 14, 2020

changelog/5422.feature.rst

changelog/5422.feature.rst (outdated)

rasa/utils/tensorflow/models.py (outdated)

rasa/utils/tensorflow/models.py (outdated)

review comments

14010e1

tabergma added 2 commits March 16, 2020 13:28

Add option to log on minibatches.

b88ee67

clean up

fad988a

tabergma requested a review from dakshvar22 March 16, 2020 12:55

tabergma self-assigned this Mar 16, 2020

fix docs

454c6f2

Merge branch 'master' into tensorboard

f283985

tabergma added 3 commits March 17, 2020 13:37

use correct step value for test curve

8e893f6

update default values

67f1bb0

Merge branch 'master' into tensorboard

c1a30b5

dakshvar22 reviewed Mar 17, 2020

rasa/utils/tensorflow/models.py (outdated)

logging on minibatches for evluation set

76e64ba

dakshvar22 requested changes Mar 17, 2020

rasa/utils/tensorflow/models.py (outdated)

rasa/utils/tensorflow/models.py

tabergma added 4 commits March 17, 2020 17:50

review comments

1580d3b

add test

5d30121

fix type, style issues

d750f5d

Merge branch 'master' into tensorboard

62b9a7f

dakshvar22 approved these changes Mar 18, 2020

tabergma merged commit da25fef into master Mar 18, 2020

tabergma deleted the tensorboard branch March 18, 2020 08:53
