Log multiple losses #1375

Landanjs · 2022-08-06T01:01:32Z

Resolves CO-842. Adds logic to log multiple losses. Also adds a loss dictionary to DeepLabv3+ as an example. The scenarios:

1. Scalar tensor loss

This will be logged as loss/train/total. Bert example:

2. Tuple of tensor losses

Individual losses will be logged as 'loss/train/loss{i}' where i is the index of the individual loss. There will also be 'loss/train/total' which is the sum of the individual losses.

3. Dictionary of losses without total key

Individual losses will be logged as 'loss/train/{loss_name}'. There will also be 'loss/train/total' which is the sum of the individual losses.

4. Dictionary of losses with total key

Individual losses will be logged as 'loss/train/{loss_name}'. The loss given under the 'total' key will be logged as 'loss/train/total' and used as the loss for backpropagation; individual losses are not summed when 'total' is present.
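The four scenarios above can be sketched as one small helper. This is only an illustration of the key naming described in this PR, not the code in composer/trainer/trainer.py: the function name `format_losses` and its `prefix` parameter are hypothetical, and plain floats stand in for torch.Tensor values so the sketch runs without PyTorch.

```python
from collections.abc import Mapping


def format_losses(loss, prefix="loss/train"):
    """Hypothetical helper: return (total, logs) for the four loss scenarios.

    `loss` may be a single value, a tuple/list of values, or a dict of
    named values. Floats are used here as a stand-in for torch.Tensor.
    """
    if isinstance(loss, Mapping):
        # Scenarios 3 and 4: keep the user-provided loss names.
        named = dict(loss)
    elif isinstance(loss, (tuple, list)):
        # Scenario 2: name each loss by its index, e.g. loss0, loss1.
        named = {f"loss{i}": value for i, value in enumerate(loss)}
    else:
        # Scenario 1: a single loss is the total.
        named = {"total": loss}

    # Sum the individual losses only when no 'total' was supplied.
    if "total" not in named:
        named["total"] = sum(named.values())

    logs = {f"{prefix}/{name}": value for name, value in named.items()}
    return named["total"], logs


if __name__ == "__main__":
    total, logs = format_losses({"dice": 0.25, "cross_entropy": 0.5})
    print(total)
    print(logs)
```

The returned `total` is the value that would be used for backpropagation; when the dict already contains 'total', the other entries are only logged.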

Questions

This assumes the returned loss is a torch.Tensor. Is this fine?
Should these different scenarios be documented somewhere?

abhi-mosaic · 2022-08-08T21:59:07Z

Is there any way we can get rid of the trailing / when there is only 1 loss value, like loss/train instead of loss/train/ ? I'm just wondering if there could be some parsing issues later on.

[EDIT] or wait maybe it has to be that way for WandB to put plots together in the same place? Hmmm
I wonder if we can add a default like loss/train/total that always gets logged... would match the notation a bit better too.

Also, I wonder if we can drop the _loss suffix for things like loss/train/dice rather than loss/train/dice_loss, as it seems a bit repetitive.

Landanjs · 2022-08-08T22:18:59Z

Thanks for the suggestion Abhi, good points! Yeah, that would be cleaner, let me try to set it up 🙂

Hmmm that might actually make the closure stuff easier as well!

composer/trainer/trainer.py

eracah

''

mvpatel2000 · 2022-08-16T19:10:48Z

"I don't understand pyright" same :')

composer/trainer/trainer.py

abhi-mosaic

This looks great to me!!

@Landanjs I'm assuming you have some convergence tests on DeepLabV3 that work? And hopefully we have some trainer tests that do loss as a Tensor, tuple, dict?

Approving now for velocity and trusting you on the tests :)

Landanjs · 2022-08-22T18:45:32Z

I added some convergence tests for tuple losses and dict losses with 'total'. TBH I do not know how to make logging, but I'm running DeepLabv3 and ResNet experiments (wandb here, group by group). The results appear to align with dev.

Landanjs · 2022-08-23T00:29:25Z

I mentioned this to Evan, but just for the record, it looks like I need someone from @mosaicml/composer-team-eng to review as well.

Landanjs added 6 commits August 6, 2022 01:00

First pass on multi loss logging

432e5d8

Refactor format data variable

24e04ae

Brute force merge with dev

4cf1d49

Try to fix dev pt. 1...

bad9c04

Try to fix dev pt. 2...

2abbe85

Use previous log key if only one loss is used

b756aca

Landanjs marked this pull request as ready for review August 8, 2022 21:10

Landanjs requested review from eracah and a team as code owners August 8, 2022 21:10

Merge branch 'dev' into landan/log_multiple_losses

f8d0df8

Landanjs added 2 commits August 8, 2022 22:00

fix/break closures?

3d12bab

fix/break closures?

f6b71fc

Landanjs added 2 commits August 9, 2022 17:32

Log total loss by default

6476626

remove is instance

0a0fcbe

Landanjs requested a review from a team as a code owner August 9, 2022 17:39

Landanjs commented Aug 9, 2022

composer/trainer/trainer.py (outdated)

Landanjs commented Aug 9, 2022

composer/trainer/trainer.py (outdated)

Landanjs requested a review from abhi-mosaic August 9, 2022 22:47

Merge with dev

45e059a

eracah reviewed Aug 15, 2022

eracah requested review from eracah and removed request for eracah August 15, 2022 23:43

Landanjs added 2 commits August 16, 2022 19:08

Refactor everything

286bab0

I don't understand pyright

2f8adc9

Landanjs commented Aug 16, 2022

composer/trainer/trainer.py

Fix comment

85da2ad

Landanjs commented Aug 16, 2022

composer/trainer/trainer.py

abhi-mosaic approved these changes Aug 19, 2022

Landanjs added 2 commits August 22, 2022 10:31

Merge with dev

9fa1d67

Tuple and dict loss with total tests

607ea72

Landanjs removed the request for review from eracah August 22, 2022 19:52

Landanjs requested a review from eracah August 23, 2022 19:55

eracah approved these changes Aug 24, 2022

composer/models/deeplabv3/model.py (outdated)

composer/trainer/trainer.py

Landanjs added 2 commits August 24, 2022 10:00

Update types

e2b0985

Merge with dev

2016c39

Landanjs merged commit 8f0d760 into mosaicml:dev Aug 24, 2022
