Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show the moving average loss #189

Merged
merged 2 commits into from
Feb 16, 2023
Merged

Conversation

shirayu
Copy link
Contributor

@shirayu shirayu commented Feb 14, 2023

This PR replaces the loss with the moving average of loss.
Current calculation of loss for logging resets at the begging of each epoch.
This causes jumping of loss.

Images

Before

Before

After

After

@kohya-ss
Copy link
Owner

Thank you for this! Moving average seems to be good.
However, I think it will be better that loss/epoch is the average loss of the epoch, not the average loss of entire training.

In addition, it is a bit annoying to calculate the sum of losses every step, if the epoch has thousands of steps. TensorBoard has a 'smoothing' feature, so it will be acceptable raw loss values instead of average...

@shirayu
Copy link
Contributor Author

shirayu commented Feb 14, 2023

loss/epoch is the same value to the moving average because window size is the number of steps of an epoch.

Surely, we can also watch the raw loss value and smoothed loss value in TensorBoard.

My first intention of this PR was the improvement of CLI loss indication.
I prefer that the loss value displayed on the screen with the progress bar be continuous.

As you point out, it is wasteful to sum every step.
I will try improving it a little.

@shirayu
Copy link
Contributor Author

shirayu commented Feb 14, 2023

I removed the call of sum() at 8aed512.

@kohya-ss
Copy link
Owner

loss/epoch is the same value to the moving average because window size is the number of steps of an epoch.

Oh, sorry, I misunderstood.

I prefer that the loss value displayed on the screen with the progress bar be continuous.

I agree this.

The update looks good! I will merge as soon as I have time.

@kohya-ss kohya-ss changed the base branch from main to dev February 16, 2023 13:00
@kohya-ss kohya-ss merged commit 914d150 into kohya-ss:dev Feb 16, 2023
@shirayu shirayu deleted the improve_loss_track branch February 16, 2023 17:31
wkpark pushed a commit to wkpark/sd-scripts that referenced this pull request Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants