Remove eval batch split #1576

Merged
merged 3 commits into from
Sep 30, 2022

mvpatel2000
Contributor

We originally added eval_batch_split as a feature, similar to grad_accum, to apply when evaluating metrics. It is currently used in eval mode and in some algorithms.

However, with the metrics refactor, we now run train metrics at the microbatch level, using the cached outputs from the forward call on that microbatch. As a result, eval_batch_split is no longer needed for train metrics. Further, if it is triggered, it actually causes an error, because Composer still returns the cached outputs. Instead, we should let any OOMs here propagate all the way up to the auto grad accum level.
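
To illustrate the control flow described above, here is a minimal, framework-free sketch. It is not Composer's code: `SimulatedOOM`, `split_batch`, `train_batch`, and `train_batch_with_auto_grad_accum` are hypothetical names, the "model" is a toy parity function, and doubling the number of microbatches on OOM is an assumed retry policy. The point is only that metrics reuse the outputs cached from each microbatch's forward pass, and that an OOM is not handled by an inner split but propagates to a single outer retry loop.

```python
class SimulatedOOM(RuntimeError):
    """Hypothetical stand-in for a CUDA out-of-memory error."""


def split_batch(batch, num_microbatches):
    """Split a non-empty list into contiguous, nearly equal chunks."""
    size = -(-len(batch) // num_microbatches)  # ceiling division
    return [batch[i:i + size] for i in range(0, len(batch), size)]


def train_batch(batch, grad_accum, max_microbatch_size):
    """Run the forward pass and train metrics once per microbatch.

    There is no inner batch split for metrics: if a microbatch is too
    large, SimulatedOOM is raised and left for the caller to handle.
    """
    correct = 0
    for microbatch in split_batch(batch, grad_accum):
        if len(microbatch) > max_microbatch_size:
            raise SimulatedOOM(f"microbatch of {len(microbatch)} does not fit")
        # Toy "forward" pass; its outputs are cached for this microbatch.
        outputs = [x % 2 for x, _ in microbatch]
        # Train metrics reuse the cached outputs instead of re-running forward.
        correct += sum(int(o == y) for o, (_, y) in zip(outputs, microbatch))
    return correct / len(batch)


def train_batch_with_auto_grad_accum(batch, max_microbatch_size):
    """Outer retry loop: on OOM, double grad_accum and rerun the batch."""
    grad_accum = 1
    while True:
        try:
            return train_batch(batch, grad_accum, max_microbatch_size), grad_accum
        except SimulatedOOM:
            if grad_accum >= len(batch):
                raise  # cannot split further; surface the error
            grad_accum = min(grad_accum * 2, len(batch))


# Usage: 8 samples, at most 3 fit in memory at once.
batch = [(i, i % 2) for i in range(8)]
accuracy, grad_accum = train_batch_with_auto_grad_accum(batch, max_microbatch_size=3)
```

In this sketch the first two attempts (microbatches of 8 and 4 samples) raise, and the third attempt with `grad_accum=4` succeeds. Keeping the retry in one place avoids an inner metrics split that would disagree with the already cached outputs.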

Review thread on composer/trainer/trainer.py (outdated, resolved)
@mvpatel2000 mvpatel2000 enabled auto-merge (squash) September 30, 2022 22:18
@mvpatel2000 mvpatel2000 merged commit 1b7ffce into mosaicml:dev Sep 30, 2022
@mvpatel2000 mvpatel2000 deleted the mvpatel2000/remove-eval branch September 30, 2022 22:32
2 participants