Free outputs callback #2598
Conversation
Approving, but wait for someone more familiar with callbacks to also approve.

Nit: isn't "free" a bit of an overloaded term in ML, or in general? What about "releaseOutputs"? That's a bit more C++ pointer terminology, but it's a bit more exact.
LGTM, added some nits.
Co-authored-by: Charles Tang <j316chuck@users.noreply.github.com>
What does this PR do?

Adds a callback to support freeing outputs for memory savings. When not using `train_metrics`, `self.state.outputs` is not needed. However, it may take up a non-trivial amount of memory (`seq_length * vocab_size * microbatch_size * bytes_per_param`). For certain long-sequence models, this memory starts to matter (~1-2 GB), so having an option to free it is useful.

Existing tests (e.g. `TestCallbackTrains`) should be sufficient for this PR.

What issue(s) does this change relate to?

GRT-2464
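The idea above can be sketched in plain Python. This is an illustrative mock, not the actual Composer implementation: the callback class, the `after_loss` hook name, and the `estimate_outputs_bytes` helper are all assumptions made for the example, and the memory formula is the one quoted in the PR description.

```python
from types import SimpleNamespace

def estimate_outputs_bytes(seq_length, vocab_size, microbatch_size, bytes_per_param):
    """Rough size of the logits tensor held in state.outputs (formula from the PR)."""
    return seq_length * vocab_size * microbatch_size * bytes_per_param

class FreeOutputsCallback:
    """Illustrative callback: drop the reference to state.outputs once the loss
    has been computed, so the logits tensor can be garbage-collected instead of
    lingering until the next batch overwrites it."""

    def after_loss(self, state):
        # In a real trainer this might assign an empty tensor rather than None;
        # the point is simply to release the large logits buffer.
        state.outputs = None

# Example: 4096-token sequences, 50k vocab, microbatch of 4, fp16 (2 bytes/param)
size = estimate_outputs_bytes(4096, 50_000, 4, 2)
print(f"{size / 1e9:.2f} GB")  # ~1.64 GB, in the ~1-2 GB range the PR mentions

# Simulate the trainer state and run the hook
state = SimpleNamespace(outputs=object())
FreeOutputsCallback().after_loss(state)
print(state.outputs)  # None
```

The saving only applies when nothing downstream (such as `train_metrics`) reads `state.outputs`, which is exactly the condition the PR description calls out.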