Fix activation cpu offloading #2724
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Can we add a unit test for this? Maybe just take a run in test_fsdp and add activation checkpointing.
@mvpatel2000 added a unit test and an inline import.
Can we move the test to test_fsdp?
This PR fixes activation CPU offloading. The original implementation (1) breaks when offload_to_cpu is enabled and activation_checkpointing is disabled, and (2) does not offload to CPU when activation_checkpointing and offload_to_cpu are both enabled. See pytorch/pytorch#85459.
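
To make the intended behavior concrete, here is a minimal sketch of the flag matrix the description implies: each flag should take effect on its own, and enabling both should give both. This is not the code in this PR. `plan_activation_handling` and the strings it returns are hypothetical names used only for illustration; only the flag names `activation_checkpointing` and `offload_to_cpu` come from the description above.

```python
from typing import List


def plan_activation_handling(
    activation_checkpointing: bool, offload_to_cpu: bool
) -> List[str]:
    """Hypothetical helper: list which activation features should be active.

    The two flags are treated as independent, which is the behavior the
    PR description asks for. The returned strings are illustrative labels,
    not real wrapper or API names.
    """
    features: List[str] = []
    if activation_checkpointing:
        # Recompute activations in the backward pass instead of storing them.
        features.append("checkpoint")
    if offload_to_cpu:
        # Move saved activations to host memory; must not require checkpointing.
        features.append("offload")
    return features


# Case (1) from the description: offload alone should work rather than break.
offload_only = plan_activation_handling(
    activation_checkpointing=False, offload_to_cpu=True
)

# Case (2) from the description: both enabled should still offload.
both = plan_activation_handling(
    activation_checkpointing=True, offload_to_cpu=True
)
```

A unit test along the lines requested in the review could loop over all four flag combinations and check that each one trains without error, assuming the real test harness exposes both flags.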