When model is not on meta device, initialization should occur on compute device not CPU #1623

Merged 3 commits into mosaicml:dev on Oct 14, 2022

abhi-mosaic (Contributor) commented Oct 14, 2022

This PR fixes an issue that occurs when a CPU-initialized model is passed to the Trainer and used with FSDP.
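
As a rough illustration of the rule stated in the PR title, the device choice could be sketched like this. This is not the code from this PR: the function name `pick_init_device` and its string-based inputs are hypothetical, and the real logic operates on the model's actual parameters.

```python
def pick_init_device(param_device_types, compute_device):
    """Choose where a model's parameters should be initialized.

    param_device_types: device type strings of the model's parameters,
        e.g. ["cpu", "cpu"] or ["meta", "meta"] (hypothetical input format).
    compute_device: the device used for training, e.g. "cuda:0".
    """
    if set(param_device_types) == {"meta"}:
        # Meta-device parameters hold no data yet, so this case is left as-is.
        return "meta"
    # Any model that is not on the meta device (for example one built on CPU)
    # is initialized on the compute device rather than on the CPU.
    return compute_device


print(pick_init_device(["cpu", "cpu"], "cuda:0"))
print(pick_init_device(["meta", "meta"], "cuda:0"))
```

The first call stands in for the CPU-initialized case this PR addresses; the second stands in for a model built on the meta device, which this sketch leaves untouched.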

It also changes some of the internal key values to uppercase to match the existing FSDP conventions.

Finally, min_params is now set to 1e9 by default so that it does not interfere with most models.
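
To show why a default of 1e9 stays out of the way, here is a minimal sketch. It assumes that min_params acts as a size threshold, so that only modules with at least that many parameters are selected for automatic wrapping; the PR text does not spell this out. The helper name and the parameter counts are made up for illustration.

```python
DEFAULT_MIN_PARAMS = int(1e9)  # default value named in this PR


def modules_over_threshold(param_counts, min_params=DEFAULT_MIN_PARAMS):
    """Return names of modules whose parameter count reaches min_params.

    param_counts: mapping of module name -> number of parameters
        (hypothetical input format for this sketch).
    """
    return [name for name, count in param_counts.items() if count >= min_params]


# Hypothetical parameter counts for a mid-sized model.
counts = {"embedding": 50_000_000, "block_0": 7_000_000, "lm_head": 50_000_000}

print(modules_over_threshold(counts))                         # default threshold
print(modules_over_threshold(counts, min_params=10_000_000))  # lower threshold
```

Under this assumption, no module in the example reaches the default threshold, so nothing is selected by size alone. Lowering the threshold selects the two larger modules.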

Closes CO-1256, CO-1259

abhi-mosaic requested a review from bcui19 on October 14, 2022 00:57
abhi-mosaic self-assigned this on Oct 14, 2022

bcui19 (Contributor) left a comment

LGTM, thanks for taking care of this quickly.

abhi-mosaic force-pushed the abhi/fsdp_bugfix_0_11 branch from e444327 to 3247944 on October 14, 2022 20:45
abhi-mosaic merged commit 4fd1f34 into mosaicml:dev on Oct 14, 2022
bandish-shah changed the title from "Abhi/fsdp bugfix 0 11" to "When model is not on meta device, initialization should occur on compute device not CPU" on Oct 25, 2022

2 participants