
Allow LPLayerNorm and LPGroupNorm to support self.bias or self.weight = None #2044

Merged: 14 commits into mosaicml:dev, Mar 13, 2023

Conversation

@abhi-mosaic (Contributor) commented Mar 8, 2023

We are experimenting with removing all biases from our MosaicGPT models. Doing so works with torch.nn.LayerNorm, but fails with LPLayerNorm. This PR:
(1) Modifies the weight copying from torch.nn.LayerNorm to LPLayerNorm that occurs during module surgery so that it checks for None types.
(2) Adds a check in the forward() method for self.bias is None or self.weight is None, so that None parameters are not downcast.
(3) Updates a test to run on a model where some LayerNorms have both weights and biases, some have self.bias = None, and some have both self.bias = None and self.weight = None.

This PR also does (1), (2), and (3) for LPGroupNorm.

@abhi-mosaic (Contributor, Author) commented Mar 8, 2023

@bandish-shah there is a moderate likelihood that our MosaicGPT in examples will need this feature, so until this patch is released in Composer, we likely cannot tag a stable release of Examples against a stable release of Composer (or we would need to maintain a second branch or something like that).

Basically, this might be worth a 0.13.2 if you have other stuff you want to get in there.

@mvpatel2000 (Contributor) commented

@abhi-mosaic Can you please add a unit test ensuring this works? Similarly, can you please add the same fix and unit test to LowPrecisionGroupNorm? It's probably something simple, like extending the existing unit-test models with a flag for bias or not (default on); see the test sketch after this comment.

> Basically, this might be worth a 0.13.2 if you have other stuff you want to get in there.

We do have a lot of checkpointing fixes coming in this week...

@bandish-shah (Contributor) commented

> Basically, this might be worth a 0.13.2 if you have other stuff you want to get in there.

@abhi-mosaic let's hold off; there are other fixes in progress. We can target the 0.13.2 hot patch for late next week.

@nik-mosaic nik-mosaic changed the title from "Allow LPLayerNorm to support self.bias=None" to "Allow LPLayerNorm and LPGroupNorm to support self.bias or self.weight = None" Mar 13, 2023
@mvpatel2000 (Contributor) left a review comment


LGTM! Thanks for finishing this.

@abhi-mosaic (Contributor, Author) commented

Thanks @nik-mosaic !!!

@nik-mosaic nik-mosaic enabled auto-merge (squash) March 13, 2023 23:29
@nik-mosaic nik-mosaic disabled auto-merge March 13, 2023 23:33
@nik-mosaic nik-mosaic enabled auto-merge (squash) March 13, 2023 23:34
@nik-mosaic nik-mosaic merged commit b6ba72e into mosaicml:dev Mar 13, 2023
bandish-shah pushed a commit that referenced this pull request Mar 14, 2023
….weight` = None (#2044)

Extends support to affine=False and bias-free models.

---------
Co-authored-by: nik-mosaic <101217697+nik-mosaic@users.noreply.github.com>
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>