Allow `LPLayerNorm` and `LPGroupNorm` to support `self.bias` or `self.weight` = None #2044
Conversation
@bandish-shah there is a moderate likelihood that our MosaicGPT […]. Basically this might be worth a 0.13.2, if you have other stuff you want to get in there.
@abhi-mosaic Can you please add a unit test ensuring this works? Similarly, can you please add the same fix and unit test to `LPGroupNorm`?
We do have a lot of checkpointing fixes coming in this week…
@abhi-mosaic let's hold off, there are other fixes in progress. We can target the 0.13.2 hot patch for late next week.
LGTM! Thanks for finishing this
Thanks @nik-mosaic!!!
Allow `LPLayerNorm` and `LPGroupNorm` to support `self.bias` or `self.weight` = None (#2044). Extends support to affine=False and bias-free models. Co-authored-by: nik-mosaic <101217697+nik-mosaic@users.noreply.github.com> Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
We are experimenting with removing all biases from our MosaicGPT models. When we do so with `torch.nn.LayerNorm` it works, but with `LPLayerNorm` it fails. This PR:

1. Modifies the weight copying from `torch.nn.LayerNorm` to `LPLayerNorm` that occurs during module surgery to check for `None` types.
2. Adds a check in the `forward()` method for `self.bias is None` or `self.weight is None`, and doesn't downcast the `None` parameters (see the sketch after this list).
3. Updates a test to run on a model where some LayerNorms have both weights and biases, some have `self.bias = None`, and some have `self.bias = None` and `self.weight = None`.
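A minimal sketch of the idea behind fixes (1) and (2), assuming a simplified `LPLayerNorm`. Helper names like `_cast_if_autocast_enabled` and `_to_LPLayerNorm` are illustrative and may not match composer's actual implementation:

```python
import torch
import torch.nn.functional as F


def _cast_if_autocast_enabled(tensor: torch.Tensor) -> torch.Tensor:
    # Downcast to the autocast dtype only when autocast is active on CUDA.
    if torch.is_autocast_enabled() and tensor.device.type == 'cuda':
        return tensor.to(dtype=torch.get_autocast_gpu_dtype())
    return tensor


class LPLayerNorm(torch.nn.LayerNorm):

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        downcast_x = _cast_if_autocast_enabled(x)
        # Fix (2): self.weight / self.bias may be None (elementwise_affine=False
        # or bias-free models), and None cannot be downcast, so skip it.
        downcast_weight = (_cast_if_autocast_enabled(self.weight)
                           if self.weight is not None else self.weight)
        downcast_bias = (_cast_if_autocast_enabled(self.bias)
                         if self.bias is not None else self.bias)
        with torch.autocast(device_type=x.device.type, enabled=False):
            return F.layer_norm(downcast_x, self.normalized_shape,
                                downcast_weight, downcast_bias, self.eps)


def _to_LPLayerNorm(layer: torch.nn.LayerNorm) -> LPLayerNorm:
    # Fix (1): during module surgery, only copy parameters that exist.
    lp = LPLayerNorm(layer.normalized_shape, eps=layer.eps,
                     elementwise_affine=layer.elementwise_affine)
    with torch.no_grad():
        if layer.weight is None:
            lp.register_parameter('weight', None)
        else:
            lp.weight.copy_(layer.weight)
        if layer.bias is None:
            lp.register_parameter('bias', None)
        else:
            lp.bias.copy_(layer.bias)
    return lp
```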
This PR also does (1), (2), and (3) for `LPGroupNorm`, whose `weight` and `bias` need the same `None` checks.
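For fix (3), a hypothetical test along these lines (reusing the illustrative `_to_LPLayerNorm` helper above; composer's real unit test differs in detail) covers all three parameter configurations:

```python
def test_lp_layernorm_handles_none_params():
    full = torch.nn.LayerNorm(8)                  # has weight and bias
    no_bias = torch.nn.LayerNorm(8)
    no_bias.register_parameter('bias', None)      # self.bias = None
    no_affine = torch.nn.LayerNorm(8, elementwise_affine=False)  # both None

    x = torch.randn(4, 8)
    for ln in (full, no_bias, no_affine):
        lp = _to_LPLayerNorm(ln)
        # Outside autocast, LPLayerNorm should match LayerNorm exactly.
        torch.testing.assert_close(lp(x), ln(x))
```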