Restore consistency of hook_normalized between LayerNorm and RMSNorm #770

Open
wants to merge 4 commits into
base: dev-3.x

Conversation

degenfabian
Contributor

@degenfabian degenfabian commented Nov 1, 2024

Description

This PR fixes issue #747. In layer_norm.py, hook_normalized is applied after the gain and bias weights, whereas in rms_norm.py it is applied before them. This PR removes the inconsistency by moving hook_normalized before the gain and bias weights in layer_norm.py. According to @neelnanda-io this is a breaking change, which, per @bryce13950, could be worked into release 3.0. Since the issue already contained plenty of guidance on how to make the change, and the change itself is small, I went ahead and did it. I would greatly appreciate guidance on whether I should add tests for this; I am new to mech interp and this library and would not know how to go about it.
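To make the ordering change concrete, here is a minimal pure-Python sketch of the two hook placements. It is not the library's code: the function `layer_norm`, the `hook_before_affine` flag, and the callback-style `hook` are hypothetical stand-ins, and the real modules operate on PyTorch tensors. It could also serve as a starting point for a test, under those assumptions.

```python
import math

def layer_norm(x, w, b, hook, hook_before_affine=True, eps=1e-5):
    """Toy LayerNorm over a list of floats (hypothetical, for illustration only).

    `hook` is called with whatever value "hook_normalized" would observe.
    """
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    scale = math.sqrt(sum(v * v for v in centered) / len(x) + eps)
    normalized = [v / scale for v in centered]

    if hook_before_affine:
        # Placement proposed in this PR: the hook sees the normalized
        # activations before gain (w) and bias (b) are applied.
        hook(normalized)
        return [n * wi + bi for n, wi, bi in zip(normalized, w, b)]

    # Previous placement: the hook sees the output after gain and bias.
    out = [n * wi + bi for n, wi, bi in zip(normalized, w, b)]
    hook(out)
    return out

seen = {}
x = [1.0, 2.0, 3.0, 4.0]
w = [2.0] * 4  # gain
b = [0.5] * 4  # bias

out_new = layer_norm(x, w, b, lambda v: seen.__setitem__("new", list(v)), True)
out_old = layer_norm(x, w, b, lambda v: seen.__setitem__("old", list(v)), False)
```

In this sketch the module output is the same under both placements; only the value the hook observes differs. That is why the change is breaking for code that caches or patches hook_normalized while leaving the forward pass unchanged.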

Thanks for maintaining this library!

Fixes #747

Type of change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@degenfabian degenfabian marked this pull request as ready for review November 2, 2024 11:50
@bryce13950
Collaborator

Thanks for putting this together! This is going to sit open for a while until we start testing some changes on 3.0. When we are ready to do that, I will get it merged right away!

@bryce13950 bryce13950 changed the base branch from main to dev-3.x December 9, 2024 20:50
Labels
None yet
Projects
None yet
2 participants