-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logic for shifting labels before computing metrics #1913
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test please? Otherwise LGTM. Will approve after test
…mposer into alex/causal-roll-labels-1691
Typo fix Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left one comment to cleanup the fixture usage a bit (sorry, i pointed you to the wrong place for the usage you wanted)
…mposer into alex/causal-roll-labels-1691
What does this PR do?
Adds a
shift_labels
init argument toHuggingFaceModel
class. This instructs the model whether to shift labels by one token before computing metrics, which mimics the way HF Causal LM classes handle labels when computing loss. This fixes the current implementation, which never does this shifting and produces incorrect metric results for Causal LMs.If
shift_labels
is not specified,HuggingFaceModel
will try to infer the correct behavior based on whether the model is an instance of a registered HF Causal LM class (or a subclass of one).What issue(s) does this change relate to?
https://mosaicml.atlassian.net/browse/CO-1691
Before submitting
pre-commit
on your change? (see thepre-commit
section of prerequisites)