Handle feature reduction properly in LLLA #169

Closed
wiseodd opened this issue Apr 25, 2024 · 0 comments · Fixed by #172

wiseodd commented Apr 25, 2024

Say you have an LLM with a regression head on top. Then this code

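# f: model outputs; phi: features fed to the last layer
f, phi = self.model.forward_with_features(x)
print(f.shape, phi.shape)
# Print the shapes of the last layer's parameters for comparison
for p in self.model.last_layer.parameters():
    print(p.shape)
input()  # pause so the printed shapes can be inspected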

outputs

torch.Size([4, 2]) torch.Size([4, 9, 768])
torch.Size([2, 768])

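There is a mismatch between the dimensions of the last layer's input and the last layer itself: phi has shape (4, 9, 768) and still carries a sequence dimension of length 9, whereas the last layer's weight has shape (2, 768) and the output f has shape (4, 2). The features are evidently reduced over the sequence dimension before reaching the last layer, and that reduction is not reflected in phi.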

The best solution seems to be to let the user specify which reduction they use. Common choices: first (the [CLS] token in BERT), last (in causal LMs), average (https://arxiv.org/abs/2402.05015). We can use an enum for this.
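A minimal sketch of what such an enum could look like, assuming the features arrive as a (batch, seq_len, hidden) tensor. The names FeatureReduction and reduce_features are hypothetical and chosen only for illustration; they are not taken from the library. The "last" option here simply picks the final sequence position, which assumes there is no right-padding.

```python
from enum import Enum

import torch


class FeatureReduction(str, Enum):
    """Hypothetical enum naming how per-token features are pooled."""

    FIRST = "first"      # e.g. the [CLS] token in BERT
    LAST = "last"        # e.g. the final token in causal LMs
    AVERAGE = "average"  # mean over the sequence dimension


def reduce_features(phi: torch.Tensor, reduction: FeatureReduction) -> torch.Tensor:
    """Reduce features of shape (batch, seq_len, hidden) to (batch, hidden)."""
    if phi.ndim != 3:
        raise ValueError(f"Expected a 3-D tensor, got shape {tuple(phi.shape)}")
    if reduction == FeatureReduction.FIRST:
        return phi[:, 0, :]
    if reduction == FeatureReduction.LAST:
        # Assumption: no right-padding, so the last position is a real token.
        return phi[:, -1, :]
    if reduction == FeatureReduction.AVERAGE:
        return phi.mean(dim=1)
    raise ValueError(f"Unknown reduction: {reduction!r}")


# Same shapes as in the printout above: batch 4, sequence length 9, hidden 768.
phi = torch.randn(4, 9, 768)
reduced = reduce_features(phi, FeatureReduction.AVERAGE)
print(reduced.shape)
```

With any of the three options the reduced features have shape (4, 768), which matches the input dimension of the (2, 768) last-layer weight.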

@wiseodd wiseodd added the enhancement New feature or request label Apr 25, 2024
@wiseodd wiseodd added this to the 0.2 milestone Apr 25, 2024
@wiseodd wiseodd self-assigned this Apr 25, 2024