Handle feature reduction properly in LLLA #169

wiseodd · 2024-04-25T16:32:16Z

Say, you have an LLM with a regression head on top. Then this code

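# f: model outputs; phi: features returned alongside them
f, phi = self.model.forward_with_features(x)
print(f.shape, phi.shape)
# Shapes of the last layer's parameters
for p in self.model.last_layer.parameters():
    print(p.shape)
input()  # pause to inspect the printed shapes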

outputs

torch.Size([4, 2]) torch.Size([4, 9, 768])
torch.Size([2, 768])

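There is a mismatch between the dimensions of the last layer's input and the last layer itself: the features phi have shape (4, 9, 768), with an extra dimension of size 9 (presumably the sequence length), while the last layer's weight has shape (2, 768) and the output f has shape (4, 2).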

The best solution seems to be letting the user specify which reduction they use. Common choices: first (the [CLS] token in BERT), last (in causal LMs), average (https://arxiv.org/abs/2402.05015). We can use an enum for this.
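
To make the proposal concrete, here is a minimal sketch of what such an enum and a matching reduction helper could look like. The names `FeatureReduction` and `reduce_features` are hypothetical, chosen for illustration only, and the sketch assumes features of shape (batch, seq_len, dim) as in the output above. It is not the implementation that ended up in the library.

```python
from enum import Enum

import torch


class FeatureReduction(str, Enum):
    """Hypothetical enum naming how per-token features are reduced."""

    FIRST = "first"      # e.g. the [CLS] token in BERT
    LAST = "last"        # e.g. the final token in causal LMs
    AVERAGE = "average"  # mean over the sequence dimension


def reduce_features(phi: torch.Tensor, reduction) -> torch.Tensor:
    """Reduce features of shape (batch, seq_len, dim) to (batch, dim).

    `reduction` may be a FeatureReduction member or its string value.
    """
    if phi.ndim != 3:
        raise ValueError(f"expected a 3D tensor, got shape {tuple(phi.shape)}")
    reduction = FeatureReduction(reduction)  # raises ValueError if unknown
    if reduction is FeatureReduction.FIRST:
        return phi[:, 0, :]
    if reduction is FeatureReduction.LAST:
        return phi[:, -1, :]
    return phi.mean(dim=1)  # FeatureReduction.AVERAGE


# Shapes taken from the output above; the linear head is a stand-in
# for the regression head, not the actual model's last layer.
phi = torch.randn(4, 9, 768)
last_layer = torch.nn.Linear(768, 2)
for r in FeatureReduction:
    reduced = reduce_features(phi, r)
    print(r.value, tuple(reduced.shape), tuple(last_layer(reduced).shape))
```

After any of the three reductions the features have shape (4, 768), which matches the (2, 768) weight of the last layer. Note that this sketch ignores padding: with right-padded batches, "last" and "average" would need the attention mask to pick the last real token or to average over real tokens only.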

wiseodd added the enhancement New feature or request label Apr 25, 2024

wiseodd added this to the 0.2 milestone Apr 25, 2024

wiseodd self-assigned this Apr 25, 2024

wiseodd mentioned this issue Apr 25, 2024

Add an option to reduce LLM features in LLLaplace #172

Merged

wiseodd closed this as completed in #172 Jun 11, 2024
