I am working here on a "llamacpp_HF" wrapper that allows llama.cpp to be treated like a transformers model, giving it access to the exact same samplers as models in that library. The text generation is currently functional but slow.
It works by intercepting the forward call inside the wrapper. I create a `self.cache` variable so that, based on the provided input ids, I can tell whether I need to call `reset()` on the llama.cpp model.
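The cache check could be sketched roughly as below. This is a minimal illustration, not the original wrapper code: the `reset()`/`eval()`/`scores` names mirror llama-cpp-python's `Llama` API, but treat them as assumptions here.

```python
import torch

class LlamacppHFSketch:
    """Hedged sketch: reuse the llama.cpp KV state when the new input
    ids extend the previously evaluated ones, otherwise reset."""

    def __init__(self, model):
        self.model = model  # assumed llama_cpp.Llama-like object
        self.cache = torch.tensor([[]], dtype=torch.long)

    def forward_step(self, input_ids):
        seq = input_ids[0].tolist()
        cached = self.cache[0].tolist()
        if seq[:len(cached)] == cached:
            new_tokens = seq[len(cached):]   # only evaluate the new suffix
        else:
            self.model.reset()               # context diverged: start over
            new_tokens = seq
        self.model.eval(new_tokens)
        self.cache = input_ids.clone()
        # logits for the last position only -> shape [1, 1, n_vocab]
        return torch.tensor(self.model.scores[len(seq) - 1]).view(1, 1, -1)
```

During ordinary generation the input grows by one token per step, so the prefix check keeps the expensive `reset()` + full re-evaluation for the cases where the context actually changed.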
The next step would be to use this wrapper for perplexity evaluation, which would make a direct comparison against transformers or AutoGPTQ possible. The problem is that the forward call returns a tensor with shape `torch.Size([1, 1, 32000])`, while my existing evaluation code, as well as a few alternative implementations I have tried, expects a tensor with shape `torch.Size([1, 1200, 32000])`, where the second dimension is the context size used for the evaluation.
Can anyone see an obvious solution to this?
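For context on why the evaluation expects the larger shape: perplexity is computed from the model's prediction at every position, not just the last one, so the forward call must return logits of shape `[1, seq_len, vocab]`. A minimal sketch of that computation (my illustration, not the evaluation code from the post; one possible direction on the llama.cpp side would be llama-cpp-python's `logits_all` option, which keeps `scores` for every position, though treat that flag as an assumption):

```python
import torch
import torch.nn.functional as F

def perplexity(logits, input_ids):
    """logits: [1, seq_len, vocab]; the prediction at position t
    is scored against the token at position t + 1."""
    shift_logits = logits[:, :-1, :]        # drop the last prediction
    shift_labels = input_ids[:, 1:]         # drop the first token
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
    return torch.exp(loss)
```

A tensor of shape `[1, 1, 32000]` only carries the last position's logits, so every term of this loss except the final one is missing, which is why the existing evaluation code cannot use it directly.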