I have been reading about context length and how it affects the performance of transformer-based large language models. My understanding is that the context length is the maximum number of tokens the current token can attend to during attention, which means the KV cache has a maximum size equal to the context length. However, when I set the context length to 128, llama.cpp segfaults in the decode stage. I am trying to understand why this happens. Theoretically, the context length should not cause any issues, so perhaps it is related to how the model was trained or something at the architectural level. Any articles or explanations of why this is happening would be greatly appreciated. I believe the error occurs in llama_decode_internal, around line 17095 (as of when I cloned the repo).
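For reference, here is a minimal sketch of how I understand a 128-token context would be configured through the public API (the function and field names below are the common ones from llama.h, but exact signatures vary between llama.cpp versions, and the model path is just a placeholder):

```cpp
// Sketch only: create a context with n_ctx = 128, which should also cap
// the KV cache at 128 token slots. API names may differ by version.
#include "llama.h"

int main() {
    llama_model_params mparams = llama_model_default_params();
    // "model.gguf" is a placeholder path
    llama_model * model = llama_load_model_from_file("model.gguf", mparams);
    if (!model) return 1;

    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx   = 128; // context length == maximum number of KV-cache slots
    cparams.n_batch = 128; // tokens per llama_decode call should not exceed this

    llama_context * ctx = llama_new_context_with_model(model, cparams);
    if (!ctx) return 1;

    // ... build a llama_batch and call llama_decode(ctx, batch) here ...
    // My expectation: once more than 128 tokens have been submitted, llama_decode
    // should return a non-zero "KV cache full" error rather than segfault.

    llama_free(ctx);
    llama_free_model(model);
    return 0;
}
```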