-
llama.cpp does support Multi Query Attention, but the main code does not yet work with ALiBi.
-
There is a $2000 bounty for CPU inference support for Refact LLM: smallcloudai/refact#77
-
Tracking issue: #3061
-
Reddit announcement: https://www.reddit.com/r/LocalLLaMA/comments/169yonh/we_trained_a_new_16b_parameters_code_model_that/
Blog: https://refact.ai/blog/2023/introducing-refact-code-llm/
Code: https://github.com/smallcloudai/refact/
Model: https://huggingface.co/smallcloudai/Refact-1_6B-fim
Do I understand correctly that this model cannot yet be used in llama.cpp because Multi Query Attention is not supported?
Is this the only blocker?