This repository has been archived by the owner on Dec 6, 2024. It is now read-only.
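In my experience so far, Qwen is one of the best LLMs available in China. Before qwen.cpp, running it directly through HF transformers gave limited results: token generation was slow, and the last few tokens in particular were extremely slow. Once qwen.cpp came out, it was a huge help, letting Qwen models (especially those of 14B and above) reach their full potential. In my own testing it has been more usable than other Chinese LLMs.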
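qwen.cpp also has some bugs, which people have mentioned in the issues. Unfortunately, it seems the team currently has no plans to keep updating qwen.cpp?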
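Since qwen.cpp was merged into llama.cpp, there has been no good Python binding. Adapting llama-cpp-python gives limited results, and with the same prompt, especially with long contexts, that binding's output is very poor and cannot compare with qwen.cpp. I have given up on it for now.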
Without a good binding, people's use of Qwen models will suffer. I hope the team will consider continuing to support qwen.cpp!