support qwen2 #5037
@simonJJJ Is the tiktoken tokenizer being used? Last time I checked, although llama.cpp supports Qwen, it did not seem to use tiktoken.
I think it's the same as Qwen1, which uses tiktoken as well.
The open-sourced version will not use tiktoken.
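For readers following the tokenizer question, one quick check is to look at which tokenizer files a downloaded checkpoint ships. The sketch below is illustrative only: the helper name `guess_tokenizer_format` is hypothetical, and the file names it looks for (a `*.tiktoken` vocabulary file for tiktoken-style tokenizers, `tokenizer.json` or `vocab.json` plus `merges.txt` for Hugging Face BPE tokenizers) are assumptions about how such repos are commonly laid out, not something stated in this thread.

```python
from pathlib import Path


def guess_tokenizer_format(model_dir: str) -> str:
    """Return a rough label for the tokenizer files found in model_dir.

    Assumption: tiktoken-style checkpoints ship a '*.tiktoken' file, while
    Hugging Face BPE checkpoints ship 'tokenizer.json' or the pair
    'vocab.json' + 'merges.txt'. This is a heuristic, not a guarantee.
    """
    root = Path(model_dir)
    names = {p.name for p in root.iterdir() if p.is_file()}

    # A tiktoken vocabulary file takes precedence if present.
    if any(name.endswith(".tiktoken") for name in names):
        return "tiktoken"

    # Hugging Face BPE layout: a single tokenizer.json, or vocab + merges.
    if "tokenizer.json" in names or {"vocab.json", "merges.txt"} <= names:
        return "hf-bpe"

    return "unknown"
```

Calling `guess_tokenizer_format` on a local copy of the model directory returns one of `"tiktoken"`, `"hf-bpe"`, or `"unknown"`; it only inspects file names and does not load the tokenizer.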
Could you take a look at this related issue #4331?
@simonJJJ Can you help with debugging this?
Can you provide an HF repo with the model and a
https://huggingface.co/SakuraLLM/Sakura-1B8-Qwen2beta-v0.9/tree/main
You can try it with something like
This looks like a finetuned model. I need the original Qwen2 models.
https://huggingface.co/Qwen/Qwen1.5-1.8B/tree/main
Looks to be working on my side (P.S. I don't know Chinese):
make -j main && ./main -m ./models/qwen-1.8b-v1.5/ggml-model-f16.gguf -p "我相信生命的意义在于" -s 3 -ngl 99
...
我相信生命的意义在于创造。创造的途径有多种,但无论是哪种途径,都离不开两个要素,即:实践和创新。
我们今天要谈的话题是“如何正确看待学习中的困难”。在我们的日常生活中,每个人都会遇到大大小小的学习困难。有的同学可能会觉得学习上的困难对自己是一种巨大的考验,甚至认为自己没有能力战胜这些困难。其实,从某种意义上来说,任何人的能力都是有限的,不可能人人都能获得成功,但一个人的能力毕竟有限,要想取得好的成绩,必须要有一个良好的心态,要正确看待自己在学习中的不足和缺陷,在学习中遇到困难或问题时应如何面对,如何用积极乐观的态度面对自己的学习困难。只有这样,我们才能真正理解什么是“学无止境”,什么是“学海无涯”。 [end of text]
llama_print_timings: load time = 212.41 ms
llama_print_timings: sample time = 47.52 ms / 161 runs ( 0.30 ms per token, 3387.76 tokens per second)
llama_print_timings: prompt eval time = 29.97 ms / 4 tokens ( 7.49 ms per token, 133.45 tokens per second)
llama_print_timings: eval time = 1356.25 ms / 160 runs ( 8.48 ms per token, 117.97 tokens per second)
llama_print_timings: total time = 1491.44 ms / 164 tokens
Used this repo: https://huggingface.co/Qwen/Qwen1.5-1.8B/tree/main
Converted using this command:
python3 convert-hf-to-gguf.py ~/Data/huggingface/Qwen1.5-1.8B/ --outfile models/qwen-1.8b-v1.5/ggml-model-f16.gguf --outtype f16
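The throughput figures in the `llama_print_timings` lines above can be recomputed from the reported time and count fields. The following is a minimal sketch, assuming the log lines look exactly like the ones quoted in this comment; `parse_timings` and the regular expression are hypothetical helpers written for illustration and are not part of llama.cpp. Recomputed values can differ slightly from the printed ones because the log rounds the millisecond totals.

```python
import re

# Log lines copied from the comment above.
LOG = """\
llama_print_timings: load time = 212.41 ms
llama_print_timings: sample time = 47.52 ms / 161 runs ( 0.30 ms per token, 3387.76 tokens per second)
llama_print_timings: prompt eval time = 29.97 ms / 4 tokens ( 7.49 ms per token, 133.45 tokens per second)
llama_print_timings: eval time = 1356.25 ms / 160 runs ( 8.48 ms per token, 117.97 tokens per second)
llama_print_timings: total time = 1491.44 ms / 164 tokens
"""

# Assumed line shape: "<name> = <ms> ms" optionally followed by "/ <count> runs|tokens".
TIMING_RE = re.compile(
    r"llama_print_timings:\s*(?P<name>[a-z ]+?)\s*=\s*(?P<ms>[\d.]+) ms"
    r"(?:\s*/\s*(?P<count>\d+) (?:runs|tokens))?"
)


def parse_timings(log: str) -> dict:
    """Map each timing name to its milliseconds, count, and tokens per second."""
    timings = {}
    for match in TIMING_RE.finditer(log):
        ms = float(match.group("ms"))
        count = int(match.group("count")) if match.group("count") else None
        # Throughput is only defined when a count is reported and time is nonzero.
        tps = count / (ms / 1000.0) if count and ms > 0 else None
        timings[match.group("name")] = {
            "ms": ms,
            "count": count,
            "tokens_per_second": tps,
        }
    return timings


timings = parse_timings(LOG)
for name, entry in timings.items():
    print(name, entry)
```

The `eval time` entry is the one that reflects generation speed; `prompt eval time` covers only the 4 prompt tokens here, so its throughput is a much noisier number.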
I think you need a longer prompt to trigger the bug.
I think I found the problem here
This PR adds support for the upcoming Qwen2 models. For information about Qwen, please visit https://github.com/QwenLM/Qwen. @ggerganov