feat: add support fastchat http bindings #421
leiwen83
commented
Sep 10, 2023
- feat: add support fastchat http bindings
49c7ea7 to 18e1909
Thanks for the PR, a few comments
18e1909 to 4075cd9
A few nits
LGTM
Hi @leiwen83 - Could you fix the CI checks? Once they are fixed, the PR should be merged automatically.
Head branch was pushed to by a user without write access
1c730dc to a1907c1
* feat: add support fastchat http bindings Signed-off-by: Lei Wen <wenlei03@qiyi.com>
a1907c1 to bd5d309
CI is fixed.
Hey @leiwen83, are you successfully using FastChat with Tabby? If so, what model(s) are you using? I am trying to get it to work and having some issues. Any advice you can lend is much appreciated.
Hi, the response format has changed in upstream FastChat, so a patch (#670) is needed to work with the current upstream FastChat. You may want to try that fix first.
Thank you both for your help to make this work! I will pull the latest and give this a try this evening.
So far the response from FastChat has been too slow for Tabby. It seems to be due to flooding the API with requests. We may need some debounce logic in the http device and/or request cancellation (if possible). I'm running wizard 3b on an A770 with 16 GB VRAM. Here is the compose file I am using with a local Tabby container built from the latest code on the main branch:
Yes, I have also noticed this slowness. Do you have any ideas for the debounce logic?
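For reference, the debounce idea raised above could look roughly like the following. This is only a minimal sketch using the standard library: `Debouncer`, `ticket`, `is_latest`, and `run` are hypothetical names and are not part of Tabby or FastChat. Each incoming completion request takes a ticket, waits for a short quiet period, and is forwarded to the backend only if no newer request arrived in the meantime.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper (not a Tabby API): hands out increasing tickets so a
/// request can tell whether a newer one arrived while it was waiting.
#[derive(Clone, Default)]
pub struct Debouncer {
    latest: Arc<AtomicU64>,
}

impl Debouncer {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a new request and return its ticket number (starting at 1).
    pub fn ticket(&self) -> u64 {
        // fetch_add returns the previous value, so add 1 to get our own.
        self.latest.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// True if no newer request has been registered since `ticket` was issued.
    pub fn is_latest(&self, ticket: u64) -> bool {
        self.latest.load(Ordering::SeqCst) == ticket
    }

    /// Wait for the quiet period, then call `send` only if this request is
    /// still the newest one; stale requests return `None` without sending.
    pub fn run<T>(&self, delay: Duration, send: impl FnOnce() -> T) -> Option<T> {
        let ticket = self.ticket();
        thread::sleep(delay);
        if self.is_latest(ticket) {
            Some(send())
        } else {
            None
        }
    }
}
```

A caller would share one `Debouncer` (it is cheap to clone) across request handlers and wrap the outgoing HTTP call in `run`. In an async server the blocking `thread::sleep` would presumably be replaced by the runtime's own timer. Note that this only drops stale requests before they are sent; it does not cancel a request that is already in flight.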
I worked on it for a while last night, but it doesn't seem to be working. I have never developed in Rust, so I am not really sure if this is even on the right track, but here is my attempt: