[BUG] The health status of chat api server cannot be queried when the chat api server is generating responses. #29

Open
tjtanaa opened this issue Sep 4, 2024 · 1 comment
Labels
type: bug Something isn't working

Comments


tjtanaa commented Sep 4, 2024

Describe the bug

The health status of the chat API server cannot be queried while the server is generating responses.

@tjtanaa tjtanaa added the type: bug Something isn't working label Sep 4, 2024
@szeyu szeyu closed this as completed Sep 5, 2024
@szeyu szeyu reopened this Sep 5, 2024

tjtanaa commented Sep 25, 2024

The cause has been identified:
The server is launched with a single worker, so while the model is streaming or generating results, that worker is occupied. Any incoming request, including a health check, is queued until the one and only worker is free.
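
To illustrate the behaviour described above, here is a minimal, standard-library-only sketch. It is not the project's server code: it assumes the single worker runs an asyncio event loop, and the names `generate_blocking`, `chat`, `health`, and `health_latency` are hypothetical stand-ins. It measures how long a health check waits while one generation is in progress, first with the blocking work running directly on the worker, then with the work offloaded to a thread.

```python
import asyncio
import time


def generate_blocking(duration: float) -> str:
    """Hypothetical stand-in for model generation: blocks the calling thread."""
    time.sleep(duration)
    return "generated text"


async def health() -> str:
    """Hypothetical stand-in for the health-check handler."""
    return "ok"


async def chat(offload: bool, duration: float) -> str:
    """Hypothetical stand-in for the chat handler."""
    if offload:
        # Run the blocking work in a thread so the event loop stays free.
        return await asyncio.to_thread(generate_blocking, duration)
    # Blocking call directly on the event loop: nothing else can run meanwhile.
    return generate_blocking(duration)


async def health_latency(offload: bool, duration: float = 0.3) -> float:
    """Return how long a health check waits while one generation is running."""
    chat_task = asyncio.create_task(chat(offload, duration))
    start = time.perf_counter()  # the health request "arrives" here
    await asyncio.create_task(health())
    elapsed = time.perf_counter() - start
    await chat_task
    return elapsed


if __name__ == "__main__":
    blocked = asyncio.run(health_latency(offload=False))
    free = asyncio.run(health_latency(offload=True))
    print(f"blocking handler:  health check waited {blocked:.3f}s")
    print(f"offloaded handler: health check waited {free:.3f}s")
```

In the first case the health check waits for roughly the whole generation; in the second it returns almost immediately. Whether offloading is feasible here depends on how the model is invoked. Running more than one worker is another option, though each worker may then need its own copy of the model.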

More suggestions are welcome; please reply to this comment.
