# Constant timeouts after multiple calls with async #769
Hi @Inkorak, I can't reproduce the issue you're seeing. Can you share a code snippet? This snippet passes for me:

```python
import anyio
from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    for _ in range(10):
        await client.embeddings.create(input="Hello world!", model="text-embedding-ada-002")


anyio.run(main)
```
I can confirm this issue is affecting us as well. We recently upgraded from 0.28 to 1.2.3, and 12 hours later the timeouts began.
@ashwinsr can you share any more details? Is this only happening when the client has been in use for a prolonged period of time?
I'm trying really hard to build a minimal failing example, but I haven't gotten one yet. Basically we have a FastAPI server that uses the async OpenAI client with streaming responses (roughly the shape sketched below). After a while of running, the vast majority of calls to `await client.chat.completions.create` give us timeouts. We are currently on 1.2.3. Any suggestions on what we can do to troubleshoot this / help you fix it? This is a P0 for us right now.
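For context, a minimal sketch of the kind of setup we're running (the endpoint, model, and handler names here are illustrative, not our real code):

```python
# Hypothetical reduction of our setup: FastAPI endpoint streaming from AsyncOpenAI.
import fastapi
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = fastapi.FastAPI()
client = AsyncOpenAI()  # one client shared across all requests


@app.get("/chat")
async def chat(prompt: str) -> StreamingResponse:
    async def token_stream():
        stream = await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        async for chunk in stream:
            delta = chunk.choices[0].delta.content if chunk.choices else None
            if delta:
                yield delta

    return StreamingResponse(token_stream(), media_type="text/plain")
```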
Are you seeing connection pool timeouts or is it a request timeout?
We are seeing pool timeouts and some request timeouts. Give me a second and I'll pull some more specific logs for you.
Okay, there was a bug reported recently with streaming responses not being closed correctly. But I did manage to reproduce that and push a fix, so I'm surprised you're still seeing connection pool timeouts: #763. Do you have a lot of concurrent requests happening at once?
@RobertCraigie We are seeing
Thoughts?
Not that many concurrent requests (think <20 at a time)
Here's one traceback:

```
File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1299, in _request
```
Okay thanks, do you have debug logging enabled? If you could share debug logs for
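For reference, enabling debug logging looks roughly like this (the `OPENAI_LOG` environment variable is what the README documents; the stdlib `logging` config is a common alternative that also surfaces the httpx/httpcore logs):

```python
# Option 1 (documented): set the environment variable before starting the app:
#   export OPENAI_LOG=debug

# Option 2: configure stdlib logging directly, including the HTTP layer.
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("httpcore").setLevel(logging.DEBUG)
```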
Also some regular timeout errors:

```
File "open_ai.py", line 111, in get_function_chat_completion
```
@RobertCraigie unfortunately we don't have debug logging enabled already, and turning it on now might not help much because we're likely going to have to downgrade to the old API version until we can get this figured out (we can't just wait for our production traffic to fail...)
@ashwinsr okay, no worries. I would suggest explicitly closing stream responses if you can before downgrading (see the issue linked earlier for an example, and the sketch below). I'll try to figure out what's happening.
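Something like this for streams; a rough sketch, assuming the stream object exposes the underlying httpx response as `.response` (an internal detail, so treat it as a temporary workaround rather than a stable API):

```python
# Sketch: ensure a streamed response is closed even if iteration stops
# early or raises, so its connection is returned to the pool.
import anyio
from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    try:
        async for chunk in stream:
            ...  # consume chunks as usual
    finally:
        await stream.response.aclose()  # assumed internal attribute; explicit close


anyio.run(main)
```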
Got it, I'll do that. What would the code snippet be to close the connection for your embedding example at the top of the page?
Unfortunately you'd likely have to update your code to use raw responses: https://github.com/openai/openai-python?tab=readme-ov-file#accessing-raw-response-data-eg-headers

I would be very surprised if standard requests are a cause of this issue, and it would help narrow this down if you left them as-is for now, but I totally understand if you'd rather explicitly close responses there as well.

Also, just to be clear: you definitely shouldn't have to explicitly close responses. I only suggested it as a temporary workaround so you don't have to downgrade.
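For the embeddings example, the raw-response pattern would look roughly like this (a sketch; `.http_response` as the underlying httpx response is an assumption about the current internals):

```python
# Sketch: use the raw-response wrapper so the HTTP response can be
# closed explicitly once the parsed data has been extracted.
import anyio
from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    response = await client.embeddings.with_raw_response.create(
        input="Hello world!",
        model="text-embedding-ada-002",
    )
    embeddings = response.parse()  # the parsed object you'd normally get back
    await response.http_response.aclose()  # assumed attribute; explicit close


anyio.run(main)
```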
Alright @RobertCraigie, we turned on debug logging and left everything else as-is. I'll update here with the next set of failed logs to see if we can find the root cause.
Same situation.
Running into the same issue; we had to swap the library out for direct calls to the OpenAI API with aiohttp.
Thank you all for the additional confirmations. Can you share any more details about your setup? The following would be most helpful:
If anyone can share a reproduction that would also be incredibly helpful. |
Also any examples of code using the |
Additionally, we did recently fix a bug related to this so please ensure you're on the latest version! |
Update: I have been able to reproduce the timeouts. This issue may be related: encode/httpx#1171. The underlying error I get is this:
I've pushed a fix for the
A fix has been released in
Why are you involving threads there if you're already using async? |
This is from a large monolith application where async functions are used for the I/O tasks, while CPU-heavy workloads that the async tasks don't need to wait on are sent out to a new thread (roughly the pattern sketched below). The same occurs if we use asyncio's run_in_executor(). With OpenAI 0.28.1 this worked perfectly before.
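A simplified sketch of that pattern (the class and helper names from our codebase are omitted; the point is one shared async client used from event loops in different threads):

```python
# Sketch of the problematic pattern: one AsyncOpenAI instance shared by
# event loops running in separate threads.
import asyncio
import threading

from openai import AsyncOpenAI

shared_client = AsyncOpenAI()  # single instance shared across threads


async def do_completion(prompt: str) -> None:
    await shared_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )


def worker(prompt: str) -> None:
    # Each thread runs its own event loop, but all of them reuse the same
    # client and therefore the same connection pool.
    asyncio.run(do_completion(prompt))


threads = [threading.Thread(target=worker, args=(f"task {i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```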
The biggest problem with
Yeah, you are right, @agronholm. The main problem in our case is not in the library itself; it just used to work before. But I think others in this thread may have made the same mistake, so it would be nice for them to check whether the same conditions apply.

In the example, a separate event loop is created for each thread, and these functions are executed on separate threads using the AsyncThreadingHelper class. If both threads try to access the same instance of OpenAIAdapter concurrently, it can result in race conditions. That's the problem in our case, and it can be solved either by using a separate instance of OpenAIAdapter for each thread or by creating a thread-safe mechanism to synchronize access to the shared resource, as you mentioned. A sketch of the per-thread-client fix is below.

@Inkorak, if you are referring to this library https://github.com/run-llama/llama_index, there might be the same problem: it uses async and threading at the same time. (I didn't dive deep into the code, but it's worth checking.)
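Something like this instead (a sketch of the per-thread-client fix; that `client.close()` releases the pool is my assumption about how the async client behaves):

```python
# Sketch of the fix: give each thread (each event loop) its own client
# instead of sharing one instance across loops.
import asyncio
import threading

from openai import AsyncOpenAI


async def do_completion(prompt: str) -> None:
    client = AsyncOpenAI()  # created inside the loop that uses it
    try:
        await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
    finally:
        await client.close()  # release this loop's connection pool


def worker(prompt: str) -> None:
    asyncio.run(do_completion(prompt))


for i in range(2):
    threading.Thread(target=worker, args=(f"task {i}",)).start()
```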
@makaralaszlo does your code work if you use the synchronous
The fix in v1.2.3 only works when no exception is raised.
And upgrade the
With what, exactly?
It seems some bug was fixed in anyio 4.0.0.
I used But, I got different test results! I set the parallelism to 20. Model Even though I set the parallelism to 50, Model My guess is it's a problem with Model
By the way, I tested with the SyncClient.
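For reference, the rough shape of the load test (a sketch; the model name, request count, and prompt are placeholders for the details elided above):

```python
# Rough sketch of the parallel load test using the sync client.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI()
PARALLELISM = 20  # also tried 50


def one_request(i: int) -> str | None:
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": f"request {i}"}],
    )
    return completion.choices[0].message.content


with ThreadPoolExecutor(max_workers=PARALLELISM) as pool:
    results = list(pool.map(one_request, range(100)))
```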
Unless I'm badly mistaken, none of the fixes in v4.0.0 have any bearing on this particular issue. As for @makaralaszlo's example, they were having trouble because they were using the async client incorrectly, and upgrading AnyIO will not change that.
You're right. I can reproduce this bug in anyio 4.0.0. |
Did retrying the request fix it for anyone? I'm running a load test and seeing this issue.
Same question. |
I'm pleased to report this bug has been fixed in the API! Connections should no longer time out while waiting to send response headers. 🎉 Anyone who has downgraded to v0.28.1 for this reason should be able to upgrade back to the latest version. |
Is it this commit? 7aad340 |
@antont no, this bug was an API-level issue, and the OpenAI team managed to figure out the underlying cause in their server. Users reporting that downgrading the SDK version fixed the issue was a red herring; we were able to reproduce the issue in a myriad of different situations: using aiohttp (what the v0 SDK uses), using anyio directly instead of httpx, using separate languages like Rust and Node.js, etc.

That commit does fix a separate bug where, if we retried requests, we never closed the original request, which leads to a memory leak and eventually makes the client unusable as the connection limit is reached.
@RobertCraigie, I understand the error occurred on OpenAI's servers. Does that mean I have to contact Azure to have the same fix applied on their servers? I have a GPT deployment on my Azure subscription.
Hello, I'm experiencing this strange behavior when I use
It appears when I call the
I've just upgraded all the packages I have, so there should be no available fix that I'm missing. Is it somehow related to the problem in this thread? Thank you,
@matteo-giacomazzi that sounds like an unrelated issue and should be addressed separately (though my suspicion is that you may be closing the client, or the response, unintentionally). |
I think you're right: the problem appears only when I use the API together with MattermostDriver (the idea is to have the bot accessible via Mattermost), so I guess the source of the problem comes from there, as I'm unable to reproduce the behavior in a dedicated process that doesn't use any other asyncio features. Thank you!
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
Constant timeouts after multiple asynchronous calls. This was discovered when using the Llama_Index framework: when calls are made to this library through the openai-python async client, constant timeouts begin. Without async, or with async on an older version such as 0.28, there are no problems.
To Reproduce
Make several calls in a row, for example to embeddings, wrapped with async.
Code snippets
No response
OS
ubuntu
Python version
Python 3.11.4
Library version
v1.2.0 and newer