# Constant timeouts after multiple calls with async #769
Hi @Inkorak, I can't reproduce the issue you're seeing. Can you share a code snippet? This snippet passes for me:

```python
import anyio
from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    for _ in range(10):
        await client.embeddings.create(input="Hello world!", model="text-embedding-ada-002")


anyio.run(main)
```
I can confirm this issue is affecting us as well. We recently upgraded from 0.28 to 1.2.3, and 12 hours later the timeouts began.
@ashwinsr can you share any more details? Is this only happening when the client has been in use for a prolonged period of time?
I'm trying really hard to build a minimal failing example, but I haven't gotten one yet. Basically we have a FastAPI server that uses the async OpenAI client with streaming responses (roughly the shape sketched below). After a while of running, the vast majority of calls to `await client.chat.completions.create` give us timeouts. We are currently on 1.2.3. Any suggestions on what we can do to troubleshoot this / help you fix it? This is a P0 for us right now.
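For context, a minimal sketch of the kind of setup we're running (the endpoint, model, and handler names here are illustrative, not our real code):

```python
# Hypothetical reduction of our setup: FastAPI endpoint streaming from AsyncOpenAI.
import fastapi
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = fastapi.FastAPI()
client = AsyncOpenAI()  # one client shared across all requests


@app.get("/chat")
async def chat(prompt: str) -> StreamingResponse:
    async def token_stream():
        stream = await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        async for chunk in stream:
            delta = chunk.choices[0].delta.content if chunk.choices else None
            if delta:
                yield delta

    return StreamingResponse(token_stream(), media_type="text/plain")
```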
Are you seeing connection pool timeouts or is it a request timeout?
We are seeing pool timeouts and some request timeouts. Give me a second and I'll pull some more specific logs for you.
Okay, there was a bug reported recently with streaming responses not being closed correctly. But I did manage to reproduce that and push a fix, so I'm surprised you're still seeing connection pool timeouts: #763. Do you have a lot of concurrent requests happening at once?
@RobertCraigie We are seeing
Thoughts?
Not that many concurrent requests (think <20 at a time)
Here's one traceback:

```
File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1299, in _request
```
Okay thanks, do you have debug logging enabled? If you could share debug logs for
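For reference, enabling debug logging looks roughly like this (the `OPENAI_LOG` environment variable is what the README documents; the stdlib `logging` config is a common alternative that also surfaces the httpx/httpcore logs):

```python
# Option 1 (documented): set the environment variable before starting the app:
#   export OPENAI_LOG=debug

# Option 2: configure stdlib logging directly, including the HTTP layer.
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("httpcore").setLevel(logging.DEBUG)
```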
Also some regular timeout errors:

```
File "open_ai.py", line 111, in get_function_chat_completion
```
@RobertCraigie unfortunately we don't have debug logging enabled already, and turning it on now might not help much because we're likely going to have to downgrade to the old API version until we can get this figured out (we can't just wait for our production traffic to fail...)
@ashwinsr okay, no worries. I would suggest explicitly closing stream responses if you can before downgrading (see the issue linked earlier for an example, and the sketch below). I'll try to figure out what's happening.
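Something like this for streams; a rough sketch, assuming the stream object exposes the underlying httpx response as `.response` (an internal detail, so treat it as a temporary workaround rather than a stable API):

```python
# Sketch: ensure a streamed response is closed even if iteration stops
# early or raises, so its connection is returned to the pool.
import anyio
from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    try:
        async for chunk in stream:
            ...  # consume chunks as usual
    finally:
        await stream.response.aclose()  # assumed internal attribute; explicit close


anyio.run(main)
```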
Got it, I'll do that. What would the code snippet be to close the connection for your embedding example at the top of the page?
Unfortunately you'd likely have to update your code to use raw responses: https://github.com/openai/openai-python?tab=readme-ov-file#accessing-raw-response-data-eg-headers

I would be very surprised if standard requests are a cause of this issue, and it would help narrow this down if you left them as-is for now, but I totally understand if you'd rather explicitly close responses there as well.

Also, just to be clear: you definitely shouldn't have to explicitly close responses. I only suggested it as a temporary workaround so you don't have to downgrade.
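For the embeddings example, the raw-response pattern would look roughly like this (a sketch; `.http_response` as the underlying httpx response is an assumption about the current internals):

```python
# Sketch: use the raw-response wrapper so the HTTP response can be
# closed explicitly once the parsed data has been extracted.
import anyio
from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    response = await client.embeddings.with_raw_response.create(
        input="Hello world!",
        model="text-embedding-ada-002",
    )
    embeddings = response.parse()  # the parsed object you'd normally get back
    await response.http_response.aclose()  # assumed attribute; explicit close


anyio.run(main)
```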
Alright @RobertCraigie, we turned on debug logging and left everything else as-is. I'll update here with the next set of failed logs to see if we can find the root cause.
Same situation.
Running into the same issue; we had to swap the library out for direct calls to the OpenAI API with aiohttp.
Thank you all for the additional confirmations. Can you share any more details about your setup? The following would be most helpful:
If anyone can share a reproduction that would also be incredibly helpful. |
Also any examples of code using the |
Additionally, we did recently fix a bug related to this so please ensure you're on the latest version! |
Update: I have been able to reproduce the timeouts. This issue may be related: encode/httpx#1171. The underlying error I get is this:
I've pushed a fix for the
A fix has been released in
Why are you involving threads there if you're already using async? |
This is from a large monolith application where async functions are used for the I/O tasks, while CPU-heavy workloads that the async tasks don't need to wait on are sent out to a new thread (roughly the pattern sketched below). The same occurs if we use asyncio's run_in_executor(). With OpenAI 0.28.1 this worked perfectly before.
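A simplified sketch of that pattern (the class and helper names from our codebase are omitted; the point is one shared async client used from event loops in different threads):

```python
# Sketch of the problematic pattern: one AsyncOpenAI instance shared by
# event loops running in separate threads.
import asyncio
import threading

from openai import AsyncOpenAI

shared_client = AsyncOpenAI()  # single instance shared across threads


async def do_completion(prompt: str) -> None:
    await shared_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )


def worker(prompt: str) -> None:
    # Each thread runs its own event loop, but all of them reuse the same
    # client and therefore the same connection pool.
    asyncio.run(do_completion(prompt))


threads = [threading.Thread(target=worker, args=(f"task {i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```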
The biggest problem with
Yeah, you are right, @agronholm. The main problem in our case is not in the library itself; it just used to work before. But I think others in this thread may have made the same mistake, so it would be nice for them to check whether the same conditions apply.

In the example, a separate event loop is created for each thread, and these functions are executed on separate threads using the AsyncThreadingHelper class. If both threads try to access the same instance of OpenAIAdapter concurrently, it can result in race conditions. That's the problem in our case, and it can be solved either by using a separate instance of OpenAIAdapter for each thread or by creating a thread-safe mechanism to synchronize access to the shared resource, as you mentioned. A sketch of the per-thread-client fix is below.

@Inkorak, if you are referring to this library https://github.com/run-llama/llama_index, there might be the same problem: it uses async and threading at the same time. (I didn't dive deep into the code, but it's worth checking.)
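Something like this instead (a sketch of the per-thread-client fix; that `client.close()` releases the pool is my assumption about how the async client behaves):

```python
# Sketch of the fix: give each thread (each event loop) its own client
# instead of sharing one instance across loops.
import asyncio
import threading

from openai import AsyncOpenAI


async def do_completion(prompt: str) -> None:
    client = AsyncOpenAI()  # created inside the loop that uses it
    try:
        await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
    finally:
        await client.close()  # release this loop's connection pool


def worker(prompt: str) -> None:
    asyncio.run(do_completion(prompt))


for i in range(2):
    threading.Thread(target=worker, args=(f"task {i}",)).start()
```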
@makaralaszlo does your code work if you use the synchronous
The fix in v1.2.3 only works when no exception is raised.
And upgrade the
With what, exactly?
It seems some bug was fixed in anyio 4.0.0.
I used But, I got different test results! I set the parallelism to 20. Model Even though I set the parallelism to 50, Model My guess is it's a problem with Model
By the way, I tested with the SyncClient.
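For reference, the rough shape of the load test (a sketch; the model name, request count, and prompt are placeholders for the details elided above):

```python
# Rough sketch of the parallel load test using the sync client.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI()
PARALLELISM = 20  # also tried 50


def one_request(i: int) -> str | None:
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": f"request {i}"}],
    )
    return completion.choices[0].message.content


with ThreadPoolExecutor(max_workers=PARALLELISM) as pool:
    results = list(pool.map(one_request, range(100)))
```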
Unless I'm badly mistaken, none of the fixes in v4.0.0 have any bearing on this particular issue. As for @makaralaszlo's example, they were having trouble because they were using the async client incorrectly, and upgrading AnyIO will not change that.
You're right. I can reproduce this bug in anyio 4.0.0. |
Did retrying the request fix it for anyone? I'm running a load test and seeing this issue.
Same question. |
I'm pleased to report this bug has been fixed in the API! Connections should no longer time out while waiting to send response headers. 🎉 Anyone who has downgraded to v0.28.1 for this reason should be able to upgrade back to the latest version. |
Is it this commit? 7aad340 |
@antont no, this bug was an API-level issue, and the OpenAI team managed to figure out the underlying cause in their server. Users reporting that downgrading the SDK version fixed the issue was a red herring; we were able to reproduce the issue in a myriad of different situations: using aiohttp (what the v0 SDK uses), using anyio directly instead of httpx, using separate languages like Rust and Node.js, etc.

That commit does fix a separate bug where, if we retried requests, we never closed the original request, which leads to a memory leak and eventually makes the client unusable as the connection limit is reached.
@RobertCraigie, I understand the error occurred on OpenAI's servers. Does that mean I have to contact Azure to have the same fix applied on their servers? I have a GPT deployment on my Azure subscription.
Hello, I'm experiencing this strange behavior when I use
It appears when I call the
I've just upgraded all the packages I have, so there should be no available fix that I'm missing. Is it somehow related to the problem in this thread? Thank you,
@matteo-giacomazzi that sounds like an unrelated issue and should be addressed separately (though my suspicion is that you may be closing the client, or the response, unintentionally). |
I think you're right: the problem appears only when I use the API together with MattermostDriver (the idea is to have the bot accessible via Mattermost), so I guess the source of the problem comes from there, as I'm unable to reproduce the behavior in a dedicated process that doesn't use any other asyncio features. Thank you!
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
Constant timeouts after multiple asynchronous calls. This was discovered when using the Llama_Index framework: when calls are made to this library through the openai-python async client, constant timeouts begin. Without async, or with async on an older version such as 0.28, there are no problems.
To Reproduce
Make several calls in a row, for example to embeddings, wrapped with async.
Code snippets
No response
OS
ubuntu
Python version
Python 3.11.4
Library version
v1.2.0 and newer