I've been debugging an issue in my app that was causing severe slowdowns in a django-ninja based API that exposes django-storages S3 URLs for various fields.
Basically, `._connections.connection` is being recreated quite frequently. This is very slow because initializing a boto3 connection "does things" with a ton of different AWS services. Optimizing boto3 is out of my depth, but I found that the real culprit is the threading.local() objects that hold the connection in storages/backends/s3.py.
I did some testing, and making these threading.local() objects thread-unsafe removes the issue. I also wondered whether this is related to the setstate/getstate added in commit 82e18aa, but I couldn't confirm it either way.
To be clear, the connection is being reset multiple times per request. I'm not sure why this wasn't caught before. I'm using Django 5.2; maybe Django's underlying threading structure changed in recent years with the addition of async support, and it's only surfacing now. Either way, it's massively slowing down some queries.
I've managed to reproduce egregious instances of it with, e.g., an OrganizationMember model that has an avatar, displaying 10 org members in a django-ninja API. This makes the class instantiate 10 times, and each instantiation took ~500 ms on a t2.micro.
A super nasty bug. Gut feelings:
- It's possible this is slower than normal for some other reason, which would explain why it hasn't been caught until now.
- Regardless of whether it's reset multiple times per connection, this adds a crazy high amount of latency to each request.
- It may be worth re-testing with recent versions of boto3 to see whether this issue still happens and, if it does, figuring out what triggers it. I don't have my hopes up on making boto3 thread-safe, but it could at least help surface whether the way the thread-local state is stored is overkill.
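For anyone re-testing this, here is a rough sketch of how connection creation could be instrumented to see which thread triggers it and how long it takes. `record_calls`, `calls`, and `build_connection` are hypothetical names for illustration; `build_connection` is just a stand-in, not the actual django-storages or boto3 code path.

```python
import threading
import time
from functools import wraps


def record_calls(log):
    """Decorator factory: append (thread name, seconds) to `log` on every call."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                log.append((threading.current_thread().name, elapsed))

        return wrapper

    return decorator


calls = []


@record_calls(calls)
def build_connection():
    """Hypothetical stand-in for whatever builds the expensive connection."""
    time.sleep(0.01)  # simulate slow initialization
    return object()
```

After handling a single request, inspecting `calls` would show how many times the connection was built and whether each build came from a different thread.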
The bottleneck I found was here: django-storages/storages/backends/s3.py, line 463 in f029e50
The real culprit is here: django-storages/storages/backends/s3.py, lines 332 to 333 in f029e50
Those threading.local() objects were kinda suspicious to me initially. I did some digging: they were added back in 2017, in commit 142e822 by @tomkins. The commit message nicely documents the reasoning, and the boto3 documentation still backs it up:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html#multithreading-multiprocessing
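To illustrate why a thread-local cache can lead to repeated connection setup, here is a minimal sketch. It assumes the pattern is "one cached connection per thread"; `FakeConnection`, `Storage`, and `count_creations` are hypothetical stand-ins, not the django-storages or boto3 code.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for an expensive-to-build client (not boto3)."""

    created = 0
    _lock = threading.Lock()

    def __init__(self):
        # Count every construction so we can see how often the cache misses.
        with FakeConnection._lock:
            FakeConnection.created += 1


class Storage:
    """Caches one connection per thread using threading.local()."""

    def __init__(self):
        self._connections = threading.local()

    @property
    def connection(self):
        conn = getattr(self._connections, "connection", None)
        if conn is None:
            # First access from this thread: build and cache a new connection.
            conn = FakeConnection()
            self._connections.connection = conn
        return conn


def count_creations(num_threads, calls_per_thread):
    """Return how many connections are built for the given workload."""
    FakeConnection.created = 0
    storage = Storage()

    def worker():
        for _ in range(calls_per_thread):
            storage.connection

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return FakeConnection.created
```

In this sketch, `count_creations(1, 10)` builds a single connection, while `count_creations(10, 1)` builds ten. If work in a request ends up on fresh threads, the per-thread cache never gets a hit. Whether that is what actually happens under Django 5.2 and django-ninja is something I have not confirmed.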