boto3 read timeout prevents long running #205
Comments
from botocore.client import Config
This is an excellent bug! Thanks for bringing this to our attention @tlpriest - this is the kind of thing that we can only catch through real world usage. I've been doing my migrations locally to the remote DB (tsk tsk), so I didn't catch this one yet.
This seems like the correct solution to me, unless anybody can think of a scenario where having a lower value would be useful?
If you've only been doing migrations locally rather than on Lambda, you'll also need to send over the .py files because of a long-standing Django issue where migrate does not use .pyc files.
Related to this, I just learned that AWS API GW has a 30 second timeout for Lambda integrations and this cannot be changed. However, if API GW disconnects from a Lambda function that runs longer than 30 seconds, it does not interrupt the Lambda function, which continues unmolested to completion in the background.
I've been running long-running work via SQS queues: a keep-warm event sends a JSON object with params to run a queue worker method every so many minutes, and the parameters for each job are queued up in SQS for processing. This gets around the API GW 30 second timeout by not using it for those types of requests. If you need an endpoint to trigger the work, you just queue it and return 200 to appease API GW.
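A rough sketch of that queue-and-acknowledge pattern, in case it helps. The function names, the `process` callback and the `max_batches` cap are made up for illustration; only the SQS calls (`send_message`, `receive_message`, `delete_message`) are real boto3 client methods, and the client is passed in so nothing here assumes how your project builds it.

```python
import json


def enqueue_and_ack(sqs_client, queue_url, params):
    """Queue the work and answer immediately so API Gateway never waits on it."""
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(params))
    return {"statusCode": 200, "body": json.dumps({"queued": True})}


def drain_queue(sqs_client, queue_url, process, max_batches=10):
    """Run from the periodic keep-warm event: handle whatever is queued."""
    handled = 0
    for _ in range(max_batches):
        response = sqs_client.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=0
        )
        messages = response.get("Messages", [])
        if not messages:
            break
        for message in messages:
            process(json.loads(message["Body"]))
            # Delete only after processing returns, so a failed job stays queued.
            sqs_client.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
            handled += 1
    return handled
```

In practice you would pass `boto3.client('sqs')` and your queue URL. The endpoint handler calls `enqueue_and_ack`, and the scheduled worker calls `drain_queue` with whatever function does the slow work.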
Added the proposed solution into latest code with a reference to this issue.
@collingreen where exactly should we run this code?
From Slack, about django-zappa, but the suggested fixes seem to belong upstream in Zappa proper.
tlpriest [12:54 AM]
Has anyone run a significant django project on api gateway + lambda? I'm seeing some odd behavior just trying to complete a migration. 1 - It takes a very long time (4 minutes) with a t2.medium RDS server. Using docker postgres container localhost it takes about 1m20s. 2 - It appears to me that something is automatically resubmitting the request if it's not completed in a certain timeout (I'm trying to divine the timeout value from the logs now, but I'm seeing the original migrate and 3 restarts if the log streams are correct).
I believe I have found the source of the resubmission issue. Python's boto3 has a default read timeout of 60 seconds. I set the execution time on the Lambda function to 5 minutes so the migration could complete. So when you invoke "./manage.py invoke ${env} migrate" and it does not complete in 60 seconds, it looks like the boto3 client resubmits the request to Lambda, which starts a second (and then a third, etc.) migration in the middle of the first.
tlpriest [10:35 AM]
The fix for "manage.py invoke ${env} migrate" or any other long running process is:
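import boto3
from botocore.client import Config
config_dict = {'region_name': 'us-west-2', 'connect_timeout': 5, 'read_timeout': 300}  # 300 s matches the 5-minute Lambda limit
config = Config(**config_dict)
session = boto3.Session()
client = session.client('lambda', config=config)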
[10:36]
You could get fancy and set the read_timeout to the duration configured for the target lambda function...
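One way the "fancy" version might look, as a sketch rather than what Zappa actually does. `read_timeout_for`, `make_invoke_client` and the 10 second margin are hypothetical. `get_function_configuration` and its `Timeout` field (in seconds) are the real Lambda API, and `retries={'max_attempts': 0}` is a real botocore `Config` option, but using it to stop the resubmission is my assumption and not something confirmed in this thread.

```python
def read_timeout_for(lambda_client, function_name, margin_seconds=10):
    """Return the function's configured timeout plus a small safety margin."""
    function_config = lambda_client.get_function_configuration(
        FunctionName=function_name
    )
    return function_config["Timeout"] + margin_seconds


def make_invoke_client(function_name, region_name="us-west-2"):
    # Imported here so read_timeout_for can be used or tested without boto3 installed.
    import boto3
    from botocore.client import Config

    session = boto3.Session(region_name=region_name)
    # A default-configured client is enough for the quick configuration lookup.
    lookup_client = session.client("lambda")
    config = Config(
        connect_timeout=5,
        read_timeout=read_timeout_for(lookup_client, function_name),
        # Assumption: disabling retries keeps a timed-out invoke from being resubmitted.
        retries={"max_attempts": 0},
    )
    return session.client("lambda", config=config)
```

The margin is there because the client should wait slightly longer than the function is allowed to run; otherwise a function that finishes right at its limit could still trip the read timeout.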