fix ExpiringDict caching solution #1200
Conversation
See my question below.
LGTM
Thanks, @terrazoon!
Let's see if this at least helps alleviate some of the pain and buys us more time and space to figure out our next steps with all of this.
Description
For various reasons we are currently forced to use an in-memory cache to store the job information that is used to populate the reports. Although we increased the cache time to 7 days and the maximum number of objects to 20,000, there are still intermittent problems retrieving the data for reports.
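For reference, a minimal sketch of the cache configuration described above, using the expiringdict package (the name `job_cache` is illustrative, not necessarily the variable used in the code):

```python
from expiringdict import ExpiringDict

# In-memory cache of job information used to populate reports.
# Values per the description above: up to 20,000 entries, 7-day TTL.
job_cache = ExpiringDict(max_len=20_000, max_age_seconds=7 * 24 * 60 * 60)
```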
We found two separate issues:
Sometimes we cannot download a CSV file (404) even though we know the CSV was successfully uploaded and the messages in it were successfully sent. This should not happen, and the reason it does is currently a mystery (it seems to happen on Fridays!). We are going to ignore this issue for the moment.
Every time the app restarts, the in-memory cache is wiped out, and the app restarts more frequently, with worse consequences for report generation, than we anticipated. In addition, no in-memory solution can be shared across processes, so each gunicorn worker needs its own cache.
So the solution proposed here is to run a 'repair' task periodically that checks the cache to make sure it contains all jobs currently in the bucket.
NOTE: The 'best practice' here would be to use Redis or Memcached, but unless and until we decide to go that route, an ExpiringDict that we rebuild on a schedule seems like a workable solution. It potentially requires more memory (although how much memory one instance of the cache uses is unknown, and it is probably not much).
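A minimal sketch of what such a periodic repair task could look like. This is an illustration under assumptions, not the actual implementation: `job_cache` mirrors the configuration sketched above, and `list_jobs_in_bucket` / `load_job_from_bucket` are hypothetical placeholders for however the app actually reads the report bucket. The real task would be scheduled with whatever mechanism the app already uses for periodic work.

```python
from expiringdict import ExpiringDict

# Module-level cache, mirroring the configuration sketched earlier.
job_cache = ExpiringDict(max_len=20_000, max_age_seconds=7 * 24 * 60 * 60)


def list_jobs_in_bucket():
    """Placeholder for whatever the app uses to list job ids in the
    report bucket (e.g. an S3 list call); not part of this PR."""
    return []


def load_job_from_bucket(job_id):
    """Placeholder for fetching one job's report data from the bucket."""
    return {"job_id": job_id}


def repair_job_cache():
    """Periodic 'repair' task: make sure every job currently in the
    bucket is also in the in-memory cache, re-adding anything lost to
    an app restart or eviction. Since each gunicorn worker has its own
    cache, each worker would need to run this against its own copy."""
    for job_id in list_jobs_in_bucket():
        if job_id not in job_cache:
            job_cache[job_id] = load_job_from_bucket(job_id)
```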
Security Considerations
N/A