-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory cache off-by-one error #396
Conversation
…o caused it to blow up if cache size == 0
@bnbalsamo thanks for the PR. Could you add a test for your fix? |
…nd actually remedied the issues with the RAM cache size == 0 case, as I thought I had in the previous commit. Added tests for these cases.
…here none of the info is cached in RAM. Still does two disk reads in all cases where RAM cache size > 0
loris/img_info.py
Outdated
@@ -372,8 +372,7 @@ def get(self, request): | |||
info_and_lastmod = (info, lastmod) | |||
logger.debug('Info for %s read from file system', request) | |||
# into mem: | |||
self._dict[request.url] = info_and_lastmod | |||
|
|||
self.__setitem__(request, info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setitem writes the content to the file system as well as memory, doesn't it? Don't we only want this information to go into memory in this get method?
Maybe we should split setitem up into two methods, one for writing to disk and one for putting in memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I see your reasoning here. Writing on every get()
would be a lot of writing to disk.
In this case I would be more inclined to tack a kwarg into __setitem__
for this purpose. The __setitem__
and __getitem__
dict-style interface is what is meant to be presented to "the outside world" from this class, correct?
If that is the case (and the cache is meant to be swappable in the wider code base by purpose-built implementations which expose the same interface), I would think that this case would be the only one where bootstrapping something into RAM that already exists on the filesystem would be required - all other cases would require being written both to disk and into RAM, both within the class and in calling code.
Sorry for multiple commits here - let me know if you'd prefer I squash them and re-open a PR. My initial assumption was incorrect, there wasn't an off by one error in the loop condition, but the case where RAM cache size == 0 was broken. To fix this the logic for RAM cache addition is now add to cache --> trim cache There was also a particularly pesky bypassing of the RAM cache size logic in the getter - that now calls The final commit (which occurred to me in the middle of writing this comment) minimizes disk reads in the event there is no RAM cache. There are also now tests for limited RAM cache sizes and no RAM cache. One potential byproduct of this change is that the previously a call to I'm not familiar enough with the entirety of the code base to guess whether or not someone relied on repeated calls to |
@bnbalsamo thanks. Merging. |
Current handling of the in memory cache in InfoCache has an off-by-one in the while loop condition. In cases where size > 0 the cache would (harmlessly, I believe) just be one smaller than stated in the size parameter. In cases where size == 0 we were getting
StopIteration
exceptions in production, though in my own testing I could only getKeyError: 'dictionary is empty'
This PR changes the while loop condition to strictly
>
, rather than>=