Cache always updates clusters even if not needed anymore #38
Comments
I would like to get the opinion of @StanislawSwierc on that issue, as he contributed the cache functionality.
Expected behavior describes how the cache was supposed to work. That's precisely why I used Cache.get. I've just investigated the behavior of the code and @shiosai is right! A call to Cache.get still updates the LRU order. I'll update tests to highlight this problem and prepare a fix.
from cachetools import LRUCache, Cache
print("Access with 'id_to_cluster[1]' changes the order.")
id_to_cluster = LRUCache(maxsize=2)
id_to_cluster[1] = 1
id_to_cluster[2] = 2
print(id_to_cluster._LRUCache__order)
id_to_cluster[1]
print(id_to_cluster._LRUCache__order)
print()
print("Access with 'Cache.get(id_to_cluster, 1)' changes the order.")
id_to_cluster = LRUCache(maxsize=2)
id_to_cluster[1] = 1
id_to_cluster[2] = 2
print(id_to_cluster._LRUCache__order)
Cache.get(id_to_cluster, 1)
print(id_to_cluster._LRUCache__order)
print()
print("Access with 'Cache.__getitem__(id_to_cluster, 1)' does NOT change the order.")
id_to_cluster = LRUCache(maxsize=2)
id_to_cluster[1] = 1
id_to_cluster[2] = 2
print(id_to_cluster._LRUCache__order)
Cache.__getitem__(id_to_cluster, 1)
print(id_to_cluster._LRUCache__order)
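One plausible explanation for the output above, sketched with plain stand-in classes rather than cachetools itself: if a base-class get is written in terms of self[key], the lookup dispatches to the subclass's __getitem__, so the recency update still runs. BaseCache, RecencyCache and order_after are hypothetical names, and the assumption that Cache.get in cachetools is written this way should be checked against the installed version.

```python
from collections import OrderedDict


class BaseCache:
    """Hypothetical stand-in for a plain cache base class (not cachetools)."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __contains__(self, key):
        return key in self._data

    def get(self, key, default=None):
        # Written in terms of self[key], so a subclass's __getitem__ runs.
        return self[key] if key in self else default


class RecencyCache(BaseCache):
    """Hypothetical subclass that records access order on every lookup."""

    def __init__(self):
        super().__init__()
        self.order = OrderedDict()

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._touch(key)

    def __getitem__(self, key):
        value = super().__getitem__(key)
        self._touch(key)
        return value

    def _touch(self, key):
        # Move the key to the "most recently used" end.
        self.order[key] = None
        self.order.move_to_end(key)


def order_after(access):
    """Fill a cache with keys 1 and 2, apply `access`, return the key order."""
    cache = RecencyCache()
    cache[1] = "a"
    cache[2] = "b"
    access(cache)
    return list(cache.order)


print(order_after(lambda c: c[1]))                         # subclass lookup
print(order_after(lambda c: BaseCache.get(c, 1)))          # still dispatches to the subclass
print(order_after(lambda c: BaseCache.__getitem__(c, 1)))  # bypasses the subclass
```

Only the explicit call to the base-class __getitem__ skips the subclass hook, which matches the third case in the snippet above.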
@StanislawSwierc
@StanislawSwierc Kindly reminding you that the fix you proposed would be appreciated, as I see that the memory efficiency feature is widely used.
Thanks @davidohana for the reminder! I saw this issue shortly before going on vacation and it dropped off my radar. I've just prepared a fix. It turned out to be slightly more complicated than replacing a single function call, but still manageable.
@StanislawSwierc Thanks so much. Preparing a new version soon.
During fast_match, Drain always iterates over all possible clusters and updates their access time in the cache. This leads to two problems:
Expected behavior:
Clusters should only be updated/touched in the cache after they were actually used/chosen. There is actually a comment for this in the source code already:
Try to retrieve cluster from cache with bypassing eviction algorithm as we are only testing candidates for a match.
https://github.com/IBM/Drain3/blob/15470e391caed9a9ef5038cdd1dbd373bd2386a8/drain3/drain.py#L217
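The expected behavior can be sketched as "peek at candidates, touch only the winner". This is a minimal stdlib-only illustration, not the Drain3 code or the eventual fix: PeekableLRU, peek, touch and pick_cluster are hypothetical names, and the first-match rule stands in for whatever similarity scoring fast_match actually applies.

```python
from collections import OrderedDict


class PeekableLRU:
    """Hypothetical LRU cache with a lookup that does not affect eviction order."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()  # oldest entry first

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry
        self._data[key] = value

    def peek(self, key):
        # Read without changing recency; returns None for a missing key.
        return self._data.get(key)

    def touch(self, key):
        # Mark the key as most recently used.
        self._data.move_to_end(key)

    def keys(self):
        return list(self._data)


def pick_cluster(cache, candidate_ids, is_match):
    """Test candidates via peek; update recency only for the chosen one."""
    for cluster_id in candidate_ids:
        cluster = cache.peek(cluster_id)
        if cluster is not None and is_match(cluster):
            cache.touch(cluster_id)
            return cluster_id
    return None


cache = PeekableLRU(maxsize=3)
for cluster_id, template in [(1, "a"), (2, "b"), (3, "c")]:
    cache.put(cluster_id, template)

chosen = pick_cluster(cache, [1, 2, 3], lambda template: template == "b")
print(chosen, cache.keys())
```

Candidates that were only tested keep their place in the eviction order, so a cluster that never matches can still age out of the cache.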