[Searchable Snapshot] Implement cache restore mechanism and track phantom files #5980
Three alternative solutions were considered -
The current solution in #6538 is based on Approach 1. To compare the solutions, I ran the test code for each approach against the same folder structure: 30 indices, each with 25 shards, each shard holding 200 files (roughly 150,000 files in the cache folder). The execution times turned out to be as follows -
For cache loading, Approach 1 seems to be the best given the overhead and tradeoffs of the other approaches. I will also attach the code for reference/future work.
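To make the comparison concrete, here is a minimal sketch of a sequential walk next to a fork/join walk over the same directory tree. It is not the code attached to this issue; the class and method names (`CacheWalkSketch`, `countSequential`, `countForkJoin`) are hypothetical, and it only counts files rather than loading them into a cache.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of OpenSearch.
class CacheWalkSketch {

    // Sequential walk: visit every regular file under root on the calling thread.
    static long countSequential(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fork/join walk: one task per directory; subdirectories are forked so
    // they can be processed by other worker threads.
    static final class CountTask extends RecursiveTask<Long> {
        private final Path dir;

        CountTask(Path dir) {
            this.dir = dir;
        }

        @Override
        protected Long compute() {
            long files = 0;
            List<CountTask> subtasks = new ArrayList<>();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                        CountTask task = new CountTask(entry);
                        task.fork();
                        subtasks.add(task);
                    } else if (Files.isRegularFile(entry)) {
                        files++;
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // Collect the counts from the forked subdirectory tasks.
            for (CountTask task : subtasks) {
                files += task.join();
            }
            return files;
        }
    }

    static long countForkJoin(Path root) {
        return ForkJoinPool.commonPool().invoke(new CountTask(root));
    }
}
```

Both methods should return the same count for the same tree; the parallel version adds task creation and scheduling overhead on top of the same file system calls.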
To create files -
Sequential solution - currently a part of #6538
Fork Join Pool Action -
ThreadPool based solution -
Also adding in the actual execution code -
Referring to #4964
Right now, the FileCache capacity is computed arbitrarily as a hard-coded 50% of the total size of the first data path. Cached files are currently stored within the corresponding index directories. When a node restarts, it does not put the existing cached files back into the cache but ignores them (phantom files). The next step is to decide which cache scope we should use and where to store the cached data, and to construct the cache accordingly. Change Node bootstrap to construct the named cache based on the List; it should also include the logic that walks through the cached files to either restore them to the cache or delete them.
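A rough sketch of the restore step described above, under stated assumptions: the cache is modelled as a plain `Map` from path to size (a stand-in, not the real FileCache API), files that fit within the remaining capacity are restored, and files that do not fit are deleted. The names `CacheRestoreSketch`, `capacityFor`, and `restore` are hypothetical, and the restore-or-delete policy is only one possible choice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the OpenSearch FileCache implementation.
class CacheRestoreSketch {

    // Capacity as described in the issue: a fixed 50% of the total size
    // of the file store backing the first data path.
    static long capacityFor(Path firstDataPath) {
        try {
            return Files.getFileStore(firstDataPath).getTotalSpace() / 2;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Walk the cache directory once. Files that fit in the remaining capacity
    // are "restored" into the map; files that do not fit are deleted from disk
    // (assumed policy) so that no phantom files are left behind.
    static Map<Path, Long> restore(Path cacheRoot, long capacityBytes) {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(cacheRoot)) {
            // Sorted only to make the outcome deterministic in this sketch.
            files = paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        Map<Path, Long> cache = new LinkedHashMap<>();
        long used = 0;
        for (Path file : files) {
            try {
                long size = Files.size(file);
                if (used + size <= capacityBytes) {
                    cache.put(file, size);
                    used += size;
                } else {
                    Files.delete(file);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return cache;
    }
}
```

A caller at node startup would compute the capacity once, for example `restore(cacheRoot, capacityFor(firstDataPath))`, before serving any reads from the cache.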