Add hybridfs store type #36668
Conversation
With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to whether it is the default choice.
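The idea can be illustrated with a minimal sketch built on Lucene's `FileSwitchDirectory`, which routes reads to one of two delegate directories based on the file extension. This is only an illustration of the approach, not the actual Elasticsearch implementation; the extension set and paths below are assumptions:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FileSwitchDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class HybridDirectorySketch {

    public static Directory open(Path indexPath) throws IOException {
        // Extensions that benefit from random access go through mmap; everything
        // else is read via buffered NIO. The set below is illustrative only.
        Set<String> mmapExtensions = Set.of("nvd", "dvd", "tim");
        return new FileSwitchDirectory(
                mmapExtensions,
                new MMapDirectory(indexPath),  // primary: memory-mapped reads
                new NIOFSDirectory(indexPath), // secondary: buffered NIO reads
                true);                         // close both delegates on close()
    }

    public static void main(String[] args) throws IOException {
        try (Directory dir = open(Paths.get("/tmp/hybridfs-demo"))) {
            System.out.println(dir);
        }
    }
}
```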
Pinging @elastic/es-distributed
```
memory mapped. All other files are opened using Lucene `NIOFSDirectory`.
Similarly to `mmapfs` be sure you have allowed plenty of
<<vm-max-map-count,virtual address space>>.

[[allow-mmapfs]]
```
IMHO we should rename this setting in order to decouple it from the store type, e.g. `node.store.allow_mmap`. The only complication is when we backport this to 6.x because changing a property name would be a breaking change (and that's the reason I did not rename it yet).
@jpountz and I discussed this and we'll open a separate PR to deprecate `node.store.allow_mmapfs` and rename it to `node.store.allow_mmap`.
I have opened #37070.
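For reference, a rough sketch of how a rename with a deprecation fallback could look using the Elasticsearch `Setting` API; the constants and exact mechanism below are assumptions, and the actual change lives in #37070:

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

public final class StoreSettingsSketch {
    // Old key kept only so existing configurations keep working; marked deprecated.
    public static final Setting<Boolean> NODE_STORE_ALLOW_MMAPFS =
            Setting.boolSetting("node.store.allow_mmapfs", true,
                    Property.NodeScope, Property.Deprecated);

    // New key falls back to the old one when it is not set explicitly.
    public static final Setting<Boolean> NODE_STORE_ALLOW_MMAP =
            Setting.boolSetting("node.store.allow_mmap", NODE_STORE_ALLOW_MMAPFS,
                    Property.NodeScope);

    private StoreSettingsSketch() {}
}
```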
I'm marking this PR as WIP for the time being, putting it up for discussion and also to get early CI exposure.
```java
/*
 * We are mmapping norms, docvalues as well as term dictionaries, all other files are served through NIOFS
 * this provides good random access performance while not creating unnecessary mmaps for files like stored
 * fields etc.
```
let's mention FS cache usage rather than the number of mmaps as a reason for this hybrid store type?
I just restored the original comment but that makes sense to me. I'll update the comment accordingly.
```diff
@@ -75,7 +79,7 @@ private void doTestStoreDirectory(Index index, Path tempDir, String typeSettingV
                 break;
             case FS:
                 if (Constants.JRE_IS_64BIT && MMapDirectory.UNMAP_SUPPORTED) {
-                    assertTrue(directory.toString(), directory instanceof MMapDirectory);
+                    assertTrue(directory.toString(), directory instanceof FileSwitchDirectory);
```
in the two above cases, maybe also check that `directory.getPrimaryDir()` is a `MMapDirectory`?
I added an additional assertion.
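A sketch of what that extra assertion could look like inside the test case above (surrounding test wiring assumed, so this is a fragment rather than a complete test):

```java
// Also verify that the primary directory of the FileSwitchDirectory is mmap-backed.
assertTrue(directory.toString(), directory instanceof FileSwitchDirectory);
final FileSwitchDirectory switchDirectory = (FileSwitchDirectory) directory;
assertTrue(switchDirectory.getPrimaryDir().toString(),
        switchDirectory.getPrimaryDir() instanceof MMapDirectory);
```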
@elasticmachine retest this please
Thanks for the review! I'll open a separate PR for the backport to 6.x in which
I've raised a separate PR for the backport to 6.x in #37067.
With this commit we rename `node.store.allow_mmapfs` to `node.store.allow_mmap`. Previously this setting controlled whether `mmapfs` could be used as a store type. With the introduction of `hybridfs`, which also relies on memory-mapping, `node.store.allow_mmapfs` also applies to `hybridfs`, and thus we rename it in order to convey that it allows memory-mapping in general rather than a specific store type. Relates elastic#36668
With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to whether it is the default choice. Relates #36668 Relates #37067
With this commit we rename `node.store.allow_mmapfs` to `node.store.allow_mmap`. Previously this setting controlled whether `mmapfs` could be used as a store type. With the introduction of `hybridfs`, which also relies on memory-mapping, `node.store.allow_mmapfs` also applies to `hybridfs`, and thus we rename it in order to convey that it allows memory-mapping in general rather than a specific store type. Relates #36668 Relates #37070
@danielmitterdorfer I think this implementation is broken with respect to managing pending deletes. I do have a fix for it. Just a heads-up for the backport. It will also fix #37111
@s1monw sorry to hear and thank you for the heads-up. |
We don't want two FSDirectories to manage pending deletes separately and optimize file listing. This confuses IndexWriter and causes exceptions when files are deleted twice but are pending for deletion. This change moves to using a NIOFS subclass that only delegates to MMAP for opening files; all metadata and pending deletes are managed on top. Closes elastic#37111 Relates to elastic#36668
We don't want two FSDirectories to manage pending deletes separately and optimize file listing. This confuses IndexWriter and causes exceptions when files are deleted twice but are pending for deletion. This change moves to using a NIOFS subclass that only delegates to MMAP for opening files; all metadata and pending deletes are managed on top. Closes #37111 Relates to #36668
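In rough terms, the fix replaces the two-directory `FileSwitchDirectory` approach with a single `NIOFSDirectory` subclass that only delegates `openInput` to an `MMapDirectory`, so file listing and pending deletes are tracked in exactly one place. A minimal sketch of that shape (class name and extension set are illustrative, not the actual Elasticsearch code):

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Set;

import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

// All directory metadata (listAll, deleteFile, pending deletes) is handled by the
// NIOFS parent; the mmap delegate is used solely for opening inputs.
public class DelegatingHybridDirectorySketch extends NIOFSDirectory {

    private static final Set<String> MMAP_EXTENSIONS = Set.of("nvd", "dvd", "tim"); // illustrative

    private final MMapDirectory mmapDelegate;

    public DelegatingHybridDirectorySketch(Path path) throws IOException {
        super(path);
        this.mmapDelegate = new MMapDirectory(path);
    }

    @Override
    public IndexInput openInput(String name, IOContext context) throws IOException {
        ensureOpen();
        String extension = name.substring(name.lastIndexOf('.') + 1);
        if (MMAP_EXTENSIONS.contains(extension)) {
            return mmapDelegate.openInput(name, context); // memory-mapped read path
        }
        return super.openInput(name, context);            // buffered NIO read path
    }

    @Override
    public synchronized void close() throws IOException {
        // Closing the delegate does not delete files; deletes are only tracked here.
        mmapDelegate.close();
        super.close();
    }
}
```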
With this commit we switch the default store type from `hybridfs` to `mmapfs`. While `hybridfs` is beneficial for random access workloads (think: updates and queries) when the index size is much larger than the available page cache, it incurs a performance penalty on smaller indices that fit into the page cache (or are not much larger than that). This performance penalty shows not only for bulk updates or queries but also for bulk indexing (without *any* conflicts) when an external document id is provided by the client. For example, in the `geonames` benchmark this results in a throughput reduction of roughly 17% compared to `mmapfs`. This reduction is caused by document id lookups that show up as the top contributor in the profile when enabling `hybridfs`. Below is such an example stack trace as captured by async-profiler during a benchmarking trial where we can see that the overhead is caused by additional `read` system calls for document id lookups:

```
__GI_pread64
sun.nio.ch.FileDispatcherImpl.pread0
sun.nio.ch.FileDispatcherImpl.pread
sun.nio.ch.IOUtil.readIntoNativeBuffer
sun.nio.ch.IOUtil.read
sun.nio.ch.FileChannelImpl.readInternal
sun.nio.ch.FileChannelImpl.read
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal
org.apache.lucene.store.BufferedIndexInput.refill
org.apache.lucene.store.BufferedIndexInput.readByte
org.apache.lucene.store.DataInput.readVInt
org.apache.lucene.store.BufferedIndexInput.readVInt
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.getDocID
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.lookupVersion
org.elasticsearch.common.lucene.uid.VersionsAndSeqNoResolver.loadDocIdAndVersion
org.elasticsearch.index.engine.InternalEngine.resolveDocVersion
org.elasticsearch.index.engine.InternalEngine.planIndexingAsPrimary
org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation
org.elasticsearch.index.engine.InternalEngine.index
org.elasticsearch.index.shard.IndexShard.index
org.elasticsearch.index.shard.IndexShard.applyIndexOperation
org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary
[...]
```

For these reasons we are restoring `mmapfs` as the default store type. Relates elastic#36668
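Since `hybridfs` remains available as an explicit choice, indices that do benefit from it (large indices with random-access-heavy workloads) can still opt in via the `index.store.type` setting. A minimal sketch using the `Settings` builder (index-creation plumbing omitted; how these settings are applied is assumed):

```java
import org.elasticsearch.common.settings.Settings;

public class HybridfsOptInSketch {
    public static void main(String[] args) {
        // Per-index opt-in to hybridfs; pass these settings when creating the index.
        Settings indexSettings = Settings.builder()
                .put("index.store.type", "hybridfs")
                .build();
        System.out.println(indexSettings);
    }
}
```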
With this commit we add the `.cfs` file extension to the list of file types that are memory-mapped by hybridfs. `.cfs` files combine all files of a Lucene segment into a single file in order to save file handles. As this strategy is only used for "small" segments (less than 10% of the shard size), it is beneficial to memory-map them instead of accessing them via NIO. Relates elastic#36668
With this commit we add the `.cfs` file extension to the list of file types that are memory-mapped by hybridfs. `.cfs` files combine all files of a Lucene segment into a single file in order to save file handles. As this strategy is only used for "small" segments (less than 10% of the shard size), it is beneficial to memory-map them instead of accessing them via NIO. Relates #36668
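The change itself boils down to adding one more extension to the set of memory-mapped file types. Against the illustrative extension set used in the sketches above (not the exact production list), it would look like this:

```java
import java.util.Set;

public class MmapExtensionsSketch {
    // Illustrative only: ".cfs" (compound file) added so that small segments packed
    // into a single compound file are memory-mapped as well.
    static final Set<String> MMAP_EXTENSIONS = Set.of("nvd", "dvd", "tim", "cfs");
}
```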
Backport of elastic/elasticsearch#36668. This commit introduces a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`.
With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to whether it is the default choice.