Add hybridfs store type #36668
Conversation
With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to whether it is the default choice.
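The idea can be illustrated with a minimal sketch built on Lucene's `FileSwitchDirectory`, which routes reads to one of two delegate directories based on the file extension. This is only an illustration of the approach, not the actual Elasticsearch implementation; the extension set and paths below are assumptions:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FileSwitchDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class HybridDirectorySketch {

    public static Directory open(Path indexPath) throws IOException {
        // Extensions that benefit from random access go through mmap; everything
        // else is read via buffered NIO. The set below is illustrative only.
        Set<String> mmapExtensions = Set.of("nvd", "dvd", "tim");
        return new FileSwitchDirectory(
                mmapExtensions,
                new MMapDirectory(indexPath),  // primary: memory-mapped reads
                new NIOFSDirectory(indexPath), // secondary: buffered NIO reads
                true);                         // close both delegates on close()
    }

    public static void main(String[] args) throws IOException {
        try (Directory dir = open(Paths.get("/tmp/hybridfs-demo"))) {
            System.out.println(dir);
        }
    }
}
```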
Pinging @elastic/es-distributed
```
memory mapped. All other files are opened using Lucene `NIOFSDirectory`.
Similarly to `mmapfs` be sure you have allowed plenty of
<<vm-max-map-count,virtual address space>>.

[[allow-mmapfs]]
```
IMHO we should rename this setting in order to decouple it from the store type, e.g. `node.store.allow_mmap`. The only complication is when we backport this to 6.x because changing a property name would be a breaking change (and that's the reason I did not rename it yet).
@jpountz and I discussed this and we'll open a separate PR to deprecate `node.store.allow_mmapfs` and rename it to `node.store.allow_mmap`.
I have opened #37070.
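For reference, a rough sketch of how a rename with a deprecation fallback could look using the Elasticsearch `Setting` API; the constants and exact mechanism below are assumptions, and the actual change lives in #37070:

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

public final class StoreSettingsSketch {
    // Old key kept only so existing configurations keep working; marked deprecated.
    public static final Setting<Boolean> NODE_STORE_ALLOW_MMAPFS =
            Setting.boolSetting("node.store.allow_mmapfs", true,
                    Property.NodeScope, Property.Deprecated);

    // New key falls back to the old one when it is not set explicitly.
    public static final Setting<Boolean> NODE_STORE_ALLOW_MMAP =
            Setting.boolSetting("node.store.allow_mmap", NODE_STORE_ALLOW_MMAPFS,
                    Property.NodeScope);

    private StoreSettingsSketch() {}
}
```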
I'm marking this PR as WIP for the time being, putting it up for discussion and also to get early CI exposure.
```java
/*
 * We are mmapping norms, docvalues as well as term dictionaries, all other files are served through NIOFS
 * this provides good random access performance while not creating unnecessary mmaps for files like stored
 * fields etc.
```
let's mention FS cache usage rather than the number of mmaps as a reason for this hybrid store type?
I just restored the original comment but that makes sense to me. I'll update the comment accordingly.
```diff
@@ -75,7 +79,7 @@ private void doTestStoreDirectory(Index index, Path tempDir, String typeSettingV
                 break;
             case FS:
                 if (Constants.JRE_IS_64BIT && MMapDirectory.UNMAP_SUPPORTED) {
-                    assertTrue(directory.toString(), directory instanceof MMapDirectory);
+                    assertTrue(directory.toString(), directory instanceof FileSwitchDirectory);
```
in the two above cases, maybe also check that `directory.getPrimaryDir()` is a `MMapDirectory`?
I added an additional assertion.
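A sketch of what that extra assertion could look like inside the test case above (surrounding test wiring assumed, so this is a fragment rather than a complete test):

```java
// Also verify that the primary directory of the FileSwitchDirectory is mmap-backed.
assertTrue(directory.toString(), directory instanceof FileSwitchDirectory);
final FileSwitchDirectory switchDirectory = (FileSwitchDirectory) directory;
assertTrue(switchDirectory.getPrimaryDir().toString(),
        switchDirectory.getPrimaryDir() instanceof MMapDirectory);
```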
@elasticmachine retest this please
Thanks for the review! I'll open a separate PR for the backport to 6.x in which
I've raised a separate PR for the backport to 6.x in #37067.
With this commit we rename `node.store.allow_mmapfs` to `node.store.allow_mmap`. Previously this setting controlled whether `mmapfs` could be used as a store type. With the introduction of `hybridfs`, which also relies on memory-mapping, `node.store.allow_mmapfs` also applies to `hybridfs`, and thus we rename it in order to convey that it allows memory-mapping in general rather than a specific store type. Relates elastic#36668
With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to whether it is the default choice. Relates #36668 Relates #37067
With this commit we rename `node.store.allow_mmapfs` to `node.store.allow_mmap`. Previously this setting controlled whether `mmapfs` could be used as a store type. With the introduction of `hybridfs`, which also relies on memory-mapping, `node.store.allow_mmapfs` also applies to `hybridfs`, and thus we rename it in order to convey that it allows memory-mapping in general rather than a specific store type. Relates #36668 Relates #37070
@danielmitterdorfer I think this implementation is broken with respect to managing pending deletes. I do have a fix for it. Just a heads-up for the backport. It will also fix #37111
@s1monw sorry to hear and thank you for the heads-up. |
We don't want two FSDirectories to manage pending deletes separately and optimize file listing. This confuses IndexWriter and causes exceptions when files are deleted twice but are pending for deletion. This change moves to using a NIOFS subclass that only delegates to MMAP for opening files; all metadata and pending deletes are managed on top. Closes elastic#37111 Relates to elastic#36668
We don't want two FSDirectories to manage pending deletes separately and optimize file listing. This confuses IndexWriter and causes exceptions when files are deleted twice but are pending for deletion. This change moves to using a NIOFS subclass that only delegates to MMAP for opening files; all metadata and pending deletes are managed on top. Closes #37111 Relates to #36668
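In rough terms, the fix replaces the two-directory `FileSwitchDirectory` approach with a single `NIOFSDirectory` subclass that only delegates `openInput` to an `MMapDirectory`, so file listing and pending deletes are tracked in exactly one place. A minimal sketch of that shape (class name and extension set are illustrative, not the actual Elasticsearch code):

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Set;

import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

// All directory metadata (listAll, deleteFile, pending deletes) is handled by the
// NIOFS parent; the mmap delegate is used solely for opening inputs.
public class DelegatingHybridDirectorySketch extends NIOFSDirectory {

    private static final Set<String> MMAP_EXTENSIONS = Set.of("nvd", "dvd", "tim"); // illustrative

    private final MMapDirectory mmapDelegate;

    public DelegatingHybridDirectorySketch(Path path) throws IOException {
        super(path);
        this.mmapDelegate = new MMapDirectory(path);
    }

    @Override
    public IndexInput openInput(String name, IOContext context) throws IOException {
        ensureOpen();
        String extension = name.substring(name.lastIndexOf('.') + 1);
        if (MMAP_EXTENSIONS.contains(extension)) {
            return mmapDelegate.openInput(name, context); // memory-mapped read path
        }
        return super.openInput(name, context);            // buffered NIO read path
    }

    @Override
    public synchronized void close() throws IOException {
        // Closing the delegate does not delete files; deletes are only tracked here.
        mmapDelegate.close();
        super.close();
    }
}
```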
With this commit we switch the default store type from `hybridfs` to `mmapfs`. While `hybridfs` is beneficial for random access workloads (think: updates and queries) when the index size is much larger than the available page cache, it incurs a performance penalty on smaller indices that fit into the page cache (or are not much larger than that). This performance penalty shows not only for bulk updates or queries but also for bulk indexing (without *any* conflicts) when an external document id is provided by the client. For example, in the `geonames` benchmark this results in a throughput reduction of roughly 17% compared to `mmapfs`. This reduction is caused by document id lookups that show up as the top contributor in the profile when enabling `hybridfs`. Below is such an example stack trace as captured by async-profiler during a benchmarking trial where we can see that the overhead is caused by additional `read` system calls for document id lookups:

```
__GI_pread64
sun.nio.ch.FileDispatcherImpl.pread0
sun.nio.ch.FileDispatcherImpl.pread
sun.nio.ch.IOUtil.readIntoNativeBuffer
sun.nio.ch.IOUtil.read
sun.nio.ch.FileChannelImpl.readInternal
sun.nio.ch.FileChannelImpl.read
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal
org.apache.lucene.store.BufferedIndexInput.refill
org.apache.lucene.store.BufferedIndexInput.readByte
org.apache.lucene.store.DataInput.readVInt
org.apache.lucene.store.BufferedIndexInput.readVInt
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.getDocID
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.lookupVersion
org.elasticsearch.common.lucene.uid.VersionsAndSeqNoResolver.loadDocIdAndVersion
org.elasticsearch.index.engine.InternalEngine.resolveDocVersion
org.elasticsearch.index.engine.InternalEngine.planIndexingAsPrimary
org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation
org.elasticsearch.index.engine.InternalEngine.index
org.elasticsearch.index.shard.IndexShard.index
org.elasticsearch.index.shard.IndexShard.applyIndexOperation
org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary
[...]
```

For these reasons we are restoring `mmapfs` as the default store type. Relates elastic#36668
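Since `hybridfs` remains available as an explicit choice, indices that do benefit from it (large indices with random-access-heavy workloads) can still opt in via the `index.store.type` setting. A minimal sketch using the `Settings` builder (index-creation plumbing omitted; how these settings are applied is assumed):

```java
import org.elasticsearch.common.settings.Settings;

public class HybridfsOptInSketch {
    public static void main(String[] args) {
        // Per-index opt-in to hybridfs; pass these settings when creating the index.
        Settings indexSettings = Settings.builder()
                .put("index.store.type", "hybridfs")
                .build();
        System.out.println(indexSettings);
    }
}
```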
With this commit we add the `.cfs` file extension to the list of file types that are memory-mapped by hybridfs. `.cfs` files combine all files of a Lucene segment into a single file in order to save file handles. As this strategy is only used for "small" segments (less than 10% of the shard size), it is beneficial to memory-map them instead of accessing them via NIO. Relates elastic#36668
With this commit we add the `.cfs` file extension to the list of file types that are memory-mapped by hybridfs. `.cfs` files combine all files of a Lucene segment into a single file in order to save file handles. As this strategy is only used for "small" segments (less than 10% of the shard size), it is beneficial to memory-map them instead of accessing them via NIO. Relates #36668
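The change itself boils down to adding one more extension to the set of memory-mapped file types. Against the illustrative extension set used in the sketches above (not the exact production list), it would look like this:

```java
import java.util.Set;

public class MmapExtensionsSketch {
    // Illustrative only: ".cfs" (compound file) added so that small segments packed
    // into a single compound file are memory-mapped as well.
    static final Set<String> MMAP_EXTENSIONS = Set.of("nvd", "dvd", "tim", "cfs");
}
```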
Backport of elastic/elasticsearch#36668. This commit introduces a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`.
With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to whether it is the default choice.