Commit
With this commit we switch the default store type from `hybridfs` to `mmapfs`. While `hybridfs` is beneficial for random access workloads (think: updates and queries) when the index size is much larger than the available page cache, it incurs a performance penalty on smaller indices that fit into the page cache (or are not much larger than that).

This performance penalty shows not only for bulk updates or queries but also for bulk indexing (without *any* conflicts) when an external document id is provided by the client. For example, in the `geonames` benchmark this results in a throughput reduction of roughly 17% compared to `mmapfs`. This reduction is caused by document id lookups that show up as the top contributor in the profile when enabling `hybridfs`.

Below is such an example stack trace as captured by async-profiler during a benchmarking trial where we can see that the overhead is caused by additional `read` system calls for document id lookups:

```
__GI_pread64
sun.nio.ch.FileDispatcherImpl.pread0
sun.nio.ch.FileDispatcherImpl.pread
sun.nio.ch.IOUtil.readIntoNativeBuffer
sun.nio.ch.IOUtil.read
sun.nio.ch.FileChannelImpl.readInternal
sun.nio.ch.FileChannelImpl.read
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal
org.apache.lucene.store.BufferedIndexInput.refill
org.apache.lucene.store.BufferedIndexInput.readByte
org.apache.lucene.store.DataInput.readVInt
org.apache.lucene.store.BufferedIndexInput.readVInt
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.getDocID
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.lookupVersion
org.elasticsearch.common.lucene.uid.VersionsAndSeqNoResolver.loadDocIdAndVersion
org.elasticsearch.index.engine.InternalEngine.resolveDocVersion
org.elasticsearch.index.engine.InternalEngine.planIndexingAsPrimary
org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation
org.elasticsearch.index.engine.InternalEngine.index
org.elasticsearch.index.shard.IndexShard.index
org.elasticsearch.index.shard.IndexShard.applyIndexOperation
org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary
[...]
```

For these reasons we are restoring `mmapfs` as the default store type.

Relates elastic#36668
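To illustrate the two access paths contrasted above, here is a minimal sketch using only the JDK. It is not Lucene or Elasticsearch code: the class `StoreReadSketch` and its methods are hypothetical names chosen for illustration. The first method reads through `FileChannel.read(ByteBuffer, long)`, the call that appears in the stack trace above a `pread` system call. The second maps the file once and then reads from memory, which does not require a system call per access once the pages are resident.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Lucene or Elasticsearch.
public class StoreReadSketch {

    // Positional channel read: the JDK call seen in the stack trace above,
    // which ends in a pread-style system call for every invocation.
    // Unlike Lucene's buffered input, this sketch does no buffering.
    static byte readViaChannel(Path file, long pos) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(1);
            int n = ch.read(buf, pos);
            if (n != 1) {
                throw new IOException("short read at position " + pos);
            }
            return buf.get(0);
        }
    }

    // Memory-mapped read: the file is mapped once, after which each access
    // is a plain memory load served from the page cache.
    // The int cast limits this sketch to files smaller than 2 GiB.
    static byte readViaMmap(Path file, long pos) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return map.get((int) pos);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("store-sketch", ".bin");
        try {
            Files.write(tmp, new byte[] {10, 20, 30, 40});
            // Both paths return the same byte; they differ only in how it is fetched.
            System.out.println(readViaChannel(tmp, 2));
            System.out.println(readViaMmap(tmp, 2));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The sketch only shows the mechanism. It opens the file on every call for simplicity and says nothing about the magnitude of the overhead, which the benchmark figure above addresses.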