Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add alternate implementation of index-header reader that does not use mmap #3667

Closed
wants to merge 107 commits into from

Conversation

56quarters
Copy link
Contributor

What this PR does

This PR adds an alternate implementation of the store gateway's index-header reader that does not use mmap. The motivations for this change and references are covered in #3465.

cc: @56quarters, @stevesg

Which issue(s) this PR fixes or relates to

#3465

Notes to reviewers

After review is finished, DO NOT MERGE as-is. @charleskorn and @56quarters will handle fixing up the squashed message.

The most important parts of this PR to examine:

  • When the experimental streaming index-header reader is disabled, everything should work exactly as it does today.
  • There should be nothing fundamentally wrong with this architecture that will prevent us from switching over to it entirely in the future.

We expect the performance of this implementation to not be equivalent to the existing mmap implementation at first. The goal of this PR is not to merge a perfect replacement that we'll immediately be able to switch to in production. Instead, we'd like to get things into main as they are so that we can iterate on performance, making small changes, based on further testing.

Changes

There's a lot of commits here and it doesn't make sense to review them individually. However, there's also a lot of code here. The organization of our changes is described below. Some knowledge of the current architecture of the store-gateway is assumed.

(note that all paths are relative to pkg/storegateway)

Encoding components

  • indexheader/encoding/reader.go/FileReader: this is the lowest level part of this change. This is a buffering wrapper around file operations that provides some convenience methods for seeking, reading, peeking.
  • indexheader/encoding/encoding.go/Decbuf: this is a copy / modification of the equivalent Prometheus file. It includes methods for reading integers, strings, and bytes from binary files using a FileReader instance. This is the important part: it uses standard go file operations that the goruntime knows how to schedule if they block. Compare this to mmap which causes invisible blocking operations that hang the entire store-gateway.
  • indexheader/encoding/factory.go/DecbufFactory: this code is responsible for creating new Decbuf instances in a variety of ways. It is also responsible for cleaning up after Decbuf instances are no longer needed. This is because it pools the underlying bufio.Readers used.

TSDB index components

  • indexheader/index/symbols.go/Symbols: this is a copy/modification of the equivalent Prometheus file. It includes a modified Symbol reader that works with our DecbufFactory (which in turn uses standard go file operations). It also includes a bulk API for reverse lookups since this is required by the StreamBinaryReader
  • indexheader/index/positings.go/PostingOffsetTable: an interface abstracting the differences between how index v1 postings are looked up vs index v2 postings. The existing BinaryReader used a bunch of if statements to pick how to work on postings. We feel this is cleaner. This file contains the v1 and v2 implementations of this interface using largely the same logic as the existing BinaryReader.

New index-header Reader

  • indexheader/stream_binary_reader.go/StreamBinaryReader: the alternate implementation of the Reader interface that uses all the previously described file-based logic for reading index-headers from disk. The interesting part of this class is the logic run when instances are first created: the index-header table-of-contents is loaded, symbols are loaded and validated, postings are loaded and validated.
  • indexheader/reader_pool.go/ReaderPool: there are some limited changes here to allow the StreamBinaryReader to be used when index-header Readers are lazy loaded or eagerly loaded.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@56quarters 56quarters changed the title 56quarters/indexheader mmap removal redux Add alternate implementation of index-header reader that does not use mmap Dec 7, 2022
stevesg and others added 29 commits December 7, 2022 09:01
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
fadvise is not available on other platforms (including Darwin).

fadvise is only used in benchmarks, so it's safe to ignore it on other
platforms (on the assumption that we'll only use results from Linux
hosts for the affected benchmarks).
This uncovered a small issue in Uvarint64(): it was not able to
decode values that encode to more than 8 bytes.
These didn't uncover any issues, so I'm not sure if there's value in
keeping them or not.
This is required by newBinaryTOCFromFile().
This is required by newBinaryTOCFromFile().
This matches the behaviour of the mmap Decbuf implementation.
The Decbuf instances are still backed by in-memory byte buffers
(using BufReader) rather than FileReaders.
56quarters and others added 27 commits December 7, 2022 09:03
Based on the benchmark, this improves the latency of LookupSymbol
of anywhere between 10% and 70%.
@56quarters 56quarters force-pushed the 56quarters/indexheader-mmap-removal-redux branch from 1a25bbd to dc3ecae Compare December 7, 2022 14:05
@56quarters 56quarters closed this Dec 7, 2022
@56quarters 56quarters deleted the 56quarters/indexheader-mmap-removal-redux branch December 7, 2022 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants