-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making the number of CPUs used for WAL replay configurable #4445
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style suggestions mostly, I think what's here makes sense. When Prometheus got updated it looks like they moved the tsdb.GenerateTestHistogram
function to tsdbutil
. which is causing the unit test and linting issues. Not sure if we have a mainline PR open to update Prometheus yet but I don't think we should be doing that here.
The referenced pull request #1040 seems unrelated to this change? |
d8bedb3
to
1be6b7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some notes:
- List the deprecated config option under
docs/sources/mimir/operators-guide/configure/about-versioning.md
- In the vendored Prometheus they've added
lookback_delta
parameter to the instant and range query API. To be 100% Prometheus compatible we should support it too. This is not expected to be done in this PR, but please open an issue in the Mimir repo about it and kick off the discussion whether we should add support for it (I think so) - I checked other vendored changes and LGTM
e70fa6a
to
789fbf0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, very good. Thanks for addressing my previous comments! I left few last nits and then I will be more than happy to merge it 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, modulo a last nit :)
What this PR does
The version of
prometheus
is upgraded (viamimir-prometheus
) to the one corresponding to this commit, which introducedWALReplayConcurrency
as an option on tsdbOptions
andHeadOptions
. This option represents the maximal number of CPUs used during WAL replay on TSDB opening. If not set to a positive number, all available CPUs are used.WALReplayConcurrency
has been added as a tsdb option onmimir
side as well, and it is configurable via the newly added advanced configuration parameter-blocks-storage.tsdb.wal-replay-concurrency
, whose default value is 0. The latter is supposed to override the-blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup
, which is now deprecated, and will be removed fromMimir 2.10
. In order to guarantee backwards compatibility, the following logic is used:-blocks-storage.tsdb.wal-replay-concurrency
is set to a positive number, the deprecated-blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup
is ignored.-blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup
is used and the behavior is identical to the one before this PR.Setting
-blocks-storage.tsdb.wal-replay-concurrency
to a positive value imposes the maximal number of concurrent processes to be used on ingesters startup. The heuristic used is the following:10
tenants, opening of TSDBs is done by-blocks-storage.tsdb.wal-replay-concurrency
concurrent processes, each of them using only 1 process for WAL replay (i.e.,prometheus
optionWALReplayConcurrency
is set to1
).-blocks-storage.tsdb.wal-replay-concurrency
concurrent processes for WAL replay ((i.e.,prometheus
optionWALReplayConcurrency
is set to-blocks-storage.tsdb.wal-replay-concurrency
).Which issue(s) this PR fixes or relates to
Part of https://github.com/grafana/mimir-squad/issues/1040
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]