-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge upstream prometheus/prometheus at 37b408c
#639
Conversation
Signed-off-by: Mickael Carl <mcarl@apple.com>
Signed-off-by: Mickael Carl <mcarl@apple.com>
…d for transient file writes. use it in loadDataAsQueryable to make sure the RO Head doesn't truncate or cut new chunks in data/chunks_head/. add a -sandbox-dir-root flag to "promtool tsdb dump/dump-openmetrics" to control the root of that sandbox dirrectory. Signed-off-by: machine424 <ayoubmrini424@gmail.com>
…e discovery doc Signed-off-by: Jiekun <zhujiekun@52tt.com>
Allow users to opt-out of the multi-cluster setup for Prometheus dashboard, in environments where it isn't applicable. Refer: prometheus/prometheus#13180. Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
This adds some more test cases for unicode values, and also a benchmark for zeroOrOneCharacterStringMatcher.Matches() Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
This replaces the custom `moreThanOneRune` function with the standard `utf8.DecodeRuneInString(s)` that can be used to figure out the size of the first rune. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
…to promql testing framework Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
On Windows, Go will sleep 15ms if you ask for less. TestAsyncRuleEvaluation compares actual delay to the nominal time, so using 15ms should work better on Windows, and be hardly noticeable elsewhere. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
Enable perfsprint linter and fix up code issues
bugfix: allow opting-out of multi-cluster setups
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
…gMatcher-Matches Improve `zeroOrOneCharacterStringMatcher` by using `utf8.DecodeRuneInString`
tsdb/index: Refactor Reader tests Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Björn Rabenstein <github@rabenste.in>
[TEST] Rules: Sleep 15ms to fit Windows behaviour better
docs: [ovh sd] Added missing label for OVH dedicated server in SD
…cordingly. Signed-off-by: kushagra Shukla <kushalshukla110@gmail.com> Signed-off-by: kushagra Shukla <kushalshukla110@gmail.com>
defaultStatsRenderer->DefaultStatsRenderer Add short docstrings. I'd like to use the stats renderer to peek at the statistics in grafana/mimir#7966 However I cannot call the original function without this export afterwards. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Converts existing native histogram unit tests to the PromQL testing framework
Follow up on prometheus/prometheus#14096 As promised, I bring a benchmark, which shows a very small improvement if context is checked every 128 iterations of label instead of every 100. It's much easier for a computer to check modulo 128 than modulo 100. This is a very small 0-2% improvement but I'd say this is one of the hottest paths of the app so this is still relevant. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Make DBReadOnly more RO
This came out of a major incident, so it is quite important, but also it has already been merged here as #633. |
…alse positives. Signed-off-by: gotjosh <josue.abreu@gmail.com>
model/rulefmt/rulefmt.go
Outdated
AlignEvaluationTimeOnInterval bool `yaml:"align_evaluation_time_on_interval,omitempty"` | ||
} | ||
|
||
func (g *RuleGroup) UnmarshalYAML(value *yaml.Node) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking for opinions here - is this enough to produce a "migration"? Users only write YAML directly to object storage before the ruler picks it up when encoding/decoding, it'll always show you the equivalent value of evaluation_delay
as rule_offset
.
Perhaps, not any sort of modification and just supporting both values (whilst having preference for rule_offset
is an approach we want?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the way you did it, but I'd fail if both fields are found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as for being backwards compatible I think it's enough.
Maybe just always set them to the same value so that the code can use either (I haven't reviewed the rest of this PR); this also unblocks rollbacks from newer ruler to older ruler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just always set them to the same value so that the code can use either
I agree on this. Looks safer, even if once we'll upgrade mimir-prometheus in mimir we'll remove the usage of EvaluationDelay
(but better safe than sorry).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made the change to preserve both in a8dec7c
I like the way you did it, but I'd fail if both fields are found.
I think failing on both sounds sensible but seems more of a Mimir validation than a Prometheus one. We do have a function to validate rule groups for Mimir at an API level and I think this validation belongs there: https://github.com/grafana/mimir/blob/main/pkg/mimirtool/rules/rules.go#L255
Maybe just always set them to the same value so that the code can use either (I haven't reviewed the rest of this PR); this also unblocks rollbacks from newer ruler to older ruler.
I'm not sure you meant set them always to the same value if neither of them exist which is the alternative I had in mind but this is what I did:
- If either
evaluation_delay
orquery_offset
is set, set the other to the same value. - If both of them have values (technically not possible with the validation in Mimir) output them as is
- In mimir, we'll only use
query_offset
Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there would be two typical use cases:
- Always load rules from code. In which case the user will set the delay and we correctly unmarshal into the offset now.
- GET the rule from API, update and POST. The offset may or may not be returned on the POST, but both should work as expected I think
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If both of them have values (technically not possible with the validation in Mimir) output them as is
this is the bit I had in mind - if they conflict, then just choose one and set both to it. It's more an argument in principle, than solving for a specific problem right now. A group with different offset and delay is an invalid group. Not feeling strongly about it and I don't need to look at it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps there can be some confusion if
- the tenant gets a rule group which has both evalution_delay and query_offset
- They set only evaluation_delay in the json/yaml we returned.
- post back the same json without removing query_offset
if the client only pays attention to evaluation_delay, they won't ever understand why the delay is not being changed in Mimir. If we set them both to the same value, then the client would at least notice that the changes aren't reflected in Mimir.
Somewhat contrived, but I imagine that's how some scripting languages clients might work. Don't want to block this further, sorry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Mimir, we have to make it work with older clients not supporting query_offset
. For this reason, we can't stop exporting evaluation_delay
, but we also want to export query_offset
. I want a client to be able to successfully upload and exported config. For this reason, I'm not convinced the following is an option:
I like the way you did it, but I'd fail if both fields are found.
I think failing on both sounds sensible but seems more of a Mimir validation than a Prometheus one
I would keep it simple: export both fields, just fail the validation if they're both set but to different values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be some mistakes in index_test.go, please see comments.
- Don't change t to innerT - Bring back `t.Cleanup` with `reader.Close()` - Duplicate the createFileReader functions Signed-off-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good from my side.
Maybe you could get input from @bboreham on prometheus/prometheus#14079?
What is your question? |
@bboreham from the PR description:
|
Yes, put it in the changelog. |
model/rulefmt/rulefmt.go
Outdated
aux.QueryOffset = g.EvaluationDelay | ||
} | ||
|
||
aux.EvaluationDelay = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this break clients that expect eval delay to be there? I'm thinking about grafana specifically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
old versions of mimirtool might also expect evaluation_delay to be set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Dimitar here. We need to keep exporting evaluation_delay
otherwise old clients break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good shout, I've made the change to preserve both in a8dec7c
model/rulefmt/rulefmt.go
Outdated
AlignEvaluationTimeOnInterval bool `yaml:"align_evaluation_time_on_interval,omitempty"` | ||
} | ||
|
||
func (g *RuleGroup) UnmarshalYAML(value *yaml.Node) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just always set them to the same value so that the code can use either
I agree on this. Looks safer, even if once we'll upgrade mimir-prometheus in mimir we'll remove the usage of EvaluationDelay
(but better safe than sorry).
model/rulefmt/rulefmt.go
Outdated
aux.QueryOffset = g.EvaluationDelay | ||
} | ||
|
||
aux.EvaluationDelay = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Dimitar here. We need to keep exporting evaluation_delay
otherwise old clients break.
func (f *mmapedFile) Close() error { | ||
err := f.m.Unmap() | ||
if err != nil { | ||
err = fmt.Errorf("mmapedFile: unmapping: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note : there was a fix to this later: prometheus/prometheus#13948
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually fix is further down to an existing code path, but line numbers didn't line up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should vendor this fix too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get this PR merged first
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
…nce when reading the setting Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Thank you all for the help 🙏 |
Note to reviewers:
Please pay close attention to this diff, in particular to the following files:
The change that affected the
tsdb/index/index_test.go
andpromql/engine_test.go
is prometheus/prometheus#14071 -- I'll need an in-depth review of those two files and that PR as I modified to the best of my ability with what I understood.Relevant
zeroOrOneCharacterStringMatcher
by usingutf8.DecodeRuneInString
prometheus/prometheus#14090 (optimisation, not worth a changelog)Not Relevant