Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial query limit overriding #3090

Merged
merged 89 commits into from
Jan 21, 2021
Merged
Show file tree
Hide file tree
Changes from 84 commits
Commits
Show all changes
89 commits
Select commit Hold shift + click to select a range
9e3b280
Initial query limit overriding
rallen090 Jan 14, 2021
d5ef8b4
Initial query limit overriding 2
rallen090 Jan 14, 2021
5ecdb41
Initial query limit overriding 3
rallen090 Jan 14, 2021
ab8384b
Initial query limit overriding 4
rallen090 Jan 14, 2021
06ad766
PR feedback
rallen090 Jan 14, 2021
6390dcf
Build fix
rallen090 Jan 14, 2021
c9b0ad0
Fix error wording
rallen090 Jan 14, 2021
0bba45e
Build fix 2
rallen090 Jan 14, 2021
b90cc5c
Fix limit test
rallen090 Jan 14, 2021
7916470
Fix limit test 2
rallen090 Jan 14, 2021
5c6ae76
Add Override test
rallen090 Jan 14, 2021
3285a62
Add Override test 2
rallen090 Jan 15, 2021
49b6513
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 15, 2021
6df725b
Back query limits by etcd
rallen090 Jan 16, 2021
e7e23aa
Back query limits by etcd 2
rallen090 Jan 16, 2021
6c4bca4
Back query limits by etcd 3
rallen090 Jan 16, 2021
8f281b9
Build fix
rallen090 Jan 18, 2021
7b0d797
Build fix 2
rallen090 Jan 18, 2021
520f2de
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 18, 2021
2ad8c09
Build fix 3
rallen090 Jan 18, 2021
bc50860
Build fix 4
rallen090 Jan 18, 2021
f2476c5
Lint
rallen090 Jan 18, 2021
9f4d9b1
Removing endpoint to instead use etcd directly
rallen090 Jan 18, 2021
33a7bf4
Removing endpoint to instead use etcd directly 2
rallen090 Jan 18, 2021
32f148f
Add back rpc mock
rallen090 Jan 18, 2021
66a0a42
Fix test
rallen090 Jan 18, 2021
a39700f
Fix test 2
rallen090 Jan 18, 2021
0e56187
Add back m3em_mock
rallen090 Jan 18, 2021
2dbef2c
Fix gen
rallen090 Jan 18, 2021
fb29267
Gen
rallen090 Jan 18, 2021
90bef8a
Lint
rallen090 Jan 18, 2021
d3747ec
Fix tests
rallen090 Jan 18, 2021
03f987f
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 18, 2021
053739b
Rebased
rallen090 Jan 18, 2021
2272d5e
Rebased 2
rallen090 Jan 18, 2021
dddecfb
Test fix
rallen090 Jan 18, 2021
0bc4306
Lint
rallen090 Jan 18, 2021
8e249e6
Test fix 2
rallen090 Jan 18, 2021
c8507c0
Fix logs
rallen090 Jan 18, 2021
027aaae
Add limit logs and metrics
rallen090 Jan 19, 2021
40948e0
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 19, 2021
8b9b602
Test reset
rallen090 Jan 19, 2021
98efb93
Add etcd kv update endpoint 1
rallen090 Jan 19, 2021
6be4a30
Add etcd kv update endpoint 2
rallen090 Jan 19, 2021
db34398
Add etcd kv update endpoint 3
rallen090 Jan 19, 2021
02fcd56
Add etcd kv update endpoint 4
rallen090 Jan 19, 2021
8351c89
Add etcd kv update endpoint 5
rallen090 Jan 19, 2021
731ec5f
Remove reset
rallen090 Jan 19, 2021
ffcea5e
Add back mock
rallen090 Jan 19, 2021
6c42ae2
Fixing tests
rallen090 Jan 19, 2021
a86d55f
More build fixes
rallen090 Jan 19, 2021
b86fdcf
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 19, 2021
7dc8689
Integration test fix
rallen090 Jan 19, 2021
3ec7673
Add back mock
rallen090 Jan 20, 2021
6a8f9ed
PR feedback 1
rallen090 Jan 20, 2021
c05b6c9
Comment
rallen090 Jan 20, 2021
79cd2e4
Test fix
rallen090 Jan 20, 2021
0a31c53
Cleanup
rallen090 Jan 20, 2021
38cabbd
Lint
rallen090 Jan 20, 2021
5f81a75
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 20, 2021
a9621f7
Lint 2
rallen090 Jan 20, 2021
f488d68
Lint 3
rallen090 Jan 20, 2021
c5e0683
Cleanup 2
rallen090 Jan 20, 2021
316a7d3
Dep order
rallen090 Jan 20, 2021
e3480a8
Fix stop race
rallen090 Jan 20, 2021
6e15b82
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 20, 2021
450acce
Fix integration test
rallen090 Jan 20, 2021
5208b2c
PR feedback
rallen090 Jan 20, 2021
f880627
PR feedback 2
rallen090 Jan 20, 2021
4cdea4c
Add kvstore test
rallen090 Jan 20, 2021
9fc90b8
More tests
rallen090 Jan 20, 2021
9b422ad
More tests 2
rallen090 Jan 20, 2021
a3c0857
More tests 3
rallen090 Jan 20, 2021
34a5e3e
Lint
rallen090 Jan 20, 2021
bdd3ad4
Fallback to config-based limits if unset in dynamic
rallen090 Jan 20, 2021
aa89830
Fixes from feedback
rallen090 Jan 20, 2021
314e43f
Add docs around dynamic limits
rallen090 Jan 20, 2021
24b7e87
More docs
rallen090 Jan 20, 2021
8596169
More doc updates
rallen090 Jan 21, 2021
bbc474e
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 21, 2021
0967ae6
Reorganize limit setting code
rallen090 Jan 21, 2021
6f540d6
Reorganize limit setting code 2
rallen090 Jan 21, 2021
21732ef
Add more comprehensive locking
rallen090 Jan 21, 2021
4a4d5cd
Update wording
rallen090 Jan 21, 2021
5e00b23
Update docs more
rallen090 Jan 21, 2021
a943da0
Update docs more 2
rallen090 Jan 21, 2021
b659237
PR feedback
rallen090 Jan 21, 2021
7fc778a
Merge remote-tracking branch 'origin/master' into ra/dynamic-limits
rallen090 Jan 21, 2021
e07440d
PR feedback 2
rallen090 Jan 21, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,7 @@ require (
golang.org/x/sys v0.0.0-20201009025420-dfb3f7c4e634
golang.org/x/tools v0.0.0-20201013201025-64a9e34f3752
google.golang.org/grpc v1.29.1
google.golang.org/protobuf v1.23.0
gopkg.in/go-ini/ini.v1 v1.57.0 // indirect
gopkg.in/go-playground/assert.v1 v1.2.1 // indirect
gopkg.in/go-playground/validator.v9 v9.7.0
Expand Down
55 changes: 55 additions & 0 deletions site/content/operational_guide/resource_limits.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,16 @@ per second safely with your deployment and you want to use the default lookback
of `15s` then you would multiply 10,000 by 15 to get 150,000 as a max value with
a 15s lookback.

The third limit `maxRecentlyQueriedSeriesDiskRead` caps the bytes associated with
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make each of the limits a subsection so it's clear what each section pertains to?

series IDs matched by incoming queries. This originally was distinct from the limit
`maxRecentlyQueriedSeriesBlocks`, which also limits the memory cost of specific series
matched in-memory, because of an inefficiency in how allocations would occur even for series
known to not be present on disk for a given shard. This inefficiency has been resolved
https://github.com/m3db/m3/pull/3103 and therefore this limit should be tracking memory cost
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this a link like "resolved with this PR and..."

linearly relative to `maxRecentlyQueriedSeriesBlocks`. It is recommended to defer to using
`maxRecentlyQueriedSeriesBlocks` over `maxRecentlyQueriedSeriesDiskRead` given both should
be capping the resources in the same manner now.

### Annotated configuration

```yaml
Expand Down Expand Up @@ -82,6 +92,18 @@ limits:
# and read until the lookback period resets.
lookback: 15s

# If set, will enforce a maximum cap on the bytes read from disk that make up time series objects themselves (not their data).
# This limit can be used to ensure queries that match an extremely high volume of series can be limited before even
# reading the underlying series data from disk.
maxRecentlyQueriedSeriesDiskRead:
# Value sets the maximum disk bytes read to make up the time series objects.
rallen090 marked this conversation as resolved.
Show resolved Hide resolved
value: 0
# Lookback sets the time window that this limit is enforced over, every
# lookback period the global count is reset to zero and when the limit
# is reached it will reject any further time series blocks being matched
# and read until the lookback period resets.
lookback: 15s

# If set then will limit the number of parallel write batch requests to the
# database and return errors if hit.
maxOutstandingWriteRequests: 0
Expand All @@ -94,6 +116,39 @@ limits:
maxOutstandingReadRequests: 0
```

### Dynamic configuration

Query limits can be dynamically driven by etcd to adjust limits without redeploying. By updating the `m3db.query.limits` key in etcd, specific limits can be overriden. M3Coordinator exposes an API for updating etcd key/value pairs and so this API can be used for modifying these dynamic overrides. For example,

```
curl -vvvsSf -X POST 0.0.0.0:7201/api/v1/kvstore -d '{
"m3db.query.limits",
"value":{
"maxRecentlyQueriedSeriesDiskBytesRead": {
rallen090 marked this conversation as resolved.
Show resolved Hide resolved
"limit":0,
"lookbackSeconds":15,
"forceExceeded":false
},
"maxRecentlyQueriedSeriesBlocks": {
"limit":0,
"lookbackSeconds":15,
"forceExceeded":false
},
"maxRecentlyQueriedSeriesDiskRead": {
"limit":0,
"lookbackSeconds":15,
"forceExceeded":false
}
},
"commit":true
}'
```

Usage notes:
- Setting the `commit` flag to false allows for dry-run API calls to see the old and new limits that would be applied.
- Omitting a limit from the `value` results in that limit to be driven by the config-based settings.
- The `forceExceeded` flag makes the limit behave as though it is permanently exceeded, thus failing all queries. This is useful for dynamically shutting down all queries in cases where load may be exceeding provisioned resources.

## M3 Query and M3 Coordinator

### Deployment
Expand Down
Loading