-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
overload: scale selected timers in response to load #13475
Changes from 28 commits
90f775a
0dd57aa
151f1ca
c5c2be9
26a4b2d
f95bd82
effc203
324198d
b8e5e1e
dc82f73
b3f1c99
794e9e1
5808eed
e29329b
20c8af4
08f1331
226c183
4b00ebe
72916ca
99fb591
f7736e7
8fe5101
bbd65fd
6bbcd2a
43f0673
94821c0
28500e1
7825f39
2e2d5e0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,8 @@ syntax = "proto3"; | |
|
||
package envoy.config.overload.v3; | ||
|
||
import "envoy/type/v3/percent.proto"; | ||
|
||
import "google/protobuf/any.proto"; | ||
import "google/protobuf/duration.proto"; | ||
import "google/protobuf/struct.proto"; | ||
|
@@ -80,6 +82,38 @@ message Trigger { | |
} | ||
} | ||
|
||
// Typed configuration for the "envoy.overload_actions.reduce_timeouts" action. See | ||
// :ref:`the docs <config_overload_manager_reducing_timeouts>` for an example of how to configure | ||
// the action with different timeouts and minimum values. | ||
message ScaleTimersOverloadActionConfig { | ||
enum TimerType { | ||
// Unsupported value; users must explicitly specify the timer they want scaled. | ||
UNSPECIFIED = 0; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the reason this is specified to force the user to explicitly configure the timeout they want? I think that's OK but can you add a small comment? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, done. I would have liked to just start at 1 but the compiler (or maybe lint checker?) complains. |
||
|
||
// Adjusts the idle timer for downstream HTTP connections that takes effect when there are no active streams. | ||
// This affects the value of :ref:`RouteAction.idle_timeout <envoy_v3_api_field_config.route.v3.RouteAction.idle_timeout>`. | ||
HTTP_DOWNSTREAM_CONNECTION_IDLE = 1; | ||
} | ||
|
||
message ScaleTimer { | ||
// The type of timer this minimum applies to. | ||
TimerType timer = 1 [(validate.rules).enum = {defined_only: true not_in: 0}]; | ||
|
||
oneof overload_adjust { | ||
option (validate.required) = true; | ||
|
||
// Sets the minimum duration as an absolute value. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From reading the docs, it's not immediately clear to me how min_timeout interacts with the scaling. Can you flesh this out a bit more to better describe how everything fits together in terms of scaling? An example would probably help. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Elaborated in the docs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks the example is very helpful. Can you potentially ref link to the docs somehow from here just so that people can go back and forth if they are confused what this does? /wait There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, done. |
||
google.protobuf.Duration min_timeout = 2; | ||
|
||
// Sets the minimum duration as a percentage of the maximum value. | ||
type.v3.Percent min_scale = 3; | ||
Comment on lines
+108
to
+109
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems like this might be a way to shoot yourself in the foot as it's not as intuitive and easy to compare as absolute time -- unless you end up doing the calculation yourself in which case maybe just use the min_timeout to set the absolute duration. If this gets set too low, I could see the aggressive timeouts causing cascading failures i.e. user requests times out so they end up refreshing the page. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, though a lot of the overload manger actions let you shoot yourself in the foot. Having a scale factor allows this to apply meaningfully to timers that can have different values across listeners. |
||
} | ||
} | ||
|
||
// A set of timer scaling rules to be applied. | ||
repeated ScaleTimer timer_scale_factors = 1 [(validate.rules).repeated = {min_items: 1}]; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please add release notes describing this new feature. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
|
||
message OverloadAction { | ||
option (udpa.annotations.versioning).previous_message_type = | ||
"envoy.config.overload.v2alpha.OverloadAction"; | ||
|
@@ -93,6 +127,9 @@ message OverloadAction { | |
// state of all triggers, which can be scaling between 0 and 1 or saturated. Listeners | ||
// are notified when the overload action changes state. | ||
repeated Trigger triggers = 2 [(validate.rules).repeated = {min_items: 1}]; | ||
|
||
// Configuration for the action being instantiated. | ||
google.protobuf.Any typed_config = 3; | ||
} | ||
|
||
message OverloadManager { | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ | |
#include <string> | ||
|
||
#include "envoy/common/pure.h" | ||
#include "envoy/event/timer.h" | ||
#include "envoy/thread_local/thread_local.h" | ||
|
||
#include "common/common/macros.h" | ||
|
@@ -39,13 +40,25 @@ class OverloadActionState { | |
*/ | ||
using OverloadActionCb = std::function<void(OverloadActionState)>; | ||
|
||
enum class OverloadTimerType { | ||
// Timers created with this type will never be scaled. This should only be used for testing. | ||
UnscaledRealTimerForTest, | ||
// The amount of time an HTTP connection to a downstream client can remain idle (no streams). This | ||
// corresponds to the HTTP_DOWNSTREAM_CONNECTION_IDLE TimerType in overload.proto. | ||
HttpDownstreamIdleConnectionTimeout, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider adding a reference in this comment to the real parameter associated with this enum value. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
}; | ||
|
||
/** | ||
* Thread-local copy of the state of each configured overload action. | ||
*/ | ||
class ThreadLocalOverloadState : public ThreadLocal::ThreadLocalObject { | ||
public: | ||
// Get a thread-local reference to the value for the given action key. | ||
virtual const OverloadActionState& getState(const std::string& action) PURE; | ||
|
||
// Get a scaled timer whose minimum corresponds to the configured value for the given timer type. | ||
virtual Event::TimerPtr createScaledTimer(OverloadTimerType timer_type, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. comment There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
Event::TimerCb callback) PURE; | ||
}; | ||
|
||
/** | ||
|
@@ -68,6 +81,9 @@ class OverloadActionNameValues { | |
|
||
// Overload action to try to shrink the heap by releasing free memory. | ||
const std::string ShrinkHeap = "envoy.overload_actions.shrink_heap"; | ||
|
||
// Overload action to reduce some subset of configured timeouts. | ||
const std::string ReduceTimeouts = "envoy.overload_actions.reduce_timeouts"; | ||
}; | ||
|
||
using OverloadActionNames = ConstSingleton<OverloadActionNameValues>; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no mentions of this config corresponding to the "envoy.overload_actions.reduce_timeouts" action.
Should envoy.overload_actions.reduce_timeouts be mentioned in this config message comment?
Should there be some doc updates that mention how to configure this action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.