-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: codifying runtime guarded features #6134
Changes from 7 commits
5f368a3
0147b76
7f3fc25
3ac22ff
33bbf5e
ad5d657
5b838e8
26bdbd6
1cc4738
b5cf76c
e063208
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -137,6 +137,56 @@ maximize the chances of your PR being merged. | |
[envoy-filter-example](https://github.com/envoyproxy/envoy-filter-example) (for example making a new | ||
branch so that CI can pass) it is your responsibility to follow through with merging those | ||
changes back to master once the CI dance is done. | ||
* If your PR is a high risk change, the reviewer may ask that you runtime guard | ||
it. See the section on runtime guarding below. | ||
|
||
|
||
# Runtime guarding | ||
|
||
Some high risk changes in Envoy are deemed worthy of runtime guarding. Instead of just replacing | ||
old code with new code, both code paths are supported for between one Envoy release (if it is | ||
guarded due to performance concerns) and a full deprecation cycle (if it is a high risk behavioral | ||
change). | ||
|
||
The canonical way to runtime guard a feature is | ||
``` | ||
if (Runtime::runtimeFeatureEnabled("envoy.reloadable_features.my_feature_name")) { | ||
[new code path] | ||
} else { | ||
[old_code_path] | ||
} | ||
``` | ||
Runtime guarded features named with the "envoy.reloadable_features." prefix must be safe to flip | ||
true or false on running Envoy instances. In some situations, for example the buffer rewrite in | ||
[#5441](https://github.com/envoyproxy/envoy/pull/5441), it may make more sense to | ||
latch the value in a member variable on class creation, for example: | ||
|
||
``` | ||
bool use_new_code_path_ = | ||
Runtime::runtimeFeatureEnabled("envoy.reloadable_features.my_feature_name") | ||
``` | ||
|
||
Runtime guarded features may either set true (running the new code by default) in the initial PR, | ||
after a testing interval, or during the next release cycle, at the PR author's and reviewing | ||
maintainer's discretion. Generally all runtime guarded features will be set true when a | ||
release is cut, and the old code path will be deprecated at that time. Runtime features | ||
are set true by default by inclusion in | ||
[source/common/runtime/runtime_features.h](https://github.com/envoyproxy/envoy/blob/master/source/common/runtime/runtime_features.h) | ||
|
||
There are four suggested options for testing new runtime features: | ||
|
||
1. Create a per-test Runtime::LoaderSingleton as done in [DeprecatedFieldsTest.IndividualFieldDisallowedWithRuntimeOverride](https://github.com/envoyproxy/envoy/blob/master/test/common/protobuf/utility_test.cc) | ||
2. Create a [parameterized test](https://github.com/google/googletest/blob/master/googletest/docs/advanced.md#how-to-write-value-parameterized-tests) | ||
where the set up of the test sets the new runtime value explicitly to | ||
GetParam() as outlined in (1). | ||
3. Set up integration tests with custom runtime defaults as documented in the | ||
[integration test README](https://github.com/envoyproxy/envoy/blob/master/test/integration/README.md) | ||
4. Run a given unit test with the new runtime value explicitly set true as done | ||
for [runtime_flag_override_test](https://github.com/envoyproxy/envoy/blob/master/test/common/runtime/BUILD) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it also worth specifically calling out parameterized tests as a useful tool here? I find that particularly useful when the logical behavior is supposed to be the same regardless of flag value (ie, goal is pure refactoring or performance improvement). |
||
|
||
Runtime code is held to the same standard as regular Envoy code, so both the old | ||
path and the new should have 100% coverage both with the feature defaulting true | ||
and false. | ||
|
||
# PR review policy for maintainers | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -83,6 +83,13 @@ class Snapshot { | |
// configuration of "false" in runtime config. | ||
virtual bool deprecatedFeatureEnabled(const std::string& key) const PURE; | ||
|
||
// Returns true if a runtime feature is enabled. | ||
// | ||
// Runtime features are used to easily allow switching between old and new code paths for high | ||
// risk changes. The intent is for the old code path to be short lived - the old code path is | ||
// deprecated as the feature is defaulted true, and removed with the following Envoy release. | ||
virtual bool runtimeFeatureEnabled(const std::string& key) const PURE; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As written I think someone could If they do the inefficient If I switch to string_view I think the map lookup will convert the string piece to the string on every single call, and there's no way to avoid it, so I think that's strictly less performant? I could update the instructions to encourage static string on the canonical non-latch path if we think that makes more sense There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As long as it is an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. TIL! Ok, got that to work for the flat hash, but unfortunately to string_view all the way up the API we need to also change Runtime::EntryMap in include/envoy/runtime/runtime.h to have a comparator. I'd prefer to land this as-is and do a separate PR converting to string piece, especially as none of the call sites will need to change. What do you think of a TODO and a follow-up? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah fine to leave as TODO. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That sounds fine to me. |
||
|
||
/** | ||
* Test if a feature is enabled using the built in random generator. This is done by generating | ||
* a random number in the range 0-99 and seeing if this number is < the value stored in the | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
#include "common/runtime/runtime_features.h" | ||
|
||
namespace Envoy { | ||
namespace Runtime { | ||
|
||
// Add additional features here to enable the new code paths by default. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry to harp on this, but I think what would help me/readers here is a short description of the behavior of what an entry in runtime_features and disallowed_features do. I know this is described elsewhere, but it's hard enough to keep track of (and important enough) that IMO we should replicate a short version here? |
||
// | ||
// These features should not be overridden with run-time guards without a bug | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you are talking to end users here, right? Can you clarify slightly and say something like, "If issues are found that that require a feature to be disabled, it should be reported upstream ASAP, ..." |
||
// being filed on github as once high risk features are true by default, the | ||
// old code path will be removed with the next release. | ||
constexpr const char* runtime_features[] = { | ||
// Enabled | ||
"envoy.reloadable_features.test_feature_true", | ||
// Disabled | ||
// "envoy.reloadable_features.test_feature_false", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we mean for this to be commented out here? If this is a documented way of defaulting to disabled, can you add more comments? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was thinking at some point it'd be nice to have all flags documented in that file, so if folks want to preemptively flip they can do so. I'll remove it and wait until we have utils and see if it still makes sense |
||
}; | ||
|
||
// TODO(alyssawilk) handle deprecation of reloadable_features. Ideally runtime | ||
// override of a deprecated feature will log(warn) on runtime-load if not deprecated | ||
// and hard-fail once it has been deprecated. | ||
constexpr const char* disallowed_features[] = { | ||
// Acts as both a test entry for deprecated.proto and a marker for the Envoy | ||
// deprecation scripts. | ||
"envoy.deprecated_features.deprecated.proto:is_deprecated_fatal", | ||
}; | ||
|
||
RuntimeFeatures::RuntimeFeatures() { | ||
for (auto& feature : disallowed_features) { | ||
disallowed_features_.insert(feature); | ||
} | ||
for (auto& feature : runtime_features) { | ||
enabled_features_.insert(feature); | ||
} | ||
} | ||
|
||
} // namespace Runtime | ||
} // namespace Envoy |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,6 +22,16 @@ | |
namespace Envoy { | ||
namespace Runtime { | ||
|
||
bool runtimeFeatureEnabled(const std::string& feature) { | ||
ASSERT(absl::StartsWith(feature, "envoy.reloadable_features")); | ||
if (Runtime::LoaderSingleton::getExisting()) { | ||
return Runtime::LoaderSingleton::getExisting()->snapshot().runtimeFeatureEnabled(feature); | ||
} | ||
ENVOY_LOG_TO_LOGGER(Envoy::Logger::Registry::getLog(Envoy::Logger::Id::runtime), warn, | ||
"Unable to use runtime singleton for feature {}", feature); | ||
return RuntimeFeaturesDefaults::get().enabledByDefault(feature); | ||
} | ||
|
||
const size_t RandomGeneratorImpl::UUID_LENGTH = 36; | ||
|
||
uint64_t RandomGeneratorImpl::random() { | ||
|
@@ -152,7 +162,7 @@ bool SnapshotImpl::deprecatedFeatureEnabled(const std::string& key) const { | |
bool stored = getBoolean(key, allowed); | ||
// If not, the default value is based on disallowedByDefault. | ||
if (!stored) { | ||
allowed = !DisallowedFeaturesDefaults::get().disallowedByDefault(key); | ||
allowed = !RuntimeFeaturesDefaults::get().disallowedByDefault(key); | ||
} | ||
|
||
if (!allowed) { | ||
|
@@ -165,6 +175,18 @@ bool SnapshotImpl::deprecatedFeatureEnabled(const std::string& key) const { | |
return true; | ||
} | ||
|
||
bool SnapshotImpl::runtimeFeatureEnabled(const std::string& key) const { | ||
bool enabled = false; | ||
// See if this value is explicitly set as a runtime boolean. | ||
bool stored = getBoolean(key, enabled); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: this can just be inlined in the if statement below? |
||
// If not, the default value is based on runtime_features. | ||
if (!stored) { | ||
enabled = RuntimeFeaturesDefaults::get().enabledByDefault(key); | ||
} | ||
|
||
return enabled; | ||
} | ||
|
||
bool SnapshotImpl::featureEnabled(const std::string& key, uint64_t default_value, | ||
uint64_t random_value, uint64_t num_buckets) const { | ||
return random_value % num_buckets < std::min(getInteger(key, default_value), num_buckets); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ licenses(["notice"]) # Apache 2 | |
load( | ||
"//bazel:envoy_build_system.bzl", | ||
"envoy_cc_test", | ||
"envoy_cc_test_library", | ||
"envoy_package", | ||
) | ||
|
||
|
@@ -15,6 +16,16 @@ filegroup( | |
srcs = glob(["test_data/**"]), | ||
) | ||
|
||
envoy_cc_test_library( | ||
name = "utility_lib", | ||
hdrs = [ | ||
"utility.h", | ||
], | ||
deps = [ | ||
"//source/common/runtime:runtime_lib", | ||
], | ||
) | ||
|
||
envoy_cc_test( | ||
name = "runtime_impl_test", | ||
srcs = ["runtime_impl_test.cc"], | ||
|
@@ -31,6 +42,18 @@ envoy_cc_test( | |
], | ||
) | ||
|
||
envoy_cc_test( | ||
name = "runtime_flag_override_test", | ||
srcs = ["runtime_flag_override_test.cc"], | ||
args = [ | ||
"--runtime-feature-override-for-tests=envoy.reloadable_features.test_feature_false", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW, this seems like a generally useful feature for runtime configuration where people don't want to use the filesystem (CLI/gflags implementation like we discussed). Is it worth adding a TODO or opening a help wanted issue on that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, if possible can you rename There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you think other folks would use this? I thought from prior discussions you thought google was flag happy but most folks wanted to use the existing runtime. Not sure if removing test_feature_false helps - I've clarified the language in both tests to hopefully clarify things. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do think that Google is an outlier in using flags for this type of stuff, but I was just suggesting that this code might be generally useful to someone, so probably worth tracking somewhere in case it comes up again. I think it's totally fine to leave as test only code for now. Thanks for all the comments, it's much easier to understand for me now. |
||
], | ||
coverage = False, | ||
deps = [ | ||
"//source/common/runtime:runtime_lib", | ||
], | ||
) | ||
|
||
envoy_cc_test( | ||
name = "uuid_util_test", | ||
srcs = ["uuid_util_test.cc"], | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
#include "common/runtime/runtime_impl.h" | ||
|
||
#include "gmock/gmock.h" | ||
|
||
namespace Envoy { | ||
namespace Runtime { | ||
|
||
// In the envoy_cc_test declaration, the flag is set | ||
// "--runtime-feature-override-for-tests=envoy.reloadable_features.test_feature_false" | ||
TEST(RuntimeFlagOverrideTest, OverridesWork) { | ||
EXPECT_TRUE(Runtime::runtimeFeatureEnabled("envoy.reloadable_features.test_feature_false")); | ||
} | ||
|
||
} // namespace Runtime | ||
} // namespace Envoy |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#pragma once | ||
|
||
#include "common/runtime/runtime_features.h" | ||
|
||
namespace Envoy { | ||
namespace Runtime { | ||
|
||
class RuntimeFeaturesPeer { | ||
public: | ||
static bool addFeature(const std::string& feature) { | ||
return const_cast<Runtime::RuntimeFeatures*>(&Runtime::RuntimeFeaturesDefaults::get()) | ||
->enabled_features_.insert(feature) | ||
.second; | ||
} | ||
static void removeFeature(const std::string& feature) { | ||
const_cast<Runtime::RuntimeFeatures*>(&Runtime::RuntimeFeaturesDefaults::get()) | ||
->enabled_features_.erase(feature); | ||
} | ||
}; | ||
|
||
} // namespace Runtime | ||
} // namespace Envoy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a different namespace for this type of thing?
envoy.startup_features.
or similar? It's a little confusing that the docs point out that everything named like this is reloadable at runtime and then we talk about how it's not always true. :)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I need to remove the buffer mentions on this one. My thought for latching is if we latched at the hcm, or at a network::connection or some such, where the lifetime were transient. As called out on the buffer PR that's truly restart-only not reloadable.
I do want to support envoy.restart_features but I also need to talk to you about how to do that. In my priors, restart flags were done on the command line (only when the binary was brought up) and it was illegal to change then when pushing config (would cause that config to get rejected). Otherwise you do a config push with a new-but-buggy restart flag, the config "passes", you hot restart, and your fleet is down. Only with Envoy and hot restarts I'm less clear on how we push config which is restart only. My default plan would be to that test-only runtime tweaking flag to be a actual production runtime tweaking flag, so folks can change restart values at start-up and we can hard-fail attempts to reload them at runtime, but I recall you saying that most people don't do flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK yeah let's chat about that offline and maybe figure out in a follow up? I think for now maybe just clarify somehow that this should only be used for truly reloadable features and if there are features that should be startup configured it should be discussed? (Up to you how to phrase)