Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: codifying runtime guarded features #6134

Merged
merged 11 commits into from
Mar 13, 2019
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 50 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -137,6 +137,56 @@ maximize the chances of your PR being merged.
[envoy-filter-example](https://github.com/envoyproxy/envoy-filter-example) (for example making a new
branch so that CI can pass) it is your responsibility to follow through with merging those
changes back to master once the CI dance is done.
* If your PR is a high risk change, the reviewer may ask that you runtime guard
it. See the section on runtime guarding below.


# Runtime guarding

Some high risk changes in Envoy are deemed worthy of runtime guarding. Instead of just replacing
old code with new code, both code paths are supported for between one Envoy release (if it is
guarded due to performance concerns) and a full deprecation cycle (if it is a high risk behavioral
change).

The canonical way to runtime guard a feature is
```
if (Runtime::runtimeFeatureEnabled("envoy.reloadable_features.my_feature_name")) {
[new code path]
} else {
[old_code_path]
}
```
Runtime guarded features named with the "envoy.reloadable_features." prefix must be safe to flip
true or false on running Envoy instances. In some situations, for example the buffer rewrite in
[#5441](https://github.com/envoyproxy/envoy/pull/5441), it may make more sense to
latch the value in a member variable on class creation, for example:

```
bool use_new_code_path_ =
Runtime::runtimeFeatureEnabled("envoy.reloadable_features.my_feature_name")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a different namespace for this type of thing? envoy.startup_features. or similar? It's a little confusing that the docs point out that everything named like this is reloadable at runtime and then we talk about how it's not always true. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I need to remove the buffer mentions on this one. My thought for latching is if we latched at the hcm, or at a network::connection or some such, where the lifetime were transient. As called out on the buffer PR that's truly restart-only not reloadable.

I do want to support envoy.restart_features but I also need to talk to you about how to do that. In my priors, restart flags were done on the command line (only when the binary was brought up) and it was illegal to change then when pushing config (would cause that config to get rejected). Otherwise you do a config push with a new-but-buggy restart flag, the config "passes", you hot restart, and your fleet is down. Only with Envoy and hot restarts I'm less clear on how we push config which is restart only. My default plan would be to that test-only runtime tweaking flag to be a actual production runtime tweaking flag, so folks can change restart values at start-up and we can hard-fail attempts to reload them at runtime, but I recall you saying that most people don't do flags.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK yeah let's chat about that offline and maybe figure out in a follow up? I think for now maybe just clarify somehow that this should only be used for truly reloadable features and if there are features that should be startup configured it should be discussed? (Up to you how to phrase)

```

Runtime guarded features may either set true (running the new code by default) in the initial PR,
after a testing interval, or during the next release cycle, at the PR author's and reviewing
maintainer's discretion. Generally all runtime guarded features will be set true when a
release is cut, and the old code path will be deprecated at that time. Runtime features
are set true by default by inclusion in
[source/common/runtime/runtime_features.h](https://github.com/envoyproxy/envoy/blob/master/source/common/runtime/runtime_features.h)

There are four suggested options for testing new runtime features:

1. Create a per-test Runtime::LoaderSingleton as done in [DeprecatedFieldsTest.IndividualFieldDisallowedWithRuntimeOverride](https://github.com/envoyproxy/envoy/blob/master/test/common/protobuf/utility_test.cc)
2. Create a [parameterized test](https://github.com/google/googletest/blob/master/googletest/docs/advanced.md#how-to-write-value-parameterized-tests)
where the set up of the test sets the new runtime value explicitly to
GetParam() as outlined in (1).
3. Set up integration tests with custom runtime defaults as documented in the
[integration test README](https://github.com/envoyproxy/envoy/blob/master/test/integration/README.md)
4. Run a given unit test with the new runtime value explicitly set true as done
for [runtime_flag_override_test](https://github.com/envoyproxy/envoy/blob/master/test/common/runtime/BUILD)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it also worth specifically calling out parameterized tests as a useful tool here? I find that particularly useful when the logical behavior is supposed to be the same regardless of flag value (ie, goal is pure refactoring or performance improvement).


Runtime code is held to the same standard as regular Envoy code, so both the old
path and the new should have 100% coverage both with the feature defaulting true
and false.

# PR review policy for maintainers

Expand Down
2 changes: 1 addition & 1 deletion docs/root/configuration/runtime.rst
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@ Users are encouraged to go to :repo:`DEPRECATED.md <DEPRECATED.md>` to see how t
migrate to the new code path and make sure it is suitable for their use case.

In the second phase the message and filename will be added to
:repo:`runtime_features.h <source/common/runtime/runtime_features.h>`
:repo:`runtime_features.cc <source/common/runtime/runtime_features.cc>`
and use of that configuration field will cause the config to be rejected by default.
This fail-by-default mode can be overridden in runtime configuration by setting
envoy.deprecated_features.filename.proto:fieldname to true. For example, for a deprecated field
Expand Down
7 changes: 7 additions & 0 deletions include/envoy/runtime/runtime.h
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,13 @@ class Snapshot {
// configuration of "false" in runtime config.
virtual bool deprecatedFeatureEnabled(const std::string& key) const PURE;

// Returns true if a runtime feature is enabled.
//
// Runtime features are used to easily allow switching between old and new code paths for high
// risk changes. The intent is for the old code path to be short lived - the old code path is
// deprecated as the feature is defaulted true, and removed with the following Envoy release.
virtual bool runtimeFeatureEnabled(const std::string& key) const PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use absl::string_view? It will avoid constructing std::string when caller call this with literals. but might require underlying methods take string_view.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written I think someone could
static std::string feature = "envoy.reloadable_features.my_feature_name"));
if (Runtime::runtimeFeatureEnabled(feature) { ...} and avoid the string conversion on hash map lookup each time.

If they do the inefficient
if (Runtime::runtimeFeatureEnabled("envoy.reloadable_features.my_feature_name")
they're creating a string every time.

If I switch to string_view I think the map lookup will convert the string piece to the string on every single call, and there's no way to avoid it, so I think that's strictly less performant?

I could update the instructions to encourage static string on the canonical non-latch path if we think that makes more sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as it is an absl::flat_hash_{map,set} I think heterogenous lookup means that there is no string construction in the map lookup, even though the type in the map is std::string: https://abseil.io/tips/144

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL! Ok, got that to work for the flat hash, but unfortunately to string_view all the way up the API we need to also change Runtime::EntryMap in include/envoy/runtime/runtime.h to have a comparator. I'd prefer to land this as-is and do a separate PR converting to string piece, especially as none of the call sites will need to change. What do you think of a TODO and a follow-up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fine to leave as TODO. static std::string feature = "envoy.reloadable_features.my_feature_name"; is a static non-POD object initialization so there might be static initialization fiasco. string_view constructor is safe as it is constexpr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fine to me.


/**
* Test if a feature is enabled using the built in random generator. This is done by generating
* a random number in the range 0-99 and seeing if this number is < the value stored in the
Expand Down
2 changes: 1 addition & 1 deletion source/common/protobuf/utility.h
Original file line number Diff line number Diff line change
Expand Up @@ -195,7 +195,7 @@ class MessageUtil {
* @param message message to validate.
* @param loader optional a pointer to the runtime loader for live deprecation status.
* @throw ProtoValidationException if deprecated fields are used and listed
* Runtime::DisallowedFeatures
* in disallowed_features in runtime_features.h
*/
static void
checkForDeprecation(const Protobuf::Message& message,
Expand Down
5 changes: 4 additions & 1 deletion source/common/runtime/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,10 @@ envoy_package()

envoy_cc_library(
name = "runtime_lib",
srcs = ["runtime_impl.cc"],
srcs = [
"runtime_features.cc",
"runtime_impl.cc",
],
hdrs = [
"runtime_features.h",
"runtime_impl.h",
Expand Down
37 changes: 37 additions & 0 deletions source/common/runtime/runtime_features.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#include "common/runtime/runtime_features.h"

namespace Envoy {
namespace Runtime {

// Add additional features here to enable the new code paths by default.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to harp on this, but I think what would help me/readers here is a short description of the behavior of what an entry in runtime_features and disallowed_features do. I know this is described elsewhere, but it's hard enough to keep track of (and important enough) that IMO we should replicate a short version here?

//
// These features should not be overridden with run-time guards without a bug
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are talking to end users here, right? Can you clarify slightly and say something like, "If issues are found that that require a feature to be disabled, it should be reported upstream ASAP, ..."

// being filed on github as once high risk features are true by default, the
// old code path will be removed with the next release.
constexpr const char* runtime_features[] = {
// Enabled
"envoy.reloadable_features.test_feature_true",
// Disabled
// "envoy.reloadable_features.test_feature_false",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we mean for this to be commented out here? If this is a documented way of defaulting to disabled, can you add more comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking at some point it'd be nice to have all flags documented in that file, so if folks want to preemptively flip they can do so.

I'll remove it and wait until we have utils and see if it still makes sense

};

// TODO(alyssawilk) handle deprecation of reloadable_features. Ideally runtime
// override of a deprecated feature will log(warn) on runtime-load if not deprecated
// and hard-fail once it has been deprecated.
constexpr const char* disallowed_features[] = {
// Acts as both a test entry for deprecated.proto and a marker for the Envoy
// deprecation scripts.
"envoy.deprecated_features.deprecated.proto:is_deprecated_fatal",
};

RuntimeFeatures::RuntimeFeatures() {
for (auto& feature : disallowed_features) {
disallowed_features_.insert(feature);
}
for (auto& feature : runtime_features) {
enabled_features_.insert(feature);
}
}

} // namespace Runtime
} // namespace Envoy
29 changes: 16 additions & 13 deletions source/common/runtime/runtime_features.h
Original file line number Diff line number Diff line change
Expand Up @@ -9,29 +9,32 @@
namespace Envoy {
namespace Runtime {

const char* disallowed_features[] = {
// Acts as both a test entry for deprecated.proto and a marker for the Envoy
// deprecation scripts.
"envoy.deprecated_features.deprecated.proto:is_deprecated_fatal",
};

class DisallowedFeatures {
// TODO(alyssawilk) convert these to string view.
class RuntimeFeatures {
public:
DisallowedFeatures() {
for (auto& feature : disallowed_features) {
disallowed_features_.insert(feature);
}
}
RuntimeFeatures();

// This tracks proto configured features, to determine if a given deprecated
// feature is still allowed, or has been made fatal-by-default per the Envoy
// deprecation process.
bool disallowedByDefault(const std::string& feature) const {
return disallowed_features_.find(feature) != disallowed_features_.end();
}

// This tracks config-guarded code paths, to determine if a given
// runtime-guarded-code-path has the new code run by default or the old code.
bool enabledByDefault(const std::string& feature) const {
return enabled_features_.find(feature) != enabled_features_.end();
}

private:
friend class RuntimeFeaturesPeer;

absl::flat_hash_set<std::string> disallowed_features_;
absl::flat_hash_set<std::string> enabled_features_;
};

using DisallowedFeaturesDefaults = ConstSingleton<DisallowedFeatures>;
using RuntimeFeaturesDefaults = ConstSingleton<RuntimeFeatures>;

} // namespace Runtime
} // namespace Envoy
24 changes: 23 additions & 1 deletion source/common/runtime/runtime_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,16 @@
namespace Envoy {
namespace Runtime {

bool runtimeFeatureEnabled(const std::string& feature) {
ASSERT(absl::StartsWith(feature, "envoy.reloadable_features"));
if (Runtime::LoaderSingleton::getExisting()) {
return Runtime::LoaderSingleton::getExisting()->snapshot().runtimeFeatureEnabled(feature);
}
ENVOY_LOG_TO_LOGGER(Envoy::Logger::Registry::getLog(Envoy::Logger::Id::runtime), warn,
"Unable to use runtime singleton for feature {}", feature);
return RuntimeFeaturesDefaults::get().enabledByDefault(feature);
}

const size_t RandomGeneratorImpl::UUID_LENGTH = 36;

uint64_t RandomGeneratorImpl::random() {
Expand Down Expand Up @@ -152,7 +162,7 @@ bool SnapshotImpl::deprecatedFeatureEnabled(const std::string& key) const {
bool stored = getBoolean(key, allowed);
// If not, the default value is based on disallowedByDefault.
if (!stored) {
allowed = !DisallowedFeaturesDefaults::get().disallowedByDefault(key);
allowed = !RuntimeFeaturesDefaults::get().disallowedByDefault(key);
}

if (!allowed) {
Expand All @@ -165,6 +175,18 @@ bool SnapshotImpl::deprecatedFeatureEnabled(const std::string& key) const {
return true;
}

bool SnapshotImpl::runtimeFeatureEnabled(const std::string& key) const {
bool enabled = false;
// See if this value is explicitly set as a runtime boolean.
bool stored = getBoolean(key, enabled);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can just be inlined in the if statement below?

// If not, the default value is based on runtime_features.
if (!stored) {
enabled = RuntimeFeaturesDefaults::get().enabledByDefault(key);
}

return enabled;
}

bool SnapshotImpl::featureEnabled(const std::string& key, uint64_t default_value,
uint64_t random_value, uint64_t num_buckets) const {
return random_value % num_buckets < std::min(getInteger(key, default_value), num_buckets);
Expand Down
3 changes: 3 additions & 0 deletions source/common/runtime/runtime_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,8 @@
namespace Envoy {
namespace Runtime {

bool runtimeFeatureEnabled(const std::string& feature);

using RuntimeSingleton = ThreadSafeSingleton<Loader>;

/**
Expand Down Expand Up @@ -72,6 +74,7 @@ class SnapshotImpl : public Snapshot,

// Runtime::Snapshot
bool deprecatedFeatureEnabled(const std::string& key) const override;
bool runtimeFeatureEnabled(const std::string& key) const override;
bool featureEnabled(const std::string& key, uint64_t default_value, uint64_t random_value,
uint64_t num_buckets) const override;
bool featureEnabled(const std::string& key, uint64_t default_value) const override;
Expand Down
23 changes: 23 additions & 0 deletions test/common/runtime/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ licenses(["notice"]) # Apache 2
load(
"//bazel:envoy_build_system.bzl",
"envoy_cc_test",
"envoy_cc_test_library",
"envoy_package",
)

Expand All @@ -15,6 +16,16 @@ filegroup(
srcs = glob(["test_data/**"]),
)

envoy_cc_test_library(
name = "utility_lib",
hdrs = [
"utility.h",
],
deps = [
"//source/common/runtime:runtime_lib",
],
)

envoy_cc_test(
name = "runtime_impl_test",
srcs = ["runtime_impl_test.cc"],
Expand All @@ -31,6 +42,18 @@ envoy_cc_test(
],
)

envoy_cc_test(
name = "runtime_flag_override_test",
srcs = ["runtime_flag_override_test.cc"],
args = [
"--runtime-feature-override-for-tests=envoy.reloadable_features.test_feature_false",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this seems like a generally useful feature for runtime configuration where people don't want to use the filesystem (CLI/gflags implementation like we discussed). Is it worth adding a TODO or opening a help wanted issue on that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if possible can you rename envoy.reloadable_features.test_feature_false to something else? I found it pretty confusing to try to sort out the logic with the built in envoy.reloadable_features.test_feature_true. Maybe more comments would help also or just a different name like envoy.reloadable_features.injected_test_feature and envoy.reloadable_features.built_in_test_feature or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think other folks would use this? I thought from prior discussions you thought google was flag happy but most folks wanted to use the existing runtime.
I think if we're going to use it out of test we have to allow for feature=true,feature2=false which I can do pretty easily.

Not sure if removing test_feature_false helps - I've clarified the language in both tests to hopefully clarify things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that Google is an outlier in using flags for this type of stuff, but I was just suggesting that this code might be generally useful to someone, so probably worth tracking somewhere in case it comes up again. I think it's totally fine to leave as test only code for now.

Thanks for all the comments, it's much easier to understand for me now.

],
coverage = False,
deps = [
"//source/common/runtime:runtime_lib",
],
)

envoy_cc_test(
name = "uuid_util_test",
srcs = ["uuid_util_test.cc"],
Expand Down
15 changes: 15 additions & 0 deletions test/common/runtime/runtime_flag_override_test.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
#include "common/runtime/runtime_impl.h"

#include "gmock/gmock.h"

namespace Envoy {
namespace Runtime {

// In the envoy_cc_test declaration, the flag is set
// "--runtime-feature-override-for-tests=envoy.reloadable_features.test_feature_false"
TEST(RuntimeFlagOverrideTest, OverridesWork) {
EXPECT_TRUE(Runtime::runtimeFeatureEnabled("envoy.reloadable_features.test_feature_false"));
}

} // namespace Runtime
} // namespace Envoy
17 changes: 17 additions & 0 deletions test/common/runtime/runtime_impl_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,14 @@ TEST_F(DiskBackedLoaderImplTest, All) {
// File1 is not a boolean.
EXPECT_EQ(false, snapshot->getBoolean("file1", value));

// Feature defaults.
EXPECT_EQ(false, snapshot->runtimeFeatureEnabled("envoy.reloadable_features.test_feature_false"));
EXPECT_EQ(true, snapshot->runtimeFeatureEnabled("envoy.reloadable_features.test_feature_true"));

// Feature defaults via helper function.
EXPECT_EQ(false, runtimeFeatureEnabled("envoy.reloadable_features.test_feature_false"));
EXPECT_EQ(true, runtimeFeatureEnabled("envoy.reloadable_features.test_feature_true"));

// Files with comments.
EXPECT_EQ(123UL, loader->snapshot().getInteger("file5", 1));
EXPECT_EQ("/home#about-us", loader->snapshot().get("file6"));
Expand Down Expand Up @@ -297,5 +305,14 @@ TEST_F(DiskLayerTest, Loop) {
EnvoyException, "Walk recursion depth exceded 16");
}

TEST(NoRuntime, FeatureEnabled) {
// Make sure the registry is not set up.
ASSERT_TRUE(Runtime::LoaderSingleton::getExisting() == nullptr);

// Feature defaults should still work.
EXPECT_EQ(false, runtimeFeatureEnabled("envoy.reloadable_features.test_feature_false"));
EXPECT_EQ(true, runtimeFeatureEnabled("envoy.reloadable_features.test_feature_true"));
}

} // namespace Runtime
} // namespace Envoy
22 changes: 22 additions & 0 deletions test/common/runtime/utility.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
#pragma once

#include "common/runtime/runtime_features.h"

namespace Envoy {
namespace Runtime {

class RuntimeFeaturesPeer {
public:
static bool addFeature(const std::string& feature) {
return const_cast<Runtime::RuntimeFeatures*>(&Runtime::RuntimeFeaturesDefaults::get())
->enabled_features_.insert(feature)
.second;
}
static void removeFeature(const std::string& feature) {
const_cast<Runtime::RuntimeFeatures*>(&Runtime::RuntimeFeaturesDefaults::get())
->enabled_features_.erase(feature);
}
};

} // namespace Runtime
} // namespace Envoy
1 change: 1 addition & 0 deletions test/mocks/runtime/mocks.h
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ class MockSnapshot : public Snapshot {
~MockSnapshot() override;

MOCK_CONST_METHOD1(deprecatedFeatureEnabled, bool(const std::string& key));
MOCK_CONST_METHOD1(runtimeFeatureEnabled, bool(const std::string& key));
MOCK_CONST_METHOD2(featureEnabled, bool(const std::string& key, uint64_t default_value));
MOCK_CONST_METHOD3(featureEnabled,
bool(const std::string& key, uint64_t default_value, uint64_t random_value));
Expand Down
2 changes: 2 additions & 0 deletions test/test_common/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,8 @@ envoy_cc_test_library(
"//source/common/json:json_loader_lib",
"//source/common/network:utility_lib",
"//source/server:options_lib",
"//test/common/runtime:utility_lib",
"//test/mocks/server:server_mocks",
],
)

Expand Down
Loading