Commit

Promote ContinueReconciliationOnManualRollingUpdateFailure feature gate to beta (#10524)

Signed-off-by: Jakub Scholz <www@scholzj.com>
scholzj authored Sep 2, 2024
1 parent e7e8b11 commit 36f1109
Showing 15 changed files with 52 additions and 72 deletions.
@@ -5,7 +5,7 @@ jobs:
display_name: 'feature-gates-regression-bundle I. - kafka + oauth'
profile: 'azp_kafka_oauth'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -17,7 +17,7 @@ jobs:
display_name: 'feature-gates-regression-bundle II. - security'
profile: 'azp_security'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -29,7 +29,7 @@ jobs:
display_name: 'feature-gates-regression-bundle III. - dynconfig + tracing + watcher'
profile: 'azp_dynconfig_listeners_tracing_watcher'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -41,7 +41,7 @@ jobs:
display_name: 'feature-gates-regression-bundle IV. - operators'
profile: 'azp_operators'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -53,7 +53,7 @@ jobs:
display_name: 'feature-gates-regression-bundle V. - rollingupdate'
profile: 'azp_rolling_update_bridge'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -65,7 +65,7 @@ jobs:
display_name: 'feature-gates-regression-bundle VI. - connect + mirrormaker'
profile: 'azp_connect_mirrormaker'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -77,7 +77,7 @@ jobs:
display_name: 'feature-gates-regression-bundle VII. - remaining system tests'
profile: 'azp_remaining'
cluster_operator_install_type: 'bundle'
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
timeout: 360
releaseVersion: '${{ parameters.releaseVersion }}'
@@ -8,7 +8,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
@@ -22,7 +22,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
@@ -36,7 +36,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
@@ -50,7 +50,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
@@ -64,7 +64,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
@@ -78,7 +78,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
@@ -92,7 +92,7 @@ jobs:
cluster_operator_install_type: 'bundle'
timeout: 360
strimzi_rbac_scope: NAMESPACE
-strimzi_feature_gates: '+ContinueReconciliationOnManualRollingUpdateFailure'
+strimzi_feature_gates: '-ContinueReconciliationOnManualRollingUpdateFailure'
strimzi_use_node_pools_in_tests: "false"
releaseVersion: '${{ parameters.releaseVersion }}'
kafkaVersion: '${{ parameters.kafkaVersion }}'
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,8 @@
## 0.44.0

* Add the "Unmanaged" KafkaTopic status update.
+* The `ContinueReconciliationOnManualRollingUpdateFailure` feature gate moves to beta stage and is enabled by default.
+If needed, `ContinueReconciliationOnManualRollingUpdateFailure` can be disabled in the feature gates configuration in the Cluster Operator.

### Changes, deprecations and removals

@@ -44,7 +44,7 @@ public class ClusterOperatorConfigTest {
ENV_VARS.put(ClusterOperatorConfig.STRIMZI_KAFKA_MIRROR_MAKER_IMAGES, KafkaVersionTestUtils.getKafkaMirrorMakerImagesEnvVarString());
ENV_VARS.put(ClusterOperatorConfig.STRIMZI_KAFKA_MIRROR_MAKER_2_IMAGES, KafkaVersionTestUtils.getKafkaMirrorMaker2ImagesEnvVarString());
ENV_VARS.put(ClusterOperatorConfig.OPERATOR_NAMESPACE.key(), "operator-namespace");
-ENV_VARS.put(ClusterOperatorConfig.FEATURE_GATES.key(), "+ContinueReconciliationOnManualRollingUpdateFailure");
+ENV_VARS.put(ClusterOperatorConfig.FEATURE_GATES.key(), "-ContinueReconciliationOnManualRollingUpdateFailure");
ENV_VARS.put(ClusterOperatorConfig.DNS_CACHE_TTL.key(), "10");
ENV_VARS.put(ClusterOperatorConfig.POD_SECURITY_PROVIDER_CLASS.key(), "my.package.CustomPodSecurityProvider");
}
@@ -100,7 +100,7 @@ public void testEnvVars() {
assertThat(config.getOperationTimeoutMs(), is(30_000L));
assertThat(config.getConnectBuildTimeoutMs(), is(40_000L));
assertThat(config.getOperatorNamespace(), is("operator-namespace"));
-assertThat(config.featureGates().continueOnManualRUFailureEnabled(), is(true));
+assertThat(config.featureGates().continueOnManualRUFailureEnabled(), is(false));
assertThat(config.getDnsCacheTtlSec(), is(10));
assertThat(config.getPodSecurityProviderClass(), is("my.package.CustomPodSecurityProvider"));
}
@@ -117,6 +117,7 @@ public void testEnvVarsDefault() {
assertThat(config.getOperationTimeoutMs(), is(Long.parseLong(ClusterOperatorConfig.OPERATION_TIMEOUT_MS.defaultValue())));
assertThat(config.getOperatorNamespace(), is(nullValue()));
assertThat(config.getOperatorNamespaceLabels(), is(nullValue()));
+assertThat(config.featureGates().continueOnManualRUFailureEnabled(), is(true));
assertThat(config.getDnsCacheTtlSec(), is(Integer.parseInt(ClusterOperatorConfig.DNS_CACHE_TTL.defaultValue())));
assertThat(config.getPodSecurityProviderClass(), is(ClusterOperatorConfig.POD_SECURITY_PROVIDER_CLASS.defaultValue()));
}
@@ -785,14 +785,14 @@ public void testUserOperatorAndTopicOperatorNetworkPolicy() {
@ParallelTest
public void testFeatureGateEnvVars() {
ClusterOperatorConfig config = new ClusterOperatorConfig.ClusterOperatorConfigBuilder(ResourceUtils.dummyClusterOperatorConfig(), VERSIONS)
-.with(ClusterOperatorConfig.FEATURE_GATES.key(), "+ContinueReconciliationOnManualRollingUpdateFailure")
+.with(ClusterOperatorConfig.FEATURE_GATES.key(), "-ContinueReconciliationOnManualRollingUpdateFailure")
.build();

EntityOperator eo = EntityOperator.fromCrd(new Reconciliation("test", KAFKA.getKind(), KAFKA.getMetadata().getNamespace(), KAFKA.getMetadata().getName()), KAFKA, SHARED_ENV_PROVIDER, config);
Deployment dep = eo.generateDeployment(Map.of(), false, null, null);

-assertThat(dep.getSpec().getTemplate().getSpec().getContainers().get(0).getEnv().stream().filter(env -> "STRIMZI_FEATURE_GATES".equals(env.getName())).map(EnvVar::getValue).findFirst().orElseThrow(), is("+ContinueReconciliationOnManualRollingUpdateFailure"));
-assertThat(dep.getSpec().getTemplate().getSpec().getContainers().get(1).getEnv().stream().filter(env -> "STRIMZI_FEATURE_GATES".equals(env.getName())).map(EnvVar::getValue).findFirst().orElseThrow(), is("+ContinueReconciliationOnManualRollingUpdateFailure"));
+assertThat(dep.getSpec().getTemplate().getSpec().getContainers().get(0).getEnv().stream().filter(env -> "STRIMZI_FEATURE_GATES".equals(env.getName())).map(EnvVar::getValue).findFirst().orElseThrow(), is("-ContinueReconciliationOnManualRollingUpdateFailure"));
+assertThat(dep.getSpec().getTemplate().getSpec().getContainers().get(1).getEnv().stream().filter(env -> "STRIMZI_FEATURE_GATES".equals(env.getName())).map(EnvVar::getValue).findFirst().orElseThrow(), is("-ContinueReconciliationOnManualRollingUpdateFailure"));
}

////////////////////
@@ -357,12 +357,12 @@ public void testManualPodRollingUpdateWithPodSets(VertxTestContext context) {

@Test
public void testManualPodRollingUpdateWithPodSetsWithError1(VertxTestContext context) {
-testManualPodRollingUpdateWithPodSetsWithErrorConditions(context, "", true);
+testManualPodRollingUpdateWithPodSetsWithErrorConditions(context, "-ContinueReconciliationOnManualRollingUpdateFailure", true);
}

@Test
public void testManualPodRollingUpdateWithPodSetsWithError3(VertxTestContext context) {
-testManualPodRollingUpdateWithPodSetsWithErrorConditions(context, "+ContinueReconciliationOnManualRollingUpdateFailure", false);
+testManualPodRollingUpdateWithPodSetsWithErrorConditions(context, "", false);
}

private void testManualPodRollingUpdateWithPodSetsWithErrorConditions(VertxTestContext context, String featureGates, boolean expectError) {
@@ -433,25 +433,25 @@ vertx, new PlatformFeaturesAvailability(false, KUBERNETES_VERSION),
@Test
public void testManualPodRollingUpdateWithPodSetsWithError1(VertxTestContext context) {
testManualPodRollingUpdateWithPodSetsWithErrorConditions(
-context, false, true, "", true);
+context, false, true, "-ContinueReconciliationOnManualRollingUpdateFailure", true);
}

@Test
public void testManualPodRollingUpdateWithPodSetsWithError2(VertxTestContext context) {
testManualPodRollingUpdateWithPodSetsWithErrorConditions(
-context, true, false, "", true);
+context, true, false, "-ContinueReconciliationOnManualRollingUpdateFailure", true);
}

@Test
public void testManualPodRollingUpdateWithPodSetsWithError3(VertxTestContext context) {
testManualPodRollingUpdateWithPodSetsWithErrorConditions(
-context, false, true, "+ContinueReconciliationOnManualRollingUpdateFailure", false);
+context, false, true, "", false);
}

@Test
public void testManualPodRollingUpdateWithPodSetsWithError4(VertxTestContext context) {
testManualPodRollingUpdateWithPodSetsWithErrorConditions(
-context, true, false, "+ContinueReconciliationOnManualRollingUpdateFailure", false);
+context, true, false, "", false);
}

private void testManualPodRollingUpdateWithPodSetsWithErrorConditions(VertxTestContext context,
@@ -2230,11 +2230,12 @@ public void testFailingManualRollingUpdate(VertxTestContext context) {
ArgumentCaptor<KafkaConnect> connectCaptor = ArgumentCaptor.forClass(KafkaConnect.class);
when(mockConnectOps.updateStatusAsync(any(), connectCaptor.capture())).thenReturn(Future.succeededFuture());

+ClusterOperatorConfig coConfig = new ClusterOperatorConfig.ClusterOperatorConfigBuilder(ResourceUtils.dummyClusterOperatorConfig(), VERSIONS).with(ClusterOperatorConfig.FEATURE_GATES.key(), "-ContinueReconciliationOnManualRollingUpdateFailure").build();
KafkaConnectAssemblyOperator ops = new KafkaConnectAssemblyOperator(
vertx,
new PlatformFeaturesAvailability(false, KUBERNETES_VERSION),
supplier,
-ResourceUtils.dummyClusterOperatorConfig()
+coConfig
);

Checkpoint async = context.checkpoint();
@@ -2320,12 +2321,11 @@ public void testManualRollingUpdateWithSuppressedFailure(VertxTestContext contex
ArgumentCaptor<KafkaConnect> connectCaptor = ArgumentCaptor.forClass(KafkaConnect.class);
when(mockConnectOps.updateStatusAsync(any(), connectCaptor.capture())).thenReturn(Future.succeededFuture());

-ClusterOperatorConfig coConfig = new ClusterOperatorConfig.ClusterOperatorConfigBuilder(ResourceUtils.dummyClusterOperatorConfig(), VERSIONS).with(ClusterOperatorConfig.FEATURE_GATES.key(), "+ContinueReconciliationOnManualRollingUpdateFailure").build();
KafkaConnectAssemblyOperator ops = new KafkaConnectAssemblyOperator(
vertx,
new PlatformFeaturesAvailability(false, KUBERNETES_VERSION),
supplier,
-coConfig
+ResourceUtils.dummyClusterOperatorConfig()
);

Checkpoint async = context.checkpoint();
@@ -1523,11 +1523,12 @@ public void testFailingManualRollingUpdate(VertxTestContext context) {
when(mockConnectClient.list(any(), anyString(), anyInt())).thenReturn(Future.succeededFuture(emptyList()));
when(mockConnectClient.updateConnectLoggers(any(), anyString(), anyInt(), anyString(), any(OrderedProperties.class))).thenReturn(Future.succeededFuture());

+ClusterOperatorConfig coConfig = new ClusterOperatorConfig.ClusterOperatorConfigBuilder(ResourceUtils.dummyClusterOperatorConfig(), VERSIONS).with(ClusterOperatorConfig.FEATURE_GATES.key(), "-ContinueReconciliationOnManualRollingUpdateFailure").build();
KafkaMirrorMaker2AssemblyOperator ops = new KafkaMirrorMaker2AssemblyOperator(
vertx,
new PlatformFeaturesAvailability(false, KUBERNETES_VERSION),
supplier,
-ResourceUtils.dummyClusterOperatorConfig(),
+coConfig,
x -> mockConnectClient
);

@@ -1619,12 +1620,11 @@ public void testManualRollingUpdateWithSuppressedFailure(VertxTestContext contex
when(mockConnectClient.list(any(), anyString(), anyInt())).thenReturn(Future.succeededFuture(emptyList()));
when(mockConnectClient.updateConnectLoggers(any(), anyString(), anyInt(), anyString(), any(OrderedProperties.class))).thenReturn(Future.succeededFuture());

-ClusterOperatorConfig coConfig = new ClusterOperatorConfig.ClusterOperatorConfigBuilder(ResourceUtils.dummyClusterOperatorConfig(), VERSIONS).with(ClusterOperatorConfig.FEATURE_GATES.key(), "+ContinueReconciliationOnManualRollingUpdateFailure").build();
KafkaMirrorMaker2AssemblyOperator ops = new KafkaMirrorMaker2AssemblyOperator(
vertx,
new PlatformFeaturesAvailability(false, KUBERNETES_VERSION),
supplier,
-coConfig,
+ResourceUtils.dummyClusterOperatorConfig(),
x -> mockConnectClient
);

@@ -71,6 +71,6 @@ kubectl annotate pod <cluster_name>-mirrormaker2-<index_number> strimzi.io/manua
A rolling update of the annotated `Pod` is triggered, as long as the annotation was detected by the reconciliation process.
When the rolling update of a pod is complete, the annotation is automatically removed from the `Pod`.

-NOTE: If the `ContinueReconciliationOnManualRollingUpdateFailure` feature gate is enabled, reconciliation continues even if the manual rolling update of the cluster fails.
+NOTE: Unless the `ContinueReconciliationOnManualRollingUpdateFailure` feature gate is disabled, reconciliation continues even if the manual rolling update of the cluster fails.
This allows the Cluster Operator to recover from certain rectifiable situations that can be addressed later in the reconciliation.
For example, it can recreate a missing Persistent Volume Claim (PVC) or Persistent Volume (PV) that caused the update to fail.
@@ -30,7 +30,8 @@ Alpha and beta stage features are removed if they do not prove to be useful.
* The `UseKRaft` feature gate moved to GA stage in Strimzi 0.42.
It is now permanently enabled and cannot be disabled.
To use KRaft (ZooKeeper-less Apache Kafka), you still need to use the `strimzi.io/kraft: enabled` annotation on the `Kafka` custom resources or migrate from an existing ZooKeeper-based cluster.
-* The `ContinueReconciliationOnManualRollingUpdateFailure` feature was introduced in Strimzi 0.41 and is disabled by default.
+* The `ContinueReconciliationOnManualRollingUpdateFailure` feature was introduced in Strimzi 0.41 and moved to beta stage in Strimzi 0.44.0.
+It is now enabled by default, but can be disabled if needed.

NOTE: Feature gates might be removed when they reach GA. This means that the feature was incorporated into the Strimzi core features and can no longer be disabled.

@@ -80,8 +81,8 @@ NOTE: Feature gates might be removed when they reach GA. This means that the fea

¦`ContinueReconciliationOnManualRollingUpdateFailure`
¦0.41
-¦0.43 (planned)
-¦n/a
+¦0.44
+¦0.47 (planned)

|===
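
For illustration only, here is a minimal sketch of how the gate might be switched off now that it defaults to enabled. It assumes the Cluster Operator runs as a Deployment and reads its gates from the `STRIMZI_FEATURE_GATES` environment variable, which is the variable name the Entity Operator test in this commit asserts on. The container name and the surrounding Deployment layout are assumptions and are not part of this change.

```yaml
# Hypothetical excerpt of a Cluster Operator Deployment (layout assumed).
spec:
  template:
    spec:
      containers:
        - name: strimzi-cluster-operator   # assumed container name
          env:
            - name: STRIMZI_FEATURE_GATES
              # A leading "-" disables the gate; a leading "+" enables it.
              value: "-ContinueReconciliationOnManualRollingUpdateFailure"
```

Leaving the variable unset keeps the beta default, under which the gate is enabled (this is what `testEnvVarsDefault` checks).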
