Logstash - add ability to reload pipeline(s) without triggering full pod restart #6674
@robbavey I tried to run logstash.yaml sample and update the Secret
@kaisecheng It takes a surprisingly long time - I've seen it take up to 3 minutes for the change to get picked up. For manual testing, I typically make and apply the change, then watch the pod logs.
This could be due to the kubelet's sync interval, which is 60 seconds plus some jitter, and I guess there is probably also some delay on the Logstash side before the change is detected in the filesystem. We use a trick for Elasticsearch to force an immediate sync for the secrets where timely sync is important: we annotate the Pods (we use a timestamp, but the actual annotation is irrelevant), which forces a sync. I am not sure if pipeline changes need such urgent propagation to warrant the extra complexity, but I thought I'd mention it.
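The annotation trick described in the comment above could look roughly like the following. This is only a sketch, not the operator's code: the annotation key `example.com/last-updated` and the helper `annotationPatch` are hypothetical, and sending the result to the API server as a merge patch on each Pod is an assumption about how one might apply it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// annotationPatch builds a JSON merge patch body that sets a single
// annotation on an object's metadata. The key and value are supplied by
// the caller; nothing here talks to a cluster.
func annotationPatch(key, value string) ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"annotations": map[string]string{key: value},
		},
	}
	return json.Marshal(patch)
}

func main() {
	// Hypothetical annotation key; per the comment above, the actual
	// annotation is irrelevant as long as its value changes.
	const key = "example.com/last-updated"

	body, err := annotationPatch(key, time.Now().UTC().Format(time.RFC3339Nano))
	if err != nil {
		panic(err)
	}
	// In a real controller this body would be sent as a merge patch for
	// every Pod that mounts the updated Secret.
	fmt.Println(string(body))
}
```

Using a timestamp as the value guarantees the annotation differs on every update, which is what prompts the sync.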
Thanks for taking care of the pipeline reload issue. It works as expected. I have tested pipeline updates through pipelines and pipelinesRef. Both take 1 ~ 2 minutes locally to reflect the change. The e2e test looks good. I ran it with export E2E_TAGS=e2e; make e2e-local TESTS_MATCH=TestPipelineConfigLogstash.
The only question on my mind is whether we should enable config.reload.automatic: true in ECK by default, because it is not enabled in docker or in any distribution.
@kaisecheng It's a good question, and worth discussing - I think when a change is made to a
My thoughts here are that a change to a definition in the CRD, such as a change to a This change allows that change to be acted upon with as little disruption as we can get away with - we can now limit the change to pipeline(s), rather than restarting the whole pod. If we don't set But, let's discuss
Agree that having pipeline auto-reload is a better experience. Sadly, it is not the default in Logstash. When Users may expect pod restart when pipeline changes, just like changing in I am still not sure whether changing the default reload behavior only in ECK is right, but open to this change.
@barkbay @pebrc Ready for review by ECK team. I implemented the suggested optimization to speed up pipeline loading - thanks for the tip! @kaisecheng and I discussed the
buildkite test this -f p=gke,t=TestLogstashPipelineReload -m s=8.7.0
LGTM
diff --git a/pkg/controller/common/reconciler/secret.go b/pkg/controller/common/reconciler/secret.go
index 0b6026f87..50004fd80 100644
--- a/pkg/controller/common/reconciler/secret.go
+++ b/pkg/controller/common/reconciler/secret.go
@@ -30,11 +30,17 @@ const (
SoftOwnerKindLabel = "eck.k8s.elastic.co/owner-kind"
)
+func WithPostUpdate(f func()) func(p *Params) {
+ return func(p *Params) {
+ p.PostUpdate = f
+ }
+}
+
// ReconcileSecret creates or updates the actual secret to match the expected one.
// Existing annotations or labels that are not expected are preserved.
-func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, owner client.Object) (corev1.Secret, error) {
+func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, owner client.Object, opts ...func(*Params)) (corev1.Secret, error) {
var reconciled corev1.Secret
- if err := ReconcileResource(Params{
+ params := Params{
Context: ctx,
Client: c,
Owner: owner,
@@ -54,7 +60,11 @@ func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret,
reconciled.Annotations = maps.Merge(reconciled.Annotations, expected.Annotations)
reconciled.Data = expected.Data
},
- }); err != nil {
+ }
+ for _, opt := range opts {
+ opt(&params)
+ }
+ if err := ReconcileResource(params); err != nil {
return corev1.Secret{}, err
}
return reconciled, nil
diff --git a/pkg/controller/logstash/pipeline.go b/pkg/controller/logstash/pipeline.go
index 6cbfee388..447ed7b8b 100644
--- a/pkg/controller/logstash/pipeline.go
+++ b/pkg/controller/logstash/pipeline.go
@@ -41,7 +41,13 @@ func reconcilePipeline(params Params) error {
},
}
- if err := reconcileSecretWithFastUpdate(params, expected); err != nil {
+ if _, err := reconciler.ReconcileSecret(params.Context, params.Client, expected, &params.Logstash,
+ reconciler.WithPostUpdate(func() {
+ annotation.MarkPodsAsUpdated(params.Context, params.Client,
+ client.InNamespace(params.Logstash.Namespace),
+ NewLabelSelectorForLogstash(params.Logstash),
+ )
+ })); err != nil {
return err
}
return nil
If we want to reuse the existing secret reconciliation, we could add a slice of option functions at the end, as in the diff above.
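For readers less familiar with the option-function pattern this suggestion relies on, here is a minimal, self-contained sketch. The `Params` type and `reconcile` function below are simplified stand-ins invented for illustration, not the real ECK reconciler; only the shape of `WithPostUpdate` mirrors the diff.

```go
package main

import "fmt"

// Params is a simplified stand-in for the reconciler parameters.
type Params struct {
	Name       string
	PostUpdate func() // optional hook, run only after an actual update
}

// Option mutates Params before reconciliation starts.
type Option func(*Params)

// WithPostUpdate registers a hook to run after a successful update.
func WithPostUpdate(f func()) Option {
	return func(p *Params) {
		p.PostUpdate = f
	}
}

// reconcile is hypothetical: it pretends to update a resource when
// changed is true, then invokes the hook if one was registered.
func reconcile(name string, changed bool, opts ...Option) Params {
	p := Params{Name: name}
	for _, opt := range opts {
		opt(&p)
	}
	if changed && p.PostUpdate != nil {
		p.PostUpdate()
	}
	return p
}

func main() {
	calls := 0
	hook := WithPostUpdate(func() { calls++ })

	reconcile("pipeline-secret", true, hook)  // updated: hook runs
	reconcile("pipeline-secret", false, hook) // unchanged: hook skipped
	reconcile("other-secret", true)           // no options: existing callers unaffected

	fmt.Println("post-update hook calls:", calls)
}
```

Because the options are variadic, existing call sites keep compiling without changes, which is what makes this a low-risk way to extend a shared helper.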
Co-authored-by: Peter Brachwitz <peter.brachwitz@gmail.com>
Damn, I reviewed this two days ago but didn't submit it, sorry.
Looks good! I spotted a few minor things, nothing that blocks the merge.
pkg/controller/logstash/pipeline.go
Outdated
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/labels"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/reconciler"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/tracing"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/logstash/pipelines"

"github.com/elastic/cloud-on-k8s/v2/pkg/utils/maps"
Nit: group imports (max 3 groups: stdlib / external deps / internal deps).
pkg/controller/logstash/pipeline.go
Outdated
// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in
-// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
-// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in
+// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
+// to trigger a quicker reload of the updated secret rather than waiting for the kubelet sync to kick in.
// We intentionally DO NOT pass the configHash here. We don't want to consider the pipeline definitions in the
// hash of the config to ensure that a pipeline change does not automatically trigger a restart
// of the pod, but allows Logstash's automatic reload of pipelines to take place
if err := reconcilePipeline(params); err != nil {
Should we pass the configHash when config.reload.automatic equals false?
It's a good question. I'm erring on the side of 'no' at the moment, but I think this is something that could change after the technical preview depending on feedback.
My reasoning on this is that the false (default) value of non-k8s logstash doesn't react to pipeline changes at all, and to change this semantic to restart logstash completely on pipeline changes feels like very different behaviour.
Thinking about how we could add flexibility, I wonder if we might want to introduce something for ECK here, along the lines of: config.reload.restart_policy: detected_only|all|none, which would either set config.reload.automatic: true for detected_only, and false for all or none, passing the configHash if the value is all, and not if it is none.
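To make the proposed mapping concrete, it can be sketched as a small lookup. This is purely illustrative: `config.reload.restart_policy` is only a proposal in this thread, the Go names below are hypothetical, and treating detected_only as not feeding pipelines into the configHash is an assumption based on the behaviour this PR implements.

```go
package main

import "fmt"

// RestartPolicy models the proposed config.reload.restart_policy values.
type RestartPolicy string

const (
	DetectedOnly RestartPolicy = "detected_only"
	All          RestartPolicy = "all"
	None         RestartPolicy = "none"
)

// reloadBehaviour captures the two knobs the proposal would control.
type reloadBehaviour struct {
	ReloadAutomatic bool // value written for config.reload.automatic
	HashPipelines   bool // whether pipeline definitions feed the configHash
}

// behaviourFor translates a policy into the two knobs.
func behaviourFor(p RestartPolicy) (reloadBehaviour, error) {
	switch p {
	case DetectedOnly:
		// Logstash reloads changed pipelines itself; no pod restart.
		return reloadBehaviour{ReloadAutomatic: true, HashPipelines: false}, nil
	case All:
		// Any pipeline change alters the hash and so restarts the pod.
		return reloadBehaviour{ReloadAutomatic: false, HashPipelines: true}, nil
	case None:
		// Pipeline changes are ignored until the next restart.
		return reloadBehaviour{ReloadAutomatic: false, HashPipelines: false}, nil
	default:
		return reloadBehaviour{}, fmt.Errorf("unknown restart policy %q", p)
	}
}

func main() {
	for _, p := range []RestartPolicy{DetectedOnly, All, None} {
		b, err := behaviourFor(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: reload.automatic=%t hashPipelines=%t\n", p, b.ReloadAutomatic, b.HashPipelines)
	}
}
```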
cc @flexitrev, @roaksoax, @jsvd
Co-authored-by: Thibault Richard <thbkrkr@users.noreply.github.com>
This commit adds the ability to reload logstash pipelines when the pipeline changes, without triggering a full restart of the pod, leveraging Logstash's ability to watch pipeline definitions, and reload automatically if a change is discovered.
A logstash config directory typically includes logstash.yml, pipelines.yml, jvm.options and log4j2.properties, which are required to run logstash. While logstash can store the contents of a pipeline definition in any location, the pipelines.yml definition file, which states where these definition files are, must be in the same config directory as the other setup files.
To enable us to have a mixture of copied and generated files in this config directory, an initContainer is used, with a small script to prepare the config directory. The script brings together the contents of /usr/share/logstash/config, placed into a shared config volume, and the pipelines.yml and logstash.yml secrets created by the logstash operator.
Additionally, we now do not include changes to pipelines in the configuration hash that triggers a pod reload, but instead write the config.reload.automatic: true setting into logstash.yml.
Note that triggering a reload is not immediate - there may be a delay measured in minutes between the pipeline definition being changed, and that being reflected in a pipeline reload.
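The idea of keeping pipeline definitions out of the restart-triggering hash can be illustrated with a toy example. This is a sketch under stated assumptions, not the operator's implementation: the `renderedConfig` type, the `restartHash` helper, and the use of SHA-256 are all hypothetical choices made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// renderedConfig is a hypothetical view of the generated config files.
type renderedConfig struct {
	LogstashYML  string
	PipelinesYML string
}

// restartHash hashes only the inputs that should restart the pod.
// PipelinesYML is deliberately left out, so a pipeline-only change keeps
// the hash stable and leaves the reload to Logstash itself.
func restartHash(c renderedConfig) string {
	sum := sha256.Sum256([]byte(c.LogstashYML))
	return hex.EncodeToString(sum[:])
}

func main() {
	base := renderedConfig{
		LogstashYML:  "config.reload.automatic: true\n",
		PipelinesYML: "- pipeline.id: main\n",
	}
	pipelineEdit := base
	pipelineEdit.PipelinesYML = "- pipeline.id: main\n- pipeline.id: extra\n"

	settingsEdit := base
	settingsEdit.LogstashYML = "config.reload.automatic: true\nlog.level: debug\n"

	fmt.Println("pipeline edit changes hash:", restartHash(base) != restartHash(pipelineEdit))
	fmt.Println("settings edit changes hash:", restartHash(base) != restartHash(settingsEdit))
}
```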