-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Agent and Elastic stack in different namespaces. #7382
Allow Agent and Elastic stack in different namespaces. #7382
Conversation
Unit tests e2e tests Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Update and add comments to appropriate funcs/sections of code. Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hash to rotate the agents on changes to the ES CA is missing, which means that the Agents won't pick up the new CA until the ECK operator restarts. We can remedy this by adding the hash to the association configuration annotation like this
diff --git a/pkg/apis/common/v1/association.go b/pkg/apis/common/v1/association.go
index 8bcfaa2f2..5ecefd67b 100644
--- a/pkg/apis/common/v1/association.go
+++ b/pkg/apis/common/v1/association.go
@@ -199,12 +199,13 @@ func FormatNameWithID(template string, id string) string {
// AssociationConf holds the association configuration of a referenced resource in an association.
type AssociationConf struct {
- AuthSecretName string `json:"authSecretName"`
- AuthSecretKey string `json:"authSecretKey"`
- IsServiceAccount bool `json:"isServiceAccount"`
- CACertProvided bool `json:"caCertProvided"`
- CASecretName string `json:"caSecretName"`
- URL string `json:"url"`
+ AuthSecretName string `json:"authSecretName"`
+ AuthSecretKey string `json:"authSecretKey"`
+ IsServiceAccount bool `json:"isServiceAccount"`
+ CACertProvided bool `json:"caCertProvided"`
+ CASecretName string `json:"caSecretName"`
+ AdditionalSecretsHash string `json:"additionalSecretsHash"`
+ URL string `json:"url"`
// Version of the referenced resource. If a version upgrade is in progress,
// matches the lowest running version. May be empty if unknown.
Version string `json:"version"`
diff --git a/pkg/controller/association/controller/agent_fleetserver.go b/pkg/controller/association/controller/agent_fleetserver.go
index 91474ecad..ba51acf49 100644
--- a/pkg/controller/association/controller/agent_fleetserver.go
+++ b/pkg/controller/association/controller/agent_fleetserver.go
@@ -17,6 +17,7 @@ import (
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/association"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/operator"
"github.com/elastic/cloud-on-k8s/v2/pkg/utils/k8s"
+ ulog "github.com/elastic/cloud-on-k8s/v2/pkg/utils/log"
"github.com/elastic/cloud-on-k8s/v2/pkg/utils/rbac"
)
@@ -46,11 +47,12 @@ func AddAgentFleetServer(mgr manager.Manager, accessReviewer rbac.AccessReviewer
})
}
-func addtionalSecrets(c k8s.Client, assoc commonv1.Association) ([]types.NamespacedName, error) {
+func addtionalSecrets(ctx context.Context, c k8s.Client, assoc commonv1.Association) ([]types.NamespacedName, error) {
+ log := ulog.FromContext(ctx)
associated := assoc.Associated()
var agent agentv1alpha1.Agent
nsn := types.NamespacedName{Namespace: associated.GetNamespace(), Name: associated.GetName()}
- if err := c.Get(context.Background(), nsn, &agent); err != nil {
+ if err := c.Get(ctx, nsn, &agent); err != nil {
return nil, err
}
fleetServerRef := assoc.AssociationRef()
@@ -58,7 +60,7 @@ func addtionalSecrets(c k8s.Client, assoc commonv1.Association) ([]types.Namespa
return nil, nil
}
fleetServer := agentv1alpha1.Agent{}
- if err := c.Get(context.Background(), fleetServerRef.NamespacedName(), &fleetServer); err != nil {
+ if err := c.Get(ctx, fleetServerRef.NamespacedName(), &fleetServer); err != nil {
return nil, err
}
@@ -74,14 +76,17 @@ func addtionalSecrets(c k8s.Client, assoc commonv1.Association) ([]types.Namespa
// if both agent and ES are same namespace no copying needed
if agent.GetNamespace() == esAssociation.GetNamespace() {
+ log.V(1).Info("no additional secrets because same namespace")
return nil, nil
}
conf, err := esAssociation.AssociationConf()
if err != nil {
+ log.V(1).Info("no additional secrets because no assoc conf")
return nil, err
}
if conf == nil || !conf.CACertProvided {
+ log.V(1).Info("no additional secrets because conf nil or no CA provided")
return nil, nil
}
return []types.NamespacedName{{
diff --git a/pkg/controller/association/dynamic_watches.go b/pkg/controller/association/dynamic_watches.go
index 2245543dd..a7dabb3eb 100644
--- a/pkg/controller/association/dynamic_watches.go
+++ b/pkg/controller/association/dynamic_watches.go
@@ -5,6 +5,7 @@
package association
import (
+ "context"
"fmt"
"k8s.io/apimachinery/pkg/types"
@@ -49,7 +50,7 @@ func additionalSecretWatchName(associated types.NamespacedName) string {
// * if there's an ES user to create, watch the user Secret in ES namespace
// All watches for all given associations are set under the same watch name and replaced with each reconciliation.
// The given associations are expected to be of the same type (e.g. Kibana -> Elasticsearch, not Kibana -> Enterprise Search).
-func (r *Reconciler) reconcileWatches(associated types.NamespacedName, associations []commonv1.Association) error {
+func (r *Reconciler) reconcileWatches(ctx context.Context, associated types.NamespacedName, associations []commonv1.Association) error {
managedElasticRef := filterManagedElasticRef(associations)
unmanagedElasticRef := filterUnmanagedElasticRef(associations)
@@ -103,7 +104,7 @@ func (r *Reconciler) reconcileWatches(associated types.NamespacedName, associati
if err := ReconcileGenericWatch(associated, managedElasticRef, r.watches.Secrets, additionalSecretWatchName(associated), func() ([]types.NamespacedName, error) {
var toWatch []types.NamespacedName
for _, association := range associations {
- secs, err := r.AdditionalSecrets(r.Client, association)
+ secs, err := r.AdditionalSecrets(ctx, r.Client, association)
if err != nil {
return nil, err
}
diff --git a/pkg/controller/association/reconciler.go b/pkg/controller/association/reconciler.go
index c72298b49..a552873d2 100644
--- a/pkg/controller/association/reconciler.go
+++ b/pkg/controller/association/reconciler.go
@@ -7,6 +7,7 @@ package association
import (
"context"
"fmt"
+ "hash/fnv"
"reflect"
"time"
@@ -64,7 +65,7 @@ type AssociationInfo struct { //nolint:revive
// AdditionalSecrets are additional secrets to copy from an association's namespace to the associated resource namespace.
// Currently this is only used for copying the CA from an Elasticsearch association to the same namespace as
// an Agent referencing a Fleet Server.
- AdditionalSecrets func(c k8s.Client, assoc commonv1.Association) ([]types.NamespacedName, error)
+ AdditionalSecrets func(context.Context, k8s.Client, commonv1.Association) ([]types.NamespacedName, error)
// Labels are labels set on all resources created for association purpose. Note that some resources will be also
// labelled with AssociationResourceNameLabelName and AssociationResourceNamespaceLabelName in addition to any
// labels provided here.
@@ -191,7 +192,7 @@ func (r *Reconciler) Reconcile(ctx context.Context, request reconcile.Request) (
}
// reconcile watches for all associations of this type
- if err := r.reconcileWatches(associatedKey, associations); err != nil {
+ if err := r.reconcileWatches(ctx, associatedKey, associations); err != nil {
return reconcile.Result{}, tracing.CaptureError(ctx, err)
}
@@ -263,13 +264,14 @@ func (r *Reconciler) reconcileAssociation(ctx context.Context, association commo
return commonv1.AssociationPending, err // maybe not created yet
}
+ secretsHash := fnv.New32a()
if r.AdditionalSecrets != nil {
- additionalSecrets, err := r.AdditionalSecrets(r.Client, association)
+ additionalSecrets, err := r.AdditionalSecrets(ctx, r.Client, association)
if err != nil {
return commonv1.AssociationPending, err // maybe not created yet
}
for _, sec := range additionalSecrets {
- if err := copySecret(ctx, r.Client, r.AssociationInfo, k8s.ExtractNamespacedName(association.Associated()), association.GetNamespace(), sec); err != nil {
+ if err := copySecret(ctx, r.Client, secretsHash, k8s.ExtractNamespacedName(association.Associated()), association.GetNamespace(), sec); err != nil {
return commonv1.AssociationPending, err
}
}
@@ -294,10 +296,11 @@ func (r *Reconciler) reconcileAssociation(ctx context.Context, association commo
// construct the expected association configuration
expectedAssocConf := &commonv1.AssociationConf{
- CACertProvided: caSecret.CACertProvided,
- CASecretName: caSecret.Name,
- URL: url,
- Version: ver,
+ CACertProvided: caSecret.CACertProvided,
+ CASecretName: caSecret.Name,
+ AdditionalSecretsHash: fmt.Sprint(secretsHash.Sum32()),
+ URL: url,
+ Version: ver,
}
if r.ElasticsearchUserCreation == nil {
diff --git a/pkg/controller/association/reconciler_test.go b/pkg/controller/association/reconciler_test.go
index 678f47977..adcaab23a 100644
--- a/pkg/controller/association/reconciler_test.go
+++ b/pkg/controller/association/reconciler_test.go
@@ -341,7 +341,7 @@ func TestReconciler_Reconcile_NoESRef_Cleanup(t *testing.T) {
require.NotEmpty(t, kb.Annotations[kb.EsAssociation().AssociationConfAnnotationName()])
r := testReconciler(&kb, &kibanaUserInESNamespace, &kibanaUserInKibanaNamespace, &esCertsInKibanaNamespace)
// simulate watches being set
- require.NoError(t, r.reconcileWatches(k8s.ExtractNamespacedName(&kb), []commonv1.Association{kb.EsAssociation()}))
+ require.NoError(t, r.reconcileWatches(context.Background(), k8s.ExtractNamespacedName(&kb), []commonv1.Association{kb.EsAssociation()}))
require.NotEmpty(t, r.watches.Secrets.Registrations())
require.NotEmpty(t, r.watches.ReferencedResources.Registrations())
diff --git a/pkg/controller/association/secret.go b/pkg/controller/association/secret.go
index 455a6c7e9..92fea27ce 100644
--- a/pkg/controller/association/secret.go
+++ b/pkg/controller/association/secret.go
@@ -11,6 +11,7 @@ import (
"crypto/x509"
"encoding/json"
"fmt"
+ "hash"
"net/http"
corev1 "k8s.io/api/core/v1"
@@ -19,11 +20,11 @@ import (
commonv1 "github.com/elastic/cloud-on-k8s/v2/pkg/apis/common/v1"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/certificates"
+ commonhash "github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/hash"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/reconciler"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/version"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/elasticsearch/client"
"github.com/elastic/cloud-on-k8s/v2/pkg/utils/k8s"
- "github.com/elastic/cloud-on-k8s/v2/pkg/utils/maps"
)
const (
@@ -194,16 +195,15 @@ func filterManagedElasticRef(associations []commonv1.Association) []commonv1.Ass
}
// copySecret will copy the source secret to the target namespace adding labels from the associated object to ensure garbage collection happens.
-func copySecret(ctx context.Context, client k8s.Client, info AssociationInfo, associated types.NamespacedName, targetNamespace string, source types.NamespacedName) error {
+func copySecret(ctx context.Context, client k8s.Client, secHash hash.Hash, associated types.NamespacedName, targetNamespace string, source types.NamespacedName) error {
var original corev1.Secret
if err := client.Get(ctx, source, &original); err != nil {
return err
}
var expected corev1.Secret
expected.Data = original.Data
- // We merge the original labels with the associated object's association labels to ensure
- // that this secret is garbage collected when the associated object is deleted.
- expected.Labels = maps.Merge(original.Labels, info.Labels(associated))
+ commonhash.WriteHashObject(secHash, original.Data)
+ expected.Labels = original.Labels
expected.Annotations = original.Annotations
expected.Name = original.Name
expected.Namespace = targetNamespace
Add additional unit test. Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
expected.Name = original.Name | ||
expected.Namespace = targetNamespace | ||
|
||
_, err := reconciler.ReconcileSecret(ctx, client, expected, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if we could use a soft owner to garbage collect this Secret.
_, err := reconciler.ReconcileSecret(ctx, client, expected, nil) | |
_, err := reconciler.ReconcileSecretNoOwnerRef(ctx, client, expected, softOwner) |
softOwner
being referencedObj
IIUC, and maybe use GarbageCollectSoftOwnedSecrets
to remove them (not tested, only an idea).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think garbage collection is already tied to the association, in this case the Fleet Server->ES association. This has the odd side effect that the secret stays if the Agent->FleetServer association is severed but the Fleet Server->ES association still exists. I was thinking that this would be sufficient though because typically you would deploy Fleet server and Agent in tandem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the odd side effect that the secret stays if the Agent->FleetServer association is severed but the Fleet Server->ES association still exists.
Yes, this is what I was trying to "fix" with this suggestion. Not a strong opinion though, we can stick with the current behavior 👍
return commonv1.AssociationPending, err // maybe not created yet | ||
} | ||
for _, sec := range additionalSecrets { | ||
if err := copySecret(ctx, r.Client, secretsHash, association.GetNamespace(), sec); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we supposed to watch this Secret? It's not reconciled if I delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it doesn't return, but the agent-es
association controller is triggered on delete. I wonder if this:
https://github.com/naemono/cloud-on-k8s/blob/5dd7bb85550fd0d0b23af1762803025e725bf2d3/pkg/controller/association/dynamic_watches.go#L103-L117
is associating the secret incorrectly? I'll investigate and update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were only watching the source
secrets (the secrets we were copying into the target ns) and not the target/created
secrets. It seems we would need to watch both:
- if source secret changes, update target secret.
- if target secret is changed/deleted, reconcile the secret.
I have updated and verified that both the target and source are re-created upon deletion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed we have a similar issue with the fleet server es user/service token secret. Probably worth addressing in a separate PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realised we have captured my observation in an issue already #7170
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment, #7353 (review), still valid?
I think we also need to adjust the documentation, ECK 2.10.0 does not allow Agent and ES to run in different namespaces.
Update pipeline for memory exhaustion during lint. Review comments. Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
…retsHash. Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
buildkite test this -f p=gke,s=8.11.0 |
#7382 (comment) could still be addressed but otherwise LGTM |
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Updated to only include the additional secrets Hash if there are additional secrets. |
Fixing unit tests... |
…nal secrets exist. Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Unit tests fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM just the docs needs an update I think to call out Elasticsearch specifically
docs/orchestrating-elastic-stack-applications/agent-fleet.asciidoc
Outdated
Show resolved
Hide resolved
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
* Allow Agent, Fleet and Elastic stack in different namespaces. * Documentation updates for feature. --------- Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
resolves #7352
What does this change?
Both root and non-root, fleet and non-fleet Agent installations were being blocked from Agent running in a different namespace because of a legacy association check. This legacy check checked that the agent was running in the same namespace as Elasticsearch. The actual issue turned out to be when Agent is not running in the same namespace as Fleet server when fleet mode is enabled. This change enables Agent to run in any namespace regardless of where Fleet or Elasticsearch is deployed.
This supersedes #7353 as this turned out to be a much simpler changeset vs:
Changes:
Testing Done/Verification