Filebeat/autodiscover ES_data_should_pass_validations fails in 8.9.0-SNAPSHOT #6946

Closed
thbkrkr opened this issue Jun 22, 2023 · 6 comments
Labels
>bug Something isn't working >docs Documentation >test Related to unit/integration/e2e tests v2.9.0

Comments

@thbkrkr
Contributor

thbkrkr commented Jun 22, 2023

cloud-on-k8s-operator-nightly/builds/217@stack-8.9.0-SNAPSHOT

  • TestFilebeatDefaultConfig/ES_data_should_pass_validations
  • TestBeatSecureSettings/ES_data_should_pass_validations
  • TestBeatConfigRef/ES_data_should_pass_validations
--- FAIL: TestFilebeatDefaultConfig (1887.47s)
    --- FAIL: TestFilebeatDefaultConfig/ES_data_should_pass_validations (1800.00s)
                Error Trace:    /go/src/github.com/elastic/cloud-on-k8s/test/e2e/test/utils.go:94
                Error:          Received unexpected error:
                                hit count should be more than 0 for /*beat*/_search?q=agent.type:filebeat

--- FAIL: TestBeatSecureSettings (1888.35s)
    --- FAIL: TestBeatSecureSettings/ES_data_should_pass_validations (1800.00s)
                Error Trace:    /go/src/github.com/elastic/cloud-on-k8s/test/e2e/test/utils.go:94
                Error:          Received unexpected error:
                                hit count should be more than 0 for /*beat*/_search?q=agent.type:filebeat

--- FAIL: TestBeatConfigRef (1896.88s)
    --- FAIL: TestBeatConfigRef/ES_data_should_pass_validations (1800.00s)
                Error Trace:    /go/src/github.com/elastic/cloud-on-k8s/test/e2e/test/utils.go:94
                Error:          Received unexpected error:
                                hit count should be more than 0 for /*beat*/_search?q=agent.type:filebeat
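All three failures come from the same assertion: a search against `/*beat*/_search?q=agent.type:filebeat` returned no documents within the 1800 s timeout. The sketch below illustrates that kind of check. It is not the helper from `test/e2e/test/utils.go`; the function names `hit_count` and `check_has_hits` are hypothetical, and the response body is assumed to be standard Elasticsearch `_search` JSON.

```python
import json


def hit_count(search_response: str) -> int:
    """Return the total hit count from an Elasticsearch _search response body."""
    total = json.loads(search_response)["hits"]["total"]
    # Elasticsearch 7+ reports {"value": N, "relation": "eq"};
    # older versions report a bare integer.
    if isinstance(total, dict):
        return int(total["value"])
    return int(total)


def check_has_hits(path: str, search_response: str) -> None:
    """Raise AssertionError if the query at `path` matched no documents."""
    if hit_count(search_response) <= 0:
        raise AssertionError(f"hit count should be more than 0 for {path}")
```

A check like this only shows that nothing was indexed. It does not say why, so the Filebeat pod logs are the next place to look.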
@thbkrkr thbkrkr added >bug Something isn't working >test Related to unit/integration/e2e tests labels Jun 22, 2023
@thbkrkr
Contributor Author

thbkrkr commented Jun 27, 2023

Running TestFilebeatDefaultConfig locally, I see that the RBAC permissions needed to use Filebeat with autodiscover are missing. I'm not sure whether this is the root cause:

E0627 12:15:14.396909       7 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.4/tools/cache/reflector.go:167: 
  Failed to watch *v1.ReplicaSet: failed to list *v1.ReplicaSet: replicasets.apps is forbidden: 
  User "system:serviceaccount:e2e-mercury:test-fb-default-cfg-d26k-sa" cannot list resource "replicasets" in
  API group "apps" at the cluster scope

After fixing it, I got:

E0627 12:30:27.699959       7 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.4/tools/cache/reflector.go:167: 
  Failed to watch *v1.Job: failed to list *v1.Job: jobs.batch is forbidden:
  User "system:serviceaccount:e2e-mercury:test-fb-default-cfg-n9tn-sa" cannot list resource "jobs" in 
  API group "batch" at the cluster scope

After fixing it, it looks good.
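Each "forbidden" message names the service account, the denied verb, the resource, and the API group, which is enough to work out what the role is missing. As a rough sketch, the helper below extracts those fields from a log entry. The names `Denied` and `parse_forbidden` are hypothetical, and the regular expression assumes the message wording shown in the logs above.

```python
import re
from typing import NamedTuple, Optional

# Matches the Kubernetes API server "forbidden" wording seen in the logs above.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


class Denied(NamedTuple):
    user: str
    verb: str
    resource: str
    api_group: str


def parse_forbidden(log_entry: str) -> Optional[Denied]:
    """Extract the denied access from a (possibly line-wrapped) log entry."""
    # Collapse line wraps and repeated spaces so the pattern sees one line.
    flat = " ".join(log_entry.split())
    match = FORBIDDEN.search(flat)
    if match is None:
        return None
    return Denied(match["user"], match["verb"], match["resource"], match["group"])
```

The message reports only the verb that was denied first (`list`). The reflector also watches the resource, so the role most likely needs both `list` and `watch`.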

@thbkrkr
Contributor Author

thbkrkr commented Jun 30, 2023

I asked the Beats team to confirm whether this change is expected.

@thbkrkr thbkrkr changed the title ES_data_should_pass_validations fails in 8.9.0-SNAPSHOT Filebeat/autodiscover ES_data_should_pass_validations fails in 8.9.0-SNAPSHOT Jun 30, 2023
@thbkrkr
Contributor Author

thbkrkr commented Jul 4, 2023

I got confirmation that the change is expected. These permissions have been required for a very long time.

However, a recent change propagates the errors when the permissions are not set, which is why we only see them now.

Going back in time and following the movement of the source code, I think these permissions have been required since:

We need to update the manifests in config/recipes/beats that use Filebeat with autodiscover.

@thbkrkr thbkrkr added >docs Documentation v2.9.0 labels Jul 4, 2023
@mmentges

mmentges commented Aug 1, 2023

Ran into this issue after upgrading our ECK test environment from 8.6.2 to 8.9.0 on an Azure Red Hat OpenShift 4.11 cluster:

failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:serviceaccount:elastic:filebeat" cannot list resource "replicasets" in API group "apps" at the cluster scope
failed to list *v1.Job: jobs.batch is forbidden: User "system:serviceaccount:elastic:filebeat" cannot list resource "jobs" in API group "batch" at the cluster scope

I modified the cluster role as follows; without these changes, Filebeat stopped working completely:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
# New rules below
- apiGroups: ["apps"]
  resources: 
  - replicasets
  verbs: 
  - list
  - watch
- apiGroups: ["batch"]
  resources: 
  - jobs
  verbs: 
  - list
  - watch
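To check whether an existing ClusterRole already grants what autodiscover needs, the rules can be compared against the permissions named in the errors. The sketch below works on rules loaded as plain Python dicts, for example from parsed YAML. `REQUIRED` and `missing_permissions` are hypothetical names, and the matching is deliberately simplified: it honours the `*` wildcard but ignores `resourceNames`, subresources, and non-resource URLs.

```python
from typing import Dict, Iterable, List, Tuple

Permission = Tuple[str, str, str]  # (apiGroup, resource, verb)

# Permissions reported as missing in this issue (assumed list, taken from the errors above).
REQUIRED: List[Permission] = [
    ("apps", "replicasets", "list"),
    ("apps", "replicasets", "watch"),
    ("batch", "jobs", "list"),
    ("batch", "jobs", "watch"),
]


def _matches(values: Iterable[str], wanted: str) -> bool:
    """True if `wanted` is listed explicitly or covered by the "*" wildcard."""
    values = list(values)
    return "*" in values or wanted in values


def missing_permissions(
    rules: List[Dict[str, List[str]]],
    required: List[Permission] = REQUIRED,
) -> List[Permission]:
    """Return the required permissions that no rule in `rules` grants."""
    missing = []
    for group, resource, verb in required:
        granted = any(
            _matches(rule.get("apiGroups", []), group)
            and _matches(rule.get("resources", []), resource)
            and _matches(rule.get("verbs", []), verb)
            for rule in rules
        )
        if not granted:
            missing.append((group, resource, verb))
    return missing
```

With only the original core-group rule, all four entries in `REQUIRED` come back as missing. With the two new rules added, the result is empty.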

pebrc added a commit that referenced this issue Aug 5, 2023
Fixes #7079

This was first reported in #6946 (comment)
pebrc added a commit to pebrc/cloud-on-k8s that referenced this issue Aug 5, 2023
Fixes elastic#7079

This was first reported in elastic#6946 (comment)

(cherry picked from commit d022e10)
pebrc added a commit that referenced this issue Aug 6, 2023
Fixes #7079

This was first reported in #6946 (comment)

(cherry picked from commit d022e10)
@thbkrkr
Contributor Author

thbkrkr commented Sep 8, 2023

@barkbay
Contributor

barkbay commented Sep 20, 2023

I'm closing this one as I think this particular issue is fixed (feel free to reopen if I'm wrong)
New failures are tracked in #7172

@barkbay barkbay closed this as completed Sep 20, 2023