Filebeat 6.2.1 massive DOS on Kubernetes Apiserver #6536

Closed
towolf opened this issue Mar 12, 2018 · 11 comments

Comments

@towolf

towolf commented Mar 12, 2018

Today, for no particular visible reason, one filebeat instance started a request DOS on our 3 apiservers.

image

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1beta1",
  "metadata": {
    "creationTimestamp": "2018-03-12T13:44:46Z"
  },
  "level": "Request",
  "timestamp": "2018-03-12T13:44:46Z",
  "auditID": "6fe10401-ceb8-401a-90ed-703c667b7baf",
  "stage": "ResponseStarted",
  "requestURI": "/api/v1/pods?fieldSelector=spec.nodeName%3Ds858&resourceVersion=35649380&watch=true",
  "verb": "watch",
  "user": {
    "username": "system:serviceaccount:kube-system:filebeat",
    "uid": "788801b4-f61b-11e7-8071-00259008c682",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:kube-system",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "10.216.0.16"
  ],
  "objectRef": {
    "resource": "pods",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2018-03-12T13:44:46.085940Z",
  "stageTimestamp": "2018-03-12T13:44:46.086197Z"
}

The requestURI was the same throughout this episode:
"/api/v1/pods?fieldSelector=spec.nodeName%3Ds858&resourceVersion=35649380&watch=true"

Millions of times. The same request from the same filebeat instance.
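
A pattern like this can be confirmed by tallying audit events per caller and request. The following is a minimal sketch, assuming the apiserver audit log is written as one JSON event per line with the same fields as the event above; the function name `count_requests` is hypothetical and not part of any Kubernetes or Beats tooling.

```python
import json
from collections import Counter
from typing import Iterable


def count_requests(lines: Iterable[str]) -> Counter:
    """Count audit events per (username, verb, requestURI).

    Assumes one JSON audit event per line; lines that are blank or
    do not parse as a JSON object are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        user = event.get("user") or {}
        key = (
            user.get("username", "<unknown>"),
            event.get("verb", "<unknown>"),
            event.get("requestURI", "<unknown>"),
        )
        counts[key] += 1
    return counts
```

Passing an open log file to `count_requests` and printing `most_common(5)` on the result would put a single runaway client, such as the filebeat service account here, at the top of the list.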

For confirmed bugs, please report:

  • Version: 6.2.1
  • Operating System: Ubuntu 16.04 amd64 / Kubernetes v1.9.1
  • Steps to Reproduce: not sure
@towolf
Author

towolf commented Mar 12, 2018

At some point this took out one of the masters in the cluster. So, not pretty at all.

@exekias
Contributor

exekias commented Mar 12, 2018

Hi Tobias, I'm afraid you hit #6503. That issue has been fixed, and the next release (6.2.3) will include the fix.

@exekias
Contributor

exekias commented Mar 13, 2018

I forgot to mention that we have introduced a backoff mechanism in our Kubernetes watcher code, which should prevent this from happening again with any future error.
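
For readers unfamiliar with the idea, a backoff mechanism waits longer after each consecutive failure instead of retrying immediately. The following is a generic sketch of exponential backoff, not the Beats implementation (Beats is written in Go, and its watcher code is not shown in this thread); the function name `retry_with_backoff` and all parameter defaults are assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    connect: Callable[[], T],
    max_attempts: int = 8,
    initial: float = 1.0,
    maximum: float = 60.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, sleeping longer after each failure.

    The delay starts at `initial` seconds, is multiplied by `factor`
    after every failed attempt, and is capped at `maximum`. The last
    failure is re-raised once `max_attempts` is reached.
    """
    delay = initial
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(delay)  # wait before retrying instead of hammering the server
            delay = min(delay * factor, maximum)
    raise ValueError("max_attempts must be at least 1")
```

With these defaults a client that keeps failing makes at most a handful of requests per minute rather than an unbounded stream. Real implementations commonly add random jitter to the delay so that many clients do not retry in lockstep.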

For convenience, I've pushed a snapshot build with the fix: exekias/filebeat:6.2.3-snapshot

I'm closing this, as the issue is fixed.

@exekias exekias closed this as completed Mar 13, 2018
@yawboateng

@exekias Thanks for the snapshot build. When should we expect the 6.2.3 release? We also have the same issue with masters going offline.

@exekias
Contributor

exekias commented Mar 13, 2018

We don't publish release dates, but we release often. You can expect 6.2.3 in the following week(s).

@towolf
Author

towolf commented Mar 13, 2018

Thanks for the info @exekias

I switched to your snapshot, because it cannot get worse.

Also wondering what the ETA of the release is ...

@exekias
Contributor

exekias commented Mar 13, 2018

I'm sorry for the inconvenience, and thank you for your reports.

@yawboateng

Can confirm the issue is resolved in the 6.2.3 snapshot build from @exekias. It has been running for 3 days now with no issues 👍

@towolf
Author

towolf commented Mar 16, 2018

Same, 0 Restarts on all our nodes, and no more infinite request spam.

Great work.

adriansr added a commit to adriansr/beats that referenced this issue Mar 17, 2018
Fixes packetbeat termination problems with both af_packet and pcap
captures.

Fixes elastic#6536
@exekias
Contributor

exekias commented Mar 20, 2018

An update on this: new images for 6.2.3 are now available at https://www.docker.elastic.co/

@towolf
Author

towolf commented Mar 20, 2018

Thanks for the notice. I've bumped the daemonset to the public release.
