add_process_metadata: container id: Do not cache whole cgroup file for each process #17351
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
Hello @exekias, @BrookHF, @danmx! The first issue is that the current solution in master caches a huge amount of useless data: the whole cgroup file content for every process. It also processes that cached cgroup file content again and again to look up the cid, even when the data is already cached. Here I changed the cache to store just the container ID. The second issue I discovered is that some Kubernetes cgroup paths have a suffix after the cid. The one I found was kube-proxy. For now I hardcoded it there and it works. My question is: do you happen to know any other corner cases like that? Maybe I should also make the suffix configurable... With best regards!
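To illustrate the caching change described above, here is a minimal sketch in Go (hypothetical names such as `cidProvider` and `GetCid`; this is not the actual Beats code): the cache stores only the resolved container ID per PID, so a cache hit never touches cgroup file content again.

```go
// Minimal sketch of the idea: cache only the resolved container ID per
// PID, not the whole cgroup file content. Names here are hypothetical.
package provider

import "sync"

type cidProvider struct {
	mu    sync.Mutex
	cache map[int]string // pid -> container ID ("" if none was found)
}

// GetCid returns the cached container ID for pid, resolving it once via
// resolve (which would read /proc/<pid>/cgroup) on the first lookup.
func (p *cidProvider) GetCid(pid int, resolve func(int) (string, error)) (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if cid, ok := p.cache[pid]; ok {
		return cid, nil // cache hit: no cgroup parsing at all
	}
	cid, err := resolve(pid)
	if err != nil {
		return "", err
	}
	if p.cache == nil {
		p.cache = make(map[int]string)
	}
	p.cache[pid] = cid
	return cid, nil
}
```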
Pinging @elastic/integrations-platforms (Team:Platforms)
Thank you for opening this!
Do you know what the general rule is, or why that one behaves differently? I'm guessing it's because it's a static pod, managed in a different way. We should find the rule for these and include a generic solution.
I completely agree with a generic solution, and that's what I'm planning to do. Finding out the rule needs further investigation, which is why I opened the discussion here too; maybe some of you know about it.
This is looking good to me, thank you for contributing! Let's split the concerns into different PRs so this can move forward.
This won't need a changelog entry, as it will be released with the previous PR.
libbeat/processors/add_process_metadata/gosigar_cid_provider.go
jenkins test this please
Co-Authored-By: Carlos Pérez-Aradros Herce <exekias@gmail.com>
ok to test
@exekias since I don't know the exact rule for the cgroup format (and it might not be written anywhere in black and white, so we could search for it forever :)), I see the following options:
What do you think?
I'm ok with adding custom patterns to the
OK, I'm happy to add a regex pattern as a configurable alternative in a separate PR. My initial wording using "skips" was not the best: since it returns the base path as the cid, it returns 'kube-proxy'.
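As a sketch of the generic direction being discussed (hypothetical code, not the final Beats implementation; `ExtractCid` and `hexCid` are made-up names): instead of returning the last path element verbatim, scan the cgroup path from the end for an element that actually looks like a container ID, which skips a trailing kube-proxy component. The fixed `hexCid` pattern is the part that could become a user-configurable regex.

```go
// Sketch: pick the container ID out of a cgroup path by matching path
// elements against an ID pattern, instead of trusting the base path.
package provider

import (
	"regexp"
	"strings"
)

// hexCid matches a 64-character hex container ID; in a configurable
// variant this pattern would come from the processor's settings.
var hexCid = regexp.MustCompile(`^[0-9a-f]{64}$`)

// ExtractCid walks the path from the end, so a trailing element such
// as "kube-proxy" is skipped rather than returned as the cid.
func ExtractCid(cgroupPath string) string {
	parts := strings.Split(cgroupPath, "/")
	for i := len(parts) - 1; i >= 0; i-- {
		if hexCid.MatchString(parts[i]) {
			return parts[i]
		}
	}
	return ""
}
```

For the kube-proxy cgroup path shown later in this thread, this would return the 64-character hex ID rather than 'kube-proxy'.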
Yes, I really expect we can investigate this a little more and come up with a generic solution 😄 @ChrsMark do you know how the kube-proxy cgroup name is generated, by any chance? We can research this a little bit.
@exekias is there anything else I need to do to complete this PR? Or is everything automatic from now on?
…r each process (elastic#17351) * Do not cache whole cgroup file for each process. Include kube-proxy cid too. Co-authored-by: Jako Tinkus <jatinkus@microsoft.com> (cherry picked from commit 8a47788)
@exekias it is something like the following. More details:

Getting kube-proxy uid:

```
~/$ kubectl -n kube-system get pod -l k8s-app=kube-proxy -o jsonpath="{.items[0].metadata.uid}"
42ae03d6-e2b8-4eb5-8a1b-47c3c5b257de
```

Searching for kube-proxy containers:

```
$ docker ps | grep kube-proxy
4f6454506d5d   7d54289267dc           "/usr/local/bin/kube…"   About an hour ago   Up About an hour   k8s_kube-proxy_kube-proxy-968n2_kube-system_42ae03d6-e2b8-4eb5-8a1b-47c3c5b257de_19
e2d8064b3bc8   k8s.gcr.io/pause:3.1   "/pause"                 About an hour ago   Up About an hour   k8s_POD_kube-proxy-968n2_kube-system_42ae03d6-e2b8-4eb5-8a1b-47c3c5b257de_19
```

We see above that there are two containers: one is the main container for kube-proxy, and the extra one is the pause container.

Check cgroup resources:

```
#### main container
$ cat /sys/fs/cgroup/memory/kubepods/besteffort/pod42ae03d6-e2b8-4eb5-8a1b-47c3c5b257de/4f6454506d5dc49d1a374c6e7fb39d4eea00efbe8c3a17cddb74f8341d5c667c/cgroup.procs
5660

#### pause container
$ cat /sys/fs/cgroup/memory/kubepods/besteffort/pod42ae03d6-e2b8-4eb5-8a1b-47c3c5b257de/e2d8064b3bc8f2d5ffa537dd6467303b39c0815765cc147fefa9292d4dfaf9ec/cgroup.procs
5403
```

Note that k8s will group pods' cgroups according to the QoS policies.
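To connect this to the QoS note above: the pod's QoS class changes the cgroup parent directory, so the number of path elements before the container ID varies. Below is a hedged sketch of the standard kubelet layout with the cgroupfs driver; `podCgroupDir` is a hypothetical helper, not code from this PR.

```go
// Hypothetical helper showing how the pod QoS class changes the cgroup
// parent directory under the standard kubelet cgroupfs layout.
package provider

import "path"

func podCgroupDir(qosClass, podUID string) string {
	base := "/sys/fs/cgroup/memory/kubepods"
	switch qosClass {
	case "Guaranteed":
		// Guaranteed pods sit directly under kubepods.
		return path.Join(base, "pod"+podUID)
	case "Burstable":
		return path.Join(base, "burstable", "pod"+podUID)
	default:
		// BestEffort, as in the kube-proxy example above.
		return path.Join(base, "besteffort", "pod"+podUID)
	}
}
```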
…r each process (elastic#17351) (elastic#17361) * Do not cache whole cgroup file for each process. Include kube-proxy cid too. Co-authored-by: Jako Tinkus <jatinkus@microsoft.com> (cherry picked from commit 76f4353) Co-authored-by: jtinkus <35308202+jtinkus@users.noreply.github.com>
What does this PR do?
The current solution caches the whole content of the cgroup file for each process. Let's cache only the container IDs that were found.
Also, the current solution skips kube-proxy container processes, because their cgroup path has the suffix /kube-proxy.
Why is it important?
Caching such a pile of unused data has a performance impact.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc
Author's Checklist
How to test this PR locally
Related issues
Relates #15947
Use cases
Screenshots
Logs