eBPF event loop blockage finder #569
Comments
Here is a PoC to show what I mean: https://git.sr.ht/~kvakil/fast-blocked-at/tree
For
Yes, that's a good improvement indeed. In our last WG meeting, I said to wait a while regarding eBPF support, because I'm definitely interested in helping with it. Unfortunately, I haven't had time to investigate recently, but as soon as possible I'll try to collect what Node.js would need to support it. Feel free to investigate too; any help is welcome! I'd say that as long as it doesn't hurt Node.js performance, it will definitely be accepted by the team.
I have previously written an eBPF demo that measures the cost of every libuv phase. I think eBPF is very useful for solving this kind of problem.
It seems private. 404.
Thanks for the confirmation that this approach has been useful to
I appreciate the insight. I think that the next step here is to
Yes, I think this is the problem with uprobes/kprobes: function names may change. I'm looking forward to seeing libuv support these capabilities (tracing the event loop).
It turns out Node.js recently removed the dtrace probes:
Anyway, I'll investigate it. For further updates, subscribe to: #535
Actually, I want to work on this as well. The "metrics" API seems to have some overhead from an additional "epoll_wait" (libuv-issues-3937). Using eBPF instead is a great idea, but I am not sure how much work it would take.
Hi -- I'd like to share an eBPF use-case which I found quite useful in
my day job. (Please let me know if there is a better forum to do this.)
We were experiencing long (10s+) event loop blockages which were
affecting our performance. We were alerted to this issue by
node-blocked.
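For readers unfamiliar with this kind of alerting: in-process detectors commonly rely on timer drift, where a repeating timer that fires much later than scheduled implies the loop was busy in between. The following is a minimal sketch of that idea, not node-blocked's actual implementation. The name `createBlockDetector` and its options are hypothetical, and the injectable `now` clock is only there to make the logic easy to check.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper (not part of node-blocked): reports how late a
// repeating timer fires. A large delay means the event loop was blocked.
function createBlockDetector({
  intervalMs = 100,          // how often the timer should fire
  thresholdMs = 50,          // lateness above this counts as a blockage
  now = () => performance.now(), // injectable clock (assumption, for testing)
  onBlocked,                 // callback receiving the delay in milliseconds
}) {
  let last = now();

  function tick() {
    const current = now();
    // Time elapsed beyond the scheduled interval is the observed delay.
    const delay = current - last - intervalMs;
    last = current;
    if (delay > thresholdMs) onBlocked(delay);
    return delay;
  }

  const timer = setInterval(tick, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring

  return { tick, stop: () => clearInterval(timer) };
}
```

Note that a detector like this runs on the blocked thread itself, so it can only report a blockage after the loop resumes. It says how long the loop was stuck, not which code was responsible.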
We had two initial ideas:
blockages only happened rarely.
async_hooks (specifically blocked-at): but the overhead was unacceptable.
The solution we landed on used eBPF, particularly bpftrace:
We ran this script on a bunch of machines, and eventually it spit out a
coredump. We opened the coredump with llnode and found the cause
via v8 backtrace.
Questions for this group
I will also create a separate issue about how llnode is no longer
supported, but I think this is still useful functionality. For
example, you can use it to get histograms of event loop blockages
which is independently useful for workload characterization.
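To illustrate what such a histogram looks like, here is a minimal sketch that buckets blockage durations into power-of-two ranges, similar in spirit to the output of bpftrace's hist(). The function name `log2Histogram` and the choice of milliseconds are assumptions made for this example. In practice the durations would come from whatever probe or timer measures the blockage.

```javascript
// Hypothetical helper: groups durations (in ms) into power-of-two buckets.
// Each key is the lower bound of a bucket [key, 2 * key); durations below
// 1 ms are all counted under key 0.
function log2Histogram(durationsMs) {
  const buckets = new Map();
  for (const d of durationsMs) {
    let lower = 0;
    if (d >= 1) {
      // Find the largest power of two that does not exceed d.
      lower = 1;
      while (lower * 2 <= d) lower *= 2;
    }
    buckets.set(lower, (buckets.get(lower) || 0) + 1);
  }
  // Return buckets ordered by lower bound for readable output.
  return new Map([...buckets.entries()].sort((a, b) => a[0] - b[0]));
}

// Example: print one line per bucket.
for (const [lower, count] of log2Histogram([0.5, 1, 3, 5, 7, 12000])) {
  console.log(`>= ${lower} ms: ${count}`);
}
```

For an in-process view without eBPF, Node's `perf_hooks.monitorEventLoopDelay()` also produces a delay histogram, though it samples with a timer rather than tracing each loop iteration.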
On Node's side, it would be nicer if event loop stages were exposed as
stable tracepoints instead of uprobes. This would make it easier for
people to package similar tools.
One could also imagine a weaker version of this functionality being
built into NodeJS: collecting a JavaScript backtrace when the event
loop has been blocked for too long. From talking with other
engineers, I've heard that attributing event loop blockages is a
common problem when running NodeJS at scale. Is there interest in
having this in NodeJS core?