introduce USDT for the perf tracing #1532

gangxie112 · 2024-12-06T03:16:08Z

Hello,

I see that we added some profiler tracing points in a recent release, but they are not very practical for online/training systems.
So do we have any plan to introduce other mature tools, like USDT/eBPF? The code change and the performance impact are very small. My only concern is that an active probe traps into the kernel via INT3, and I am not sure about the performance impact with higher-speed NICs, such as 400Gb. I tested it with my 40Gb NIC and a 3060 GPU and observed no performance degradation.

Thanks,

sjeaugey · 2024-12-06T08:18:54Z

Hi,

The networking aspect in NCCL is far from being our only focus. NCCL bridges together NVLink technologies and networking technologies through the use of CUDA kernels running on the GPU, and the profiler is here to allow people to see what happens in the different NCCL layers. But it's not a network analysis tool. We're looking into adding profiler entry points in the network plugins, but that's not there yet.

Also, I'm not very familiar with USDT/eBPF but it doesn't look like it was developed for high speed RDMA NICs which don't even go through kernel space to send over the network. Looks like it was done for TCP/IP traffic.

In any case, if you have some insight as to how we could improve NCCL to make it compatible with some other tools, feel free to let us know. We can consider it when we improve our profiling capabilities.

gangxie112 · 2024-12-09T01:42:54Z

No, with eBPF the RDMA traffic is not affected. An inactive USDT probe is just a NOP instruction. Once it is activated, the NOP is replaced with INT3, which traps into the kernel to execute the eBPF program. With this, we can trace the user-level code, not the RDMA driver.
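To make the NOP-based mechanism concrete, here is a minimal sketch of a USDT probe site in C. It assumes the `<sys/sdt.h>` header shipped with SystemTap; when the header is missing, the sketch falls back to a no-op macro so it still compiles. The provider name `nccl_sketch`, the probe name `post_op`, and the function `sketch_post_op` are hypothetical and are not NCCL symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real USDT macro when SystemTap's header is installed;
 * otherwise define a no-op stand-in so the sketch still builds. */
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#    define SKETCH_HAVE_SDT 1
#  endif
#endif
#ifndef SKETCH_HAVE_SDT
#  define DTRACE_PROBE2(provider, name, a, b) \
      do { (void)(a); (void)(b); } while (0)
#endif

/* Hypothetical function, not an NCCL symbol.
 * Returns the number of bytes it was asked to post. */
size_t sketch_post_op(int channel, size_t nbytes) {
    assert(nbytes > 0);
    /* With the real header this emits a single NOP plus an ELF note
     * describing the probe and its two arguments. Nothing traps until
     * a tracer attaches and patches the NOP. */
    DTRACE_PROBE2(nccl_sketch, post_op, channel, nbytes);
    return nbytes;
}
```

When built against the real header, the probe should show up in the notes section printed by `readelf -n`, and a tracer such as bpftrace could attach with a probe spec of the form `usdt:/path/to/binary:nccl_sketch:post_op` (the path here is a placeholder).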

sjeaugey · 2024-12-09T08:09:45Z

Oh, but then every trace point would be an interrupt which would cause a switch to kernel mode. That seems very slow relative to the kind of message rate and latency we have with RDMA operations.
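A rough way to reason about this concern is to multiply the probe hit rate by the per-trap cost. The helper below is only a back-of-the-envelope sketch; the function name is hypothetical and every input is an assumption supplied by the caller, not a measurement from NCCL or from any particular CPU.

```c
#include <assert.h>

/* Fraction of one CPU core spent inside probe traps.
 * msgs_per_sec:   how often the instrumented path runs (assumed)
 * probes_per_msg: active probe sites hit per message (assumed)
 * trap_cost_ns:   cost of one trap plus eBPF program, in ns (assumed) */
double probe_cpu_fraction(double msgs_per_sec, double probes_per_msg,
                          double trap_cost_ns) {
    assert(msgs_per_sec >= 0.0 && probes_per_msg >= 0.0 && trap_cost_ns >= 0.0);
    return msgs_per_sec * probes_per_msg * trap_cost_ns * 1e-9;
}
```

For illustration, assuming a trap cost of 1000 ns: one probe on a path that runs a million times per second gives a fraction of about 1.0, i.e. an entire core, while the same probe on a path that runs a thousand times per second gives about 0.001.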

sjeaugey · 2024-12-09T09:38:29Z

Actually, depending on the CPU type, you may also need to pass iommu=pt as a kernel boot option.

gangxie112 · 2024-12-11T06:01:47Z

Yes, that is exactly my concern. I have been looking into user-level BPF, such as bpftime, but it seems hard to integrate.
On the other hand, if we only hook uprobes on the control path, such as the proxy ops, I think the cost may be acceptable, because by the time we want to probe the NCCL process, it is most likely already suffering a significant performance degradation. I will run some performance tests on this.
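The control-path-only idea can be sketched as follows: the probe sits in a step that runs once per operation, and the per-chunk loop stays uninstrumented. This again assumes SystemTap's `<sys/sdt.h>`, with a no-op fallback when it is absent. All names (`sketch_proxy_op_begin`, `sketch_send_chunk`, `sketch_run_op`, `sketch_probe_hits`) are hypothetical stand-ins and do not correspond to NCCL's proxy code; the counter exists only to show how often the probe site is reached.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#    define SKETCH_HAVE_SDT 1
#  endif
#endif
#ifndef SKETCH_HAVE_SDT
#  define DTRACE_PROBE1(provider, name, a) do { (void)(a); } while (0)
#endif

/* Counts how often the probe site is reached (demonstration only). */
unsigned long sketch_probe_hits = 0;

/* Hypothetical control-path step: runs once per operation,
 * so it is the only place that carries a probe. */
static void sketch_proxy_op_begin(size_t nchunks) {
    DTRACE_PROBE1(nccl_sketch, proxy_op_begin, nchunks);
    sketch_probe_hits++;
}

/* Hypothetical data-path step: runs once per chunk and is
 * intentionally left without a probe. */
static unsigned long sketch_send_chunk(unsigned long acc, size_t i) {
    return acc + (unsigned long)i;
}

/* One operation: a single probed control step, then the unprobed loop. */
unsigned long sketch_run_op(size_t nchunks) {
    assert(nchunks > 0);
    sketch_proxy_op_begin(nchunks);
    unsigned long acc = 0;
    for (size_t i = 0; i < nchunks; i++)
        acc = sketch_send_chunk(acc, i);
    return acc;
}
```

With this placement, an attached tracer would pay one trap per operation regardless of how many chunks the operation moves.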
