Issues with BPF packet scheduler #336

Open
FoRward-999 opened this issue Jan 16, 2023 · 36 comments
Labels
bpf bug sched packets scheduler

Comments

@FoRward-999

Hello,

I'm currently interested in the packet scheduler using BPF and have done some experiments. However, I've run into some issues when using the bpf_rr and bpf_red schedulers. One of three things happens:

  1. The client keeps sending data to the server, but the server has received the expected amount of data.
  2. The application is stuck, and the ppid becomes 1, which can only be solved by restarting the system. (It seems that the scheduler tries to send data to an unestablished subflow at this time)
  3. RST might appear during data transmission.

Problem 1 usually occurs, sometimes problem 2 or 3.

However, using the bpf_first scheduler doesn't cause any of the problems above.

Setup

My experiment setup is as follows:

Client

forward@forward-virtual-machine:~$ ip mptcp endpoint show
192.168.2.30 id 1 subflow fullmesh
192.168.2.29 id 2 subflow fullmesh
forward@forward-virtual-machine:~$ ip mptcp limits show
add_addr_accepted 1 subflows 3
forward@forward-virtual-machine:~$ hostnamectl
 Static hostname: forward-virtual-machine
       Icon name: computer-vm
         Chassis: vm
      Machine ID: c74952654b4c49f2b200fed4ca8eb8e1
         Boot ID: 8a19a42e95bf45178099968557309159
  Virtualization: vmware
Operating System: Ubuntu 22.04.1 LTS
          Kernel: Linux 6.1.0+
    Architecture: x86-64
 Hardware Vendor: VMware, Inc.
  Hardware Model: VMware Virtual Platform

Server

forward@forward-virtual-machine:~$ ip mptcp endpoint show
192.168.2.31 id 1 signal
192.168.2.32 id 2 signal
forward@forward-virtual-machine:~$ ip mptcp limits show
add_addr_accepted 1 subflows 3
forward@forward-virtual-machine:~$ hostnamectl
 Static hostname: forward-virtual-machine
       Icon name: computer-vm
         Chassis: vm
      Machine ID: 692e3a02614143b68f61aa642054ddcf
         Boot ID: 570c63dd032c4bd08c42be01322e79fb
  Virtualization: vmware
Operating System: Ubuntu 22.04.1 LTS
          Kernel: Linux 6.1.0+
    Architecture: x86-64
 Hardware Vendor: VMware, Inc.
  Hardware Model: VMware Virtual Platform

We tried to understand this problem more deeply through the BPF tracing tool.

Through stackcount, we found the following problem:

sk_reset_timer is being called repeatedly.

We found this problem very similar to the one mentioned in #52:

spurious retransmission and the mptcp_worker tries to update the MPTCP retransmission timer

I was wondering if this was the same issue or I made some mistakes?

I'm happy to share more information about the issue.
Thank you in advance.

@matttbe changed the title from "Issues with BPF pacekt scheduler" to "Issues with BPF packet scheduler" Jan 16, 2023
@matttbe added the bug label Jan 16, 2023
@matttbe
Member

matttbe commented Jan 16, 2023

Hi @FoRward-999

Thank you for having reported these issues.

The BPF packet schedulers are under development, see #75. It is good to have some feedback!

The client keeps sending data to the server, but the server has received the expected amount of data.

Do you mean the data are being re-sent? Or the MPTCP data fin?
Do you have a packet trace by chance? (you need to zip it to share it on GitHub)
What kind of transfer were you doing? Can you share a bit more details about your test environment? (client / server programs, ran with which options, etc.)

The application is stuck, and the ppid becomes 1, which can only be solved by restarting the system. (It seems that the scheduler tries to send data to an unestablished subflow at this time)

Which application are you talking about here?

RST might appear during data transmission.

What other options are in this RST? A packet trace might be useful here.

(I guess you don't have these issues with the default (non-BPF) packets schedulers, right?)

sk_reset_timer is being called repeatedly.

Good question. @geliangtang (who created these BPF packets schedulers with reviews from @mjmartineau ), any ideas where this could come from? Do you mind looking at this issue please? :)

We found this problem very similar to the one mentioned in #52:

spurious retransmission and the mptcp_worker tries to update the MPTCP retransmission timer

I was wondering if this was the same issue or I made some mistakes?

I'm not sure I see the link with #52 (MP_FAIL support). Are you talking about this commit? 64b9cea ("mptcp: fix spurious retransmissions") which contains #52 in the call trace because it was the 52nd build? GitHub likes to add links when there is a # somewhere in a commit message :)

This is quite old, I don't think this is linked.

@FoRward-999
Author

Hello @matttbe, thank you for your reply.

First of all, sorry for the late response.

Do you mean the data are being re-sent? Or the MPTCP data fin?
Do you have a packet trace by chance? (you need to zip it to share it on GitHub)

Yes, the data are being re-sent.

The packet trace on the client side is linked here:
retransmission_rr_bug.pcapng.zip

It seems that after the subflows are established, the server returns TCP ACK instead of MPTCP ACK. We didn't spot this problem before.

What kind of transfer were you doing? Can you share a bit more details about your test environment? (client / server programs, ran with which options, etc.)

Which application are you talking about here?

We use a simple single-threaded loop sending program written by ourselves.

To avoid ambiguity, we did the same experiments using iperf3 with mptcpize.

On the server side, we did:

mptcpize run iperf3 -s

On the client side, we did:

mptcpize run iperf3 -c 192.168.2.31 -n 1GB

We encountered problem 2 and problem 3 which were mentioned before.

Problem 2:

Client side:

iperf3 is stuck, and the command ps -ef is shown below:

forward@forward-virtual-machine:~/CODING/WorkSpace/Scheduler$ ps -ef |grep iperf3
forward     2542       1 99 17:03 ?        00:03:02 iperf3 -c 192.168.2.31 -n 1GB
forward     2673    2336  0 17:06 pts/1    00:00:00 grep --color=auto iperf3
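In the listing above, the third column of the first row is the parent PID: the stuck iperf3 has been re-parented to PID 1. A minimal sketch for spotting such processes, assuming the default column order of ps -ef (UID, PID, PPID, C, STIME, TTY, TIME, CMD); the function name orphans is made up for this example and is not part of any tool mentioned in this thread.

```shell
# orphans NAME: read `ps -ef` output on stdin and print the PID of every
# process whose command is NAME and whose parent PID is 1.
# Hypothetical helper; assumes the default `ps -ef` column order.
orphans() {
    awk -v name="$1" '$3 == 1 && $8 == name { print $2 }'
}
```

Typical use would be: ps -ef | orphans iperf3. Matching on the command column rather than the whole line keeps the grep process itself out of the result.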

Server side:

forward@forward-virtual-machine:~$ mptcpize run iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.2.30, port 57528
[  5] local 192.168.2.31 port 5201 connected to 192.168.2.30 port 57542
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   128 KBytes  1.05 Mbits/sec
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec

The packet trace on the client side is linked here:
iperf3_rr_stuck.pcapng.zip

Problem 3:

Server side:

forward@forward-virtual-machine:~$ mptcpize run iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 0:200:4956:0:400:0:fe7f:0, port 0
iperf3: error - unable to receive control message: Connection reset by peer
-----------------------------------------------------------

The packet trace on the client side is linked here:
iperf3_rr_RST.pcapng.zip

According to packet traces above, the server returned TCP ACK instead of MPTCP ACK.

(I guess you don't have these issues with the default (non-BPF) packets schedulers, right?)

Yes, I don't have these issues with the default (non-BPF) packets schedulers. And the BPF packets schedulers run well when there is only one master flow.

Are you talking about this commit? 64b9cea ("mptcp: fix spurious retransmissions") which contains #52 in the call trace because it was the 52nd build?

Yes, this commit 64b9cea. Sorry for the confusion.

@matttbe
Member

matttbe commented Feb 7, 2023

Hello @FoRward-999

Hello @matttbe, thank you for your reply.

First of all, sorry for the late response.

Sorry, my turn.

Do you mean the data are being re-sent? Or the MPTCP data fin?
Do you have a packet trace by chance? (you need to zip it to share it on GitHub)

Yes, the data are being re-sent.

The packet trace on the client side is linked here: retransmission_rr_bug.pcapng.zip

I had a quick look and in the traces:

  • the receiver only sends ACK without MPTCP options
  • the receiver sends a FIN after ~10 sec
  • the sender still has some data to send apparently
  • after ~1 second, the receiver sends a zero window, a bit like the application receiving the data stopped processing data (because the latency is very low, with GRO/TSO, a lot of data are quickly exchanged)
  • then the sender periodically retries sending new data

In other words (though I only had a quick look at the packet traces), it doesn't look like a packet scheduler issue. I don't understand why the receiver doesn't include the MPTCP options. Maybe a fallback has been done and the BPF packet scheduler are not handling that properly?

It could be good to look at the MPTCP counters with nstat.
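A minimal sketch of one way to collect those counters, assuming nstat from iproute2 and its usual "name value rate" output layout; the helper name mptcp_counters is hypothetical.

```shell
# mptcp_counters: read nstat output on stdin and print every MPTCP MIB
# counter with a non-zero value as "name value".
# Hypothetical helper; assumes nstat's "name value rate" column layout.
mptcp_counters() {
    awk '$1 ~ /^MPTcpExt/ && $2 > 0 { print $1, $2 }'
}
```

It could be run on both hosts right after a failing transfer, for example: nstat -az | mptcp_counters. Here -a makes nstat ignore its history so that absolute values are shown, and -z includes counters that are still zero before the filter drops them.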

I guess this was with your custom program, right? Maybe it is easy for you to track if the sender got the notification the receiver side has closed the connection? And check what the receiver is doing after the close?

It seems that after the subflows are established, the server returns TCP ACK instead of MPTCP ACK. We didn't spot this problem before.

Indeed, surprising.

What kind of transfer were you doing? Can you share a bit more details about your test environment? (client / server programs, ran with which options, etc.)
Which application are you talking about here?

We use a simple single-threaded loop sending program written by ourselves.

To avoid ambiguity, we did the same experiments using iperf3 with mptcpize.

(...)

We encountered problem 2 and problem 3 which were mentioned before.

Problem 2:

Client side:

iperf3 is stuck, and the command ps -ef is shown below:

forward@forward-virtual-machine:~/CODING/WorkSpace/Scheduler$ ps -ef |grep iperf3
forward     2542       1 99 17:03 ?        00:03:02 iperf3 -c 192.168.2.31 -n 1GB
forward     2673    2336  0 17:06 pts/1    00:00:00 grep --color=auto iperf3

(...)

The packet trace on the client side is linked here: iperf3_rr_stuck.pcapng.zip

Here, we can see the MPTCP options in the ACKs.
There the sender stops for no visible reason (the client is not asking to stop).

@geliangtang did you experience such issues when doing your tests?

Problem 3:

Server side:

forward@forward-virtual-machine:~$ mptcpize run iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 0:200:4956:0:400:0:fe7f:0, port 0
iperf3: error - unable to receive control message: Connection reset by peer
-----------------------------------------------------------

The packet trace on the client side is linked here: iperf3_rr_RST.pcapng.zip

According to packet traces above, the server returned TCP ACK instead of MPTCP ACK.

Indeed, strange. We might need nstat as well here.

(I guess you don't have these issues with the default (non-BPF) packets schedulers, right?)

Yes, I don't have these issues with the default (non-BPF) packets schedulers. And the BPF packets schedulers run well when there is only one master flow.

OK, good to know.

FYI, we ended up discussing the packet scheduler API (and the behaviour of the default one) further, and more changes are needed: #350 (quite a bit... if you are interested ;-) )

Are you talking about this commit? 64b9cea ("mptcp: fix spurious retransmissions") which contains #52 in the call trace because it was the 52nd build?

Yes, this commit 64b9cea. Sorry for the confusion.

👍

@FoRward-999
Author

Maybe a fallback has been done and the BPF packet scheduler are not handling that properly?

It could be good to look at the MPTCP counters with nstat.

Yes, I think it's related to the BPF packet scheduler.

I did as you mentioned. Here is the result of the nstat:

#kernel
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtMPJoinSynRx             3                  0.0
MPTcpExtMPJoinAckRx             3                  0.0
MPTcpExtOFOQueueTail            2                  0.0
MPTcpExtOFOQueue                3                  0.0
MPTcpExtEchoAdd                 1                  0.0
MPTcpExtRcvPruned               77                 0.0
MPTcpExtRcvWndShared            1                  0.0

I guess this was with your custom program, right? Maybe it is easy for you to track if the sender got the notification the receiver side has closed the connection? And check what the receiver is doing after the close?

Yes, it's our custom program.

The receiver sends a FIN after ~10 sec because we manually close the server with Ctrl + C. Otherwise, the server keeps receiving more data than we expected.

The sender didn't get the notification the receiver side has closed the connection.

Indeed, strange. We might need nstat as well here.

After several tries with bpf_rr scheduler, we were only able to reproduce the case of problem 2. Sorry for that.

iperf3 is stuck, as well as the whole client virtual machine.

Here is the result of the nstat in this case:

#kernel
MPTcpExtMPCapableSYNRX          2                  0.0
MPTcpExtMPCapableACKRX          2                  0.0
MPTcpExtMPTCPRetrans            1                  0.0
MPTcpExtMPJoinSynRx             2                  0.0
MPTcpExtMPJoinAckRx             1                  0.0
MPTcpExtDuplicateData           1                  0.0
MPTcpExtEchoAdd                 2                  0.0
MPTcpExtRcvWndShared            1                  0.0

Given the values of MPTcpExtMPJoinSynRx and MPTcpExtMPJoinAckRx, my guess is that the BPF scheduler tries to transmit data on a subflow that has not been established.
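The comparison behind this guess can be made mechanical. A minimal sketch, assuming the nstat layout shown above; the function name join_gap is made up for illustration. A non-zero gap only shows that an MP_JOIN handshake was not completed as seen by this host. It does not by itself show which subflow the scheduler picked.

```shell
# join_gap: read nstat-style output on stdin and report how many MP_JOIN
# SYNs were received without a matching final MP_JOIN ACK.
# Hypothetical helper; assumes "name value rate" columns as shown above.
join_gap() {
    awk '
        $1 == "MPTcpExtMPJoinSynRx" { syn = $2 }
        $1 == "MPTcpExtMPJoinAckRx" { ack = $2 }
        END { printf "syn=%d ack=%d incomplete=%d\n", syn, ack, syn - ack }
    '
}
```

Fed with the second listing above (2 SYNs, 1 ACK), it reports one incomplete join. With the first listing (3 and 3), it reports none.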

FYI, we ended up discussing the packet scheduler API (and the behaviour of the default one) further, and more changes are needed: #350 (quite a bit... if you are interested ;-) )

Thank you for reminding.

@matttbe mentioned this issue Feb 23, 2023
@AlejandraOliver

Hi, I'm currently trying to set up MPTCP in a virtual machine with Ubuntu 22.04 and Linux kernel 6.1.18. How can I select the scheduler that I want? When I compile the kernel I don't see an option to select it, and the net.mptcp.scheduler sysctl is not there either. I would like to test with the round-robin scheduler (bpf_rr). Could you tell me how to do it? Maybe I have to compile the BPF module?
Thanks in advance.

@AlejandraOliver

Hello, I would be interested to know how you selected the scheduler you wanted when testing. Could you give me some guidance?

@matttbe
Member

matttbe commented Apr 6, 2023

Hi @AlejandraOliver,

This feature is still in progress, see #75 (and currently blocked by #350).

If you are interested in testing and improving it, feel free to compile the kernel using the source code from our repository, the export branch.

There is not much documentation for the moment. It is not ideal but info can be extracted from the new BPF selftests: https://github.com/multipath-tcp/mptcp_net-next/commits/export

@FoRward-999
Author

Hello @AlejandraOliver ,

To enable the bpf scheduler, you can follow these steps:

  1. First, compile the kernel using the latest source code available on the export branch.

  2. Next, compile and load the eBPF scheduler program, for example bpf_rr.c.

  3. Lastly, use the sysctl command to select the scheduler.

I hope this will help.
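Steps 2 and 3 can be sketched as a small shell helper, based on the commands quoted elsewhere in this thread. This is an illustration under assumptions, not a verified procedure: the names run and load_sched are made up, the kernel must be built from the export branch, and the source file may need extra include paths. The thread writes the compiler flag as -target -bpf, while the usual clang spelling for the BPF target is -target bpf, which is what the sketch uses. Running clang under sudo, as the thread does, should not be required.

```shell
# Sketch of steps 2 and 3; nothing runs until load_sched is called.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# load_sched SRC NAME (hypothetical helper):
# compile SRC to a BPF object, register its struct_ops, select scheduler NAME.
load_sched() {
    src=$1
    name=$2
    obj=${src%.c}.o
    run clang -O2 -g -target bpf -c "$src" -o "$obj" &&
    run sudo bpftool struct_ops register "$obj" &&
    run sudo sysctl -w "net.mptcp.scheduler=$name"
}
```

For a first look, DRY_RUN=1 followed by load_sched mptcp_bpf_rr.c bpf_rr prints the three commands without touching the system. The sysctl value bpf_rr is the one used later in this thread.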

@AlejandraOliver

Hi @FoRward-999 ,
after having compiled the kernel, I have compiled the file mptcp_bpf_rr.c using the 'clang' tool. Then, when loading it into the kernel I get the following error:

(screenshot of the error, not available as text)

I would like to be able to test with this scheduler in kernel 6.1, so I would be grateful for any pointers on how to solve it.

Thanks.

@matttbe
Member

matttbe commented Apr 11, 2023

I would like to be able to test with this scheduler in kernel 6.1, so I would be grateful for any pointers on how to solve it.

@AlejandraOliver : it can only work with a kernel compiled from our export branch, not the v6.1

@AlejandraOliver

The kernel of your export branch is v6.3?

@matttbe
Member

matttbe commented Apr 11, 2023

The kernel of your export branch is v6.3?

It is on top of the net-next branch which is on top of Linus branch.
The eBPF features are only in our tree, not in net-next nor Linus (nor v6.3).

@AlejandraOliver

I have tried with the kernel of your export branch: I compiled it and then compiled mptcp_bpf_rr.c with clang. This creates an object file (mptcp_bpf_rr.o). Now, how do you load it into the kernel? I can't use the bpftool tool, and when I try to do it with 'insmod' I get errors.

Thanks.

@matttbe
Member

matttbe commented Apr 12, 2023

I can't use the bpftool tool

Why? Please share the commands that you used and the output. Please share that in text and not in a screenshot.

Also, what's the output of:

  • uname -a
  • sudo dmesg | grep -i mptcp
  • sysctl net.mptcp

@geliangtang / @FoRward-999 do you mind adding the commands that you used to compile and load one of the eBPF MPTCP Packet scheduler program please?

@AlejandraOliver

AlejandraOliver commented Apr 12, 2023

$ bpftool prog load /home/test2/mptcp_net-next/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.o /sys/fs/bpf/mptcp_bpf_rr

WARNING: bpftool not found for kernel 6.3.0
You may need to install the following packages for this specific kernel:
linux-tools-6.3.0-rc5+
linux-cloud-tools-6.3.0-rc5+
You may also want to install one of the following packages to keep up to date:
linux-tools-rc5+
linux-cloud-tools-rc5+


The output of the other commands are these:

$ uname -a
Linux test2-VirtualBox 6.3.0-rc5+ #3 SMP PREEMPT_DYNAMIC Wed Apr 12 05:45:56 CEST 2023 x86_64 x86_64 x86_64 GNU/Linux

$ sudo dmesg | grep -i mptcp
[ 0.445702] MPTCP token hash table entries: 8192 (order: 5, 196608 bytes, linear)

$ sysctl net.mptcp
net.mptcp.add_addr_timeout = 120
net.mptcp.allow_join_initial_addr_port = 1
net.mptcp.checksum_enabled = 0
net.mptcp.enabled = 1
net.mptcp.pm_type = 0
net.mptcp.scheduler = default
net.mptcp.stale_loss_cnt = 4


When I try to download some of the packages suggested for bpftool it tells me that they don't exist (I downloaded the kernel from your export branch).

@matttbe
Member

matttbe commented Apr 12, 2023

@AlejandraOliver if I'm not mistaken, bpftool is just displaying a warning because you didn't use the one from the kernel 6.3. But I don't think it is failing (or we are missing the end of the output).

Can you not see the MPTCP packet scheduler using: sudo bpftool prog list

You should be able to set the new packet scheduler: sudo sysctl -w net.mptcp.scheduler=bpf_rr


To avoid the warnings with bpftool, you can compile bpftool, it should be easy, something like:

cd /home/test2/mptcp_net-next/tools/bpf/bpftool
make
sudo make install

Or follow instructions from: https://github.com/libbpf/bpftool

@AlejandraOliver

AlejandraOliver commented Apr 12, 2023

when I try to compile bpftool to be able to load the mptcp_bpf_rr.o that I created with clang tool into the kernel, I received the following error:

~/mptcp_net-next/tools/bpf/bpftool$ make


Auto-detecting system features:
...                         clang-bpf-co-re: [ on  ]
...                                    llvm: [ on  ]
...                                  libcap: [ OFF ]
...                                  libbfd: [ OFF ]


MKDIR   /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/
make[1]: Entering directory '/home/test2/mptcp_net-next/tools/lib/bpf'
  GEN     /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/bpf_helper_defs.h
  MKDIR   /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/staticobjs/
  CC      /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/staticobjs/libbpf.o

In file included from /usr/include/asm-generic/int-ll64.h:12,
                 from /usr/include/asm-generic/types.h:7,
                 from /usr/include/x86_64-linux-gnu/asm/types.h:1,
                 from /home/test2/mptcp_net-next/tools/include/linux/types.h:13,
                 from /home/test2/mptcp_net-next/tools/include/linux/compiler.h:105,
                 from /home/test2/mptcp_net-next/tools/include/linux/err.h:5,
                 from libbpf.c:29:

/home/test2/mptcp_net-next/tools/include/asm-generic/bitsperlong.h:14:2: error: #error Inconsistent word size. Check asm/bitsperlong.h
   14 | #error Inconsistent word size. Check asm/bitsperlong.h
      |  ^~~~~
make[2]: *** [/home/test2/mptcp_net-next/tools/build/Makefile.build:98: /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/staticobjs/libbpf.o] Error 1
make[1]: *** [Makefile:157: /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/staticobjs/libbpf-in.o] Error 2
make[1]: Leaving directory '/home/test2/mptcp_net-next/tools/lib/bpf'
make: *** [Makefile:46: /home/test2/mptcp_net-next/tools/bpf/bpftool/libbpf/libbpf.a] Error 2

If I do sudo bpftool prog list, I still get:

WARNING: bpftool not found for kernel 6.3.0
You may need to install the following packages for this specific kernel:

    linux-tools-6.3.0-rc5+
    linux-cloud-tools-6.3.0-rc5+

You may also want to install one of the following packages to keep up to date:

    linux-tools-rc5+
    linux-cloud-tools-rc5+

@matttbe
Member

matttbe commented Apr 12, 2023

when I try to compile bpftool to be able to load the mptcp_bpf_rr.o that I created with clang tool into the kernel, I received the following error:

How did you compile the kernel before?
You are supposed to compile bpftool the same way: if you used additional parameters with make, you should do the same here.

@AlejandraOliver

AlejandraOliver commented Apr 12, 2023

Hi @matttbe
To compile the kernel I use these commands:

sudo make clean
make menuconfig
make -j`nproc`
make -j`nproc` bindeb-pkg

These create 4 .deb packages; I install them all and reboot the machine. One of the packages (dbg...) has errors, but I don't think this is the problem.


If I use the same command (make -j`nproc`) with bpftool, I still get the errors.


Maybe it is the way I compile and install the kernel. Which commands should I use to compile it instead?

@matttbe
Member

matttbe commented Apr 12, 2023

To compile the kernel I use these commands:

@AlejandraOliver mmh, strange, the error seems to suggest there is an issue with the kernel config or with cross compilation.

It works for me with make -C tools/bpf/bpftool (after having enabled CONFIG_DEBUG_INFO_BTF).

Maybe try to clean, re-compile the kernel, then bpftool?

If it is not enough, the best is to look at https://github.com/libbpf/bpftool, try to compile bpftool from there and report issue there.

@AlejandraOliver

AlejandraOliver commented Apr 12, 2023

@matttbe : Could you tell me which commands you use to compile the kernel?

Maybe that's the problem, because I have enabled CONFIG_DEBUG_INFO_BTF (CONFIG_DEBUG_INFO_BTF=y), and before compiling the kernel I installed the required dependencies listed in Documentation/process/changes.rst.

@matttbe
Member

matttbe commented Apr 12, 2023

The best is probably to ask bpftool maintainers :-/

I compiled the kernel with make -j(...) O=(...) but that's not really different from what you did.

You can also retry from scratch (make mrproper or git clean -ffdx) (or ask the bpftool maintainers for help)

@AlejandraOliver

@FoRward-999 could you share the commands that you use to compile and load the BPF packet scheduler in the kernel?

Thanks in advance.

@FoRward-999
Author

Hello @AlejandraOliver
the commands are:
sudo clang -O2 -target -bpf -g -c mptcp_bpf_rr.c -o mptcp_bpf_rr.o
sudo bpftool struct_ops register mptcp_bpf_rr.o

@AlejandraOliver

Hi @FoRward-999,
first of all, thank you for your last response. I have tried to do the first command (clang) but I receive this error:

test2@test2-VirtualBox:~/mptcp_net-next/tools/testing/selftests/bpf/progs$ sudo clang -O2 -target -bpf -g -c mptcp_bpf_rr.c -o mptcp_bpf_rr.o

In file included from mptcp_bpf_rr.c:4:
In file included from /usr/include/linux/bpf.h:11:
/usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
#include <asm/types.h>
^~~~~~~~~~~~~
1 error generated.


I don't know why I get this error since I guess you didn't get it. The asm-types.h file is in the mptcp_net-next folder. How can I fix it?

@FoRward-999
Author

Try this:
sudo ln -s /usr/include/x86_64-linux-gnu/asm /usr/include/asm
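The symlink changes /usr/include for every build on the machine. A less invasive alternative, sketched here under the assumption of a Debian/Ubuntu multiarch layout (where asm/types.h lives under /usr/include/ARCH-linux-gnu), is to add that directory to the include path for this one compile. The function name bpf_cflags is hypothetical.

```shell
# bpf_cflags: print clang flags that add the multiarch include directory
# (where Debian/Ubuntu keep asm/types.h) without touching /usr/include.
# Hypothetical helper; assumes a Debian/Ubuntu-style multiarch layout.
bpf_cflags() {
    arch_dir="/usr/include/$(uname -m)-linux-gnu"
    echo "-O2 -g -target bpf -I$arch_dir"
}
```

It would be used as: clang $(bpf_cflags) -c mptcp_bpf_rr.c -o mptcp_bpf_rr.o (the command substitution is left unquoted on purpose so that the flags are split into separate arguments).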

@AlejandraOliver

AlejandraOliver commented Apr 15, 2023

Thank you very much!! This resolved the problem. When I do sudo bpftool struct_ops register mptcp_bpf_rr.o I get the following line:
Registered mptcp_sched_ops rr id 4.

The thing is that if I do sudo bpftool prog show I don't see it (id 4), but in /sys/fs/bpf/ it has created init, get_subflow, etc. So, am I supposed to be able to set net.mptcp.scheduler=bpf_rr now, and will it work when running the kselftests?

@FoRward-999
Author

Yes, I think so.

The thing is that if I do sudo bpftool prog show I don't see it (id 4)

Try: sudo bpftool struct_ops show

@AlejandraOliver

AlejandraOliver commented Apr 15, 2023

I've got this output when running sudo bpftool struct_ops show:

4: rr              mptcp_sched_ops                
7: red             mptcp_sched_ops

Is that correct? So I can do the kselftest like mptcp_connect.sh and mptcp_join.sh and check it with tcpdump?

@FoRward-999
Author

Yes, it's correct. I didn't do the kselftest, you can give it a try.
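A small sketch for checking that listing from a script, assuming the three-column "id: name type" layout shown above; the helper name list_mptcp_scheds is made up. The names it prints are the struct_ops map names (rr, red), whereas this thread selects the scheduler through sysctl as bpf_rr, so the two evidently differ.

```shell
# list_mptcp_scheds: read `bpftool struct_ops show` output on stdin and
# print the name of every entry whose type is mptcp_sched_ops.
# Hypothetical helper; assumes the "id: name type" layout shown above.
list_mptcp_scheds() {
    awk '$3 == "mptcp_sched_ops" { print $2 }'
}
```

Typical use would be: sudo bpftool struct_ops show | list_mptcp_scheds, followed by sysctl -n net.mptcp.scheduler to confirm which scheduler is currently selected.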

@AlejandraOliver

Ok, thanks!!

@AlejandraOliver

AlejandraOliver commented Apr 19, 2023

Hi @FoRward-999 , were you able to fix the problem with bpf_rr?:

Problem 3:

Server side:

forward@forward-virtual-machine:~$ mptcpize run iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 0:200:4956:0:400:0:fe7f:0, port 0
iperf3: error - unable to receive control message: Connection reset by peer
-----------------------------------------------------------

I have the same output in iperf3 when using this scheduler

@FoRward-999
Author

No, I still have this problem.

@AlejandraOliver

Ok, thanks!!

@allen0091

allen0091 commented May 7, 2023

Hello @FoRward-999. When I execute the command sudo clang -O2 -target -bpf -g -c mptcp_bpf_rr.c -o , I get this error:

mptcp_bpf_rr.c:5:10: fatal error: 'bpf_tcp_helpers.h' file not found
#include "bpf_tcp_helpers.h"

The bpf_tcp_helpers.h file is in ~/mptcp_net-next/tools/testing/selftests/bpf. How can I solve it?

@FoRward-999
Author

Hello @allen0091
I think you should put mptcp_bpf_rr.c and bpf_tcp_helpers.h under the same directory.
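Instead of copying files around, the directory holding the header can be passed to clang with -I. A minimal sketch, assuming the header sits somewhere under the selftests tree mentioned above; the function name header_dir is hypothetical.

```shell
# header_dir ROOT NAME: print the directory under ROOT that contains the
# file NAME (first match only); prints nothing if it is not found.
# Hypothetical helper for building an -I flag.
header_dir() {
    hit=$(find "$1" -name "$2" -type f | head -n 1)
    [ -n "$hit" ] && dirname "$hit"
}
```

For example: clang -O2 -g -target bpf -I"$(header_dir ~/mptcp_net-next/tools/testing/selftests/bpf bpf_tcp_helpers.h)" -c mptcp_bpf_rr.c -o mptcp_bpf_rr.o. Other headers may still be missing after that, so this is a starting point rather than a complete build recipe.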

@geliangtang added the bpf sched packets scheduler labels Aug 4, 2023
matttbe pushed a commit that referenced this issue Jul 10, 2024
Add a test case which replaces an active ingress qdisc while keeping the
miniq in-tact during the transition period to the new clsact qdisc.

  # ./vmtest.sh -- ./test_progs -t tc_link
  [...]
  ./test_progs -t tc_link
  [    3.412871] bpf_testmod: loading out-of-tree module taints kernel.
  [    3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #332     tc_links_after:OK
  #333     tc_links_append:OK
  #334     tc_links_basic:OK
  #335     tc_links_before:OK
  #336     tc_links_chain_classic:OK
  #337     tc_links_chain_mixed:OK
  #338     tc_links_dev_chain0:OK
  #339     tc_links_dev_cleanup:OK
  #340     tc_links_dev_mixed:OK
  #341     tc_links_ingress:OK
  #342     tc_links_invalid:OK
  #343     tc_links_prepend:OK
  #344     tc_links_replace:OK
  #345     tc_links_revision:OK
  Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240708133130.11609-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
matttbe pushed a commit that referenced this issue Nov 22, 2024
Add three tests for struct_ops using private stack.
  ./test_progs -t struct_ops_private_stack
  #336/1   struct_ops_private_stack/private_stack:OK
  #336/2   struct_ops_private_stack/private_stack_fail:OK
  #336/3   struct_ops_private_stack/private_stack_recur:OK
  #336     struct_ops_private_stack:OK

The following is a snippet of a struct_ops check_member() implementation:

	u32 moff = __btf_member_bit_offset(t, member) / 8;
	switch (moff) {
	case offsetof(struct bpf_testmod_ops3, test_1):
		prog->aux->priv_stack_requested = true;
		prog->aux->recursion_detected = test_1_recursion_detected;
		fallthrough;
	default:
		break;
	}
	return 0;

The first test is with nested two different callback functions where the
first prog has more than 512 byte stack size (including subprogs) with
private stack enabled.

The second test is a negative test where the second prog has more than 512
byte stack size without private stack enabled.

The third test is the same callback function recursing itself. At run time,
the jit trampoline recursion check kicks in to prevent the recursion. The
recursion_detected() callback function is implemented by the bpf_testmod,
the following message in dmesg
  bpf_testmod: oh no, recursing into test_1, recursion_misses 1
demonstrates the callback function is indeed triggered when recursion miss
happens.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20241112163938.2225528-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>