CI: Test against kernel 5.18 #668
How urgent is this? |
We can dump the BTF to figure out what types this refers to:
We think the instruction should reference a
From the last line we know that applying the fixup at offset 302 fails. The BTF relocation
At this point it looks like the encoded instructions and the encoded CO-RE relocations disagree. Maybe this is related to which tool we're using to do the linking? An interesting experiment would be for @ti-mo to rebuild the 5.15 selftests on his machine and then run the unit test against that. If that changes the outcome, it's a tooling issue; if not, it might be a bug in 5.17. Another thing to try is running the same example via libbpf and looking at the output it generates for this particular test case.
Diff for my quick hack:
diff --git a/internal/btf/ext_info.go b/internal/btf/ext_info.go
index 2c0e1af..e1c9996 100644
--- a/internal/btf/ext_info.go
+++ b/internal/btf/ext_info.go
@@ -124,6 +124,7 @@ func (ei *ExtInfos) Assign(insns asm.Instructions, section string) {
}
if len(reloInfos) > 0 && reloInfos[0].offset == iter.Offset {
+ fmt.Println("offset", iter.Offset, iter.Ins, reloInfos[0].relo, "(into metadata)")
iter.Ins.Metadata.Set(coreRelocationMeta{}, reloInfos[0].relo)
reloInfos = reloInfos[1:]
}
@@ -610,6 +611,10 @@ type CORERelocation struct {
kind coreKind
}
+func (cr *CORERelocation) String() string {
+ return fmt.Sprintf("%s of %s (%s)", cr.kind, cr.typ, cr.accessor)
+}
+
func CORERelocationMetadata(ins *asm.Instruction) *CORERelocation {
relo, _ := ins.Metadata.Get(coreRelocationMeta{}).(*CORERelocation)
return relo
@@ -653,6 +658,7 @@ func newRelocationInfos(brs []bpfCORERelo, ts types, strings *stringTable) ([]co
if err != nil {
return nil, fmt.Errorf("offset %d: %w", br.InsnOff, err)
}
+ fmt.Println("offset", relo.offset, relo.relo, "local type id", br.TypeID, "(from ext info)")
rs = append(rs, *relo)
}
sort.Slice(rs, func(i, j int) bool {
@@ -713,7 +719,7 @@ func parseCOREReloRecords(r io.Reader, bo binary.ByteOrder, recordSize uint32, r
// ELF tracks offset in bytes, the kernel expects raw BPF instructions.
// Convert as early as possible.
relo.InsnOff /= asm.InstructionSize
-
+ fmt.Println("offset", relo.InsnOff, relo.Kind, relo.TypeID, "(from wire)")
out = append(out, relo)
}
diff --git a/linker.go b/linker.go
index 60cb7a6..723077c 100644
--- a/linker.go
+++ b/linker.go
@@ -117,11 +117,14 @@ func findReferences(progs map[string]*ProgramSpec) error {
func applyRelocations(insns asm.Instructions, local, target *btf.Spec) error {
var relos []*btf.CORERelocation
var reloInsns []*asm.Instruction
+ var reloOffsets []asm.RawInstructionOffset
iter := insns.Iterate()
for iter.Next() {
if relo := btf.CORERelocationMetadata(iter.Ins); relo != nil {
+ fmt.Println("offset", iter.Offset, iter.Ins, relo, "(from metadata)")
relos = append(relos, relo)
reloInsns = append(reloInsns, iter.Ins)
+ reloOffsets = append(reloOffsets, iter.Offset)
}
}
@@ -144,6 +147,7 @@ func applyRelocations(insns asm.Instructions, local, target *btf.Spec) error {
for i, fixup := range fixups {
if err := fixup.Apply(reloInsns[i]); err != nil {
+ fmt.Println("offset", reloOffsets[i], reloInsns[i], relos[i], "(apply)")
return fmt.Errorf("apply fixup %s: %w", &fixup, err)
}
}
|
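The comment in the diff above about converting ELF byte offsets into raw BPF instruction offsets can be sketched as follows. This is a minimal illustration, not the library's code: `insnOffset` and `instructionSize` are hypothetical names, the 8-byte size is the size of a single raw BPF instruction, and the input 2416 is made up so that the result lines up with the offset 302 mentioned earlier.

```go
package main

import "fmt"

// instructionSize is the size of one raw BPF instruction in bytes.
// (Hypothetical constant for this sketch.)
const instructionSize = 8

// insnOffset converts a byte offset, as recorded for a CO-RE relocation
// in the ELF, into a raw instruction offset. Hypothetical helper.
func insnOffset(byteOff uint32) (uint32, error) {
	if byteOff%instructionSize != 0 {
		return 0, fmt.Errorf("byte offset %d is not a multiple of %d", byteOff, instructionSize)
	}
	return byteOff / instructionSize, nil
}

func main() {
	// 2416 is an illustrative value, not taken from the failing test.
	off, err := insnOffset(2416)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("raw instruction offset:", off) // prints "raw instruction offset: 302"
}
```

Rejecting unaligned offsets instead of silently truncating them is a design choice of this sketch; the diff itself simply divides.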
@lmb thanks for the insights! Will take a closer look in a bit. |
I re-ran the tests against 5.17 that was built with ci-kernels-builder and the result stays the same! |
Updated this for 5.18, the error still persists. Also tried updating pahole in the build container to https://packages.debian.org/bullseye-backports/pahole but that didn't change things either. |
Wow, really interesting failure on go1.17:
This CL shipped in Go 1.18: https://go-review.googlesource.com/c/go/+/375216. Since .bss is a 'virtual' section, it isn't actually allocated in the ELF (nothing shocking); it always shares an offset with another section, e.g.:
(offsets are hex) Since
In conclusion, in Go 1.17 and earlier, reading NOBITS sections actually gets you the bytes of another section that sits at the same offset.
We should probably mimic this behaviour in the lib as well; there is no reason to try and read NOBITS sections to begin with. Maybe we don't even need to allocate + populate their MapSpec.Contents, but I'd need to refresh my memory on the relo handling. @lmb WDYT? |
Turns out they do get loaded into the kernel and even get read from bytecode, although all at index 0. Seems wasteful to create large maps for no reason, but that's for the compiler to decide. For now, this should fix it: #740. |
Signed-off-by: Robin Gögge <r.goegge@gmail.com>
5.18 added two new BTF kinds, TYPE_TAG and DECL_TAG, which we don't support at the moment. See cilium#713
Trying to load the ELF gives the following error:
elf_reader_test.go:665: Error during loading: program trace_netif_receive_skb: apply CO-RE relocations: apply fixup target_type_id=67->16253: invalid immediate 73, expected 67 (fixup: target_type_id=67->16253)
After some cursory digging this doesn't seem to be a bug in the library but maybe one in either pahole or clang. See cilium#739
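One way to read the error message above: before patching, a fixup checks that the instruction's immediate still holds the local value it expects, and refuses to rewrite it otherwise. The sketch below illustrates that reading only. The `instruction` and `fixup` types and the `apply` method are hypothetical stand-ins, not the library's implementation; the numbers are the ones from the error.

```go
package main

import "fmt"

// instruction is a stand-in for a BPF instruction; only the immediate
// matters for this sketch.
type instruction struct {
	Imm int64
}

// fixup describes rewriting a local value into a target value.
// Hypothetical type for illustration.
type fixup struct {
	local, target uint32
}

// apply patches ins only if its immediate matches the expected local value.
func (f fixup) apply(ins *instruction) error {
	if ins.Imm != int64(f.local) {
		return fmt.Errorf("invalid immediate %d, expected %d", ins.Imm, f.local)
	}
	ins.Imm = int64(f.target)
	return nil
}

func main() {
	f := fixup{local: 67, target: 16253}

	// The instruction and the relocation agree: the immediate is rewritten.
	good := &instruction{Imm: 67}
	if err := f.apply(good); err != nil {
		fmt.Println("unexpected:", err)
	}
	fmt.Println("patched immediate:", good.Imm)

	// The instruction and the relocation disagree, as in the failing test.
	bad := &instruction{Imm: 73}
	if err := f.apply(bad); err != nil {
		fmt.Println("error:", err)
	}
}
```

Under this reading, the failure means the instruction stream and the relocation records were generated with different views of the local type IDs.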
Finally! |
After cilium/ci-kernels#24, now test the library against kernel 5.17.
@lmb @joamaki There's one remaining failing selftest:
Could this be a bug on our side? Nothing was changed in netif_receive_skb.c in over a year, so I'm not sure why this broke now. fyi @rgo3