-
Notifications
You must be signed in to change notification settings - Fork 305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S390x kpatch support #1203
S390x kpatch support #1203
Conversation
@sumanthkorikkar : Hi Sumanth, thanks for posting. I'll try to give it a spin next week. In the meantime, two quick questions:
|
Hi Joe, Thank you.
|
The gcc patches are back portable from gcc11. Thanks. |
Great thanks for the update. FWIW, I tried with the stock rhel-alpha gcc and was able to build about 60% of the integration tests successfully. I haven't tried with an updated / backported gcc just yet, but I think that will fix the remaining cases. By the way, if IBM can backport those commits to the upstream gcc-11 branch, they should be included as part of the rhel-9.1 gcc update. Quick note about the github CI -- it ran |
Hi Sumanth, I had better luck with the integration tests from this branch (https://github.com/joe-lawrence/kpatch/tree/integration-tests-kernel-5.14.0-0.rc3.29.el9) and the gcc backports that you mentioned. I did not try rebuilding the kernel with that version of gcc, so these may be false reports, but I found problems with two of the tests: data_read_mostly.patch
meminfo-string.patch
I did try putting the second one into gdb, and I think there may be something suspect with create-diff-object's |
Hi Joe, Thank you. I will check these cases. |
5e1d81c
to
5f00359
Compare
V2 changes:
Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sumanthkorikkar , I haven't found much time to review the PR code, but I listed a few nits and questions when taking a quick look. I'll follow up with some more integration test results...
kpatch-build/create-diff-object.c
Outdated
log_debug("lookup for %s: obj=%s sympos=%lu size=%lu", | ||
sym->name, symbol.objname, symbol.sympos, | ||
log_debug("lookup for %s: obj=%s sympos=%lu size=%lu\n", | ||
sym->name, symbol.objname, symbol.sympos, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: adding \n
is good, but the second line introduces spaces (in front of sym->name
) instead of a tab
kpatch-build/create-diff-object.c
Outdated
* They can be used as normal relas. | ||
*/ | ||
if (strstr(rela->sym->name, "s390_indirect_jump")) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there instances where s390_indirect_jump
relocation would have rela->need_dynrela
set? FWIW, when building the integration tests, I didn't see any examples where this happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. You are right. I removed it.
kpatch-build/kpatch-elf.c
Outdated
@@ -193,7 +211,10 @@ void kpatch_create_rela_list(struct kpatch_elf *kelf, struct section *sec) | |||
ERROR("could not find rela entry symbol\n"); | |||
if (rela->sym->sec && | |||
(rela->sym->sec->sh.sh_flags & SHF_STRINGS)) { | |||
rela->string = rela->sym->sec->data->d_buf + rela->addend; | |||
if(lc_string(rela)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: whitespace: s/if(/if (/
Hi @sumanthkorikkar , I have been busy working on rebasing the integration tests for RHEL-9, which will be based on v5.14. It's been slow since I've been fighting a few x86_64 and ppc64le problems at the same time, but I think I'm getting close to finishing. I setup three branches:
With integration-tests-kernel-5.14.0-1.el9+workarounds+s390x, I can build all of the integration patches and load all but one with: Two interesting things to report: 1 - bug-table-section.patch, gcc-isra.patch.build, gcc-mangled-3.patch, macro-callbacks.patch all build, but when inspecting their
However, kpatch-build produces a .ko and doesn't report failure. I don't believe this behavior is unique to this merge request, but we should definitely investigate why 2 - bug-table-section.patch crashes on load once executing replacement code:
Let me know if you would like a vmlinux/ vmcore for this one. |
Hi @joe-lawrence, Thank you. I am able to reproduce livepatch_bug_table_section crash in one of my machines. I will check these issues. |
c01916b
to
35423a6
Compare
Hi @joe-lawrence, v3 Changes:
Thank you |
35423a6
to
8941d24
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sumanthkorikkar, thanks for updating, this PR is coming along nicely: I can build all and load all of the integration tests in integration-tests-kernel-5.14.0-1.el9+workarounds with the latest set of changes.
See review comments in the code, but some higher level PR feedback:
- I opened kpatch-build: ignores ERROR() from create-kpatch-module and create-klp-module #1223 and suggest that you spin off the bug fix for ("kpatch-build: Handle error in create-klp-module") into its own PR. This will separate the s390x-specific work from incidental bugfixes along the way.
- Before we merge, shouldn't ("Enable kpatch build support for s390x" to end of list") move to the back of the commit order? (Tacking them onto the end is good for review, but once we merge into the tree, we can flip on support as the final step.)
- One additional level of testing we haven't talked about yet are the unit tests. Take a look at https://github.com/dynup/kpatch-unit-test-objs to see the git submodule that lives under this project's
test/unit/objs
. @sm00th : what kind of arch-specific unit tests would you like to see when we add s390x support?
@@ -22,7 +22,7 @@ GCC_PLUGINS_DIR := $(shell gcc -print-file-name=plugin) | |||
PLUGIN_CFLAGS := $(filter-out -Wconversion, $(CFLAGS)) | |||
PLUGIN_CFLAGS += -shared -I$(GCC_PLUGINS_DIR)/include \ | |||
-Igcc-plugins -fPIC -fno-rtti -O2 -Wall | |||
else | |||
else ifneq ($(ARCH),s390x) | |||
$(error Unsupported architecture ${ARCH}, check https://github.com/dynup/kpatch/#supported-architectures) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on how confident we feel after this PR, we could update the referenced README.md #supported-architectures
section to include s390x. Or would you prefer we merge and test internally before advertising support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this is merged, we can update s390x as supported in README.md.
kpatch-build/kpatch-elf.c
Outdated
data = groupsec->data->d_buf; | ||
end = groupsec->data->d_buf + groupsec->data->d_size; | ||
data++; /* skip first flag word (e.g. GRP_COMDAT) */ | ||
gsec = malloc(sizeof(*gsec)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be the only one who resorts to using Valgrind now and then to debug problems in this space, but can we clean up these new allocations. No need to worry about all the error path scenarios, but at least successful execution will run kpatch_elf_teardown()
on its way out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. Will do the cleanup of allocations.
kpatch-build/kpatch-elf.c
Outdated
continue; | ||
data = sec->data->d_buf; | ||
end = sec->data->d_buf + sec->data->d_size; | ||
newdata = malloc(sec->data->d_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise for these allocations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleanup of this allocation performed in kpatch_free_groupsec.
kpatch-build/kpatch-elf.c
Outdated
sec->sh.sh_info = tsym->index; | ||
data++; | ||
} | ||
sec->data->d_buf = newdata; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think this leaks the previous sec->data->d_buf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldnt free it (Elf_Data). Modifying the memory referenced by sec->data->d_buf or Freeing this results in segfault.
kpatch-build/kpatch-build
Outdated
@@ -1149,6 +1155,7 @@ else | |||
rm -f checksum.tmp | |||
"$TOOLSDIR"/create-kpatch-module "$TEMPDIR"/patch/tmp_output.o "$TEMPDIR"/patch/output.o 2>&1 | logger 1 | |||
check_pipe_status create-kpatch-module | |||
test "$rc" -ne 0 && die "create-kpatch-module: exited with return code: $rc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See note in review summary about pulling this out into its own pull request. Also, just a style nit, but the rest of the script uses [[ $rc -ne 0 ]] && ...
type syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok sure. I will send out a separate pull request for this
kpatch-build/create-klp-module.c
Outdated
@@ -502,6 +535,8 @@ int main(int argc, char *argv[]) | |||
kpatch_create_shstrtab(kelf); | |||
kpatch_create_strtab(kelf); | |||
kpatch_create_symtab(kelf); | |||
kpatch_dump_kelf(kelf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed for future debug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I wanted to print the final dump coming out of create-klp-module. For debugging, It was beneficial to me. However, we could also move it to the end.
We can start off with the bulk of generic tests we have for x86 and add specific stuff later on as we encounter arch-specific issues. |
b44b2f8
to
c6d8b56
Compare
Hi @joe-lawrence, v4 changes:
Thank you |
I've committed the GCC patches mentioned in #1203 (comment) to the https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580562.html |
Cool, thanks for the update, Ilya. Support for s390x is coming together nicely 👍 As far as the current outstanding kernel work is concerned, I think we're looking for PeterZ's recent patchset to help the large / idle cpu cause: [PATCH v2 00/11] sched,rcu,context_tracking,livepatch: Improve livepatch task transitions for idle and NOHZ_FULL Any kernel updates, aside from torvalds/linux@7561c14d8a4d, the already merged linker script update? |
This commit is still dirty, and is jumbling cut & pasted referring to dynup#1010 and dynup#1203.
This commit is still dirty because I'm jumbling cut & pasted referring to dynup#1010 and dynup#1203.
Sorry that should have read, "Are there any other post v5.14 kernel updates..." How about these additional patches from [PATCH 0/2] s390/ftrace: implement hotpatching: |
@sumanthkorikkar : Regarding To test this out, try building: It should introduce about a half-dozen line number changes at all of the A few other notes:
For s390x, I don't see an analog to the kernel's tools/arch/x86/lib/insn code... so we may need to create something similar or stripped down. (If nothing exists, we could probably task an intern with collating something out of the z/Architecture Reference Summary) |
symtab_read tries to skip '.dynsym' symbol table and only read '.symtab' symbol table. Newer readelf from binutils 2.37 now adds section names (see the diff): --- vmlinux.symtab 2022-02-18 02:10:06.691220932 +0100 +++ vmlinux.symtab.new 2022-02-18 01:16:06.161210458 +0100 Symbol table '.dynsym' contains 1541 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000100000 0 SECTION LOCAL DEFAULT 1 .text 2: 00000000017a3ac0 4 OBJECT GLOBAL DEFAULT 19 sclp_console_pages Symbol table '.symtab' contains 159980 entries: Num: Value Size Type Bind Vis Ndx Name - 41: 0000000001a93600 0 SECTION LOCAL DEFAULT 41 - 42: 0000000001a9c678 0 SECTION LOCAL DEFAULT 42 ... + 41: 0000000001a93600 0 SECTION LOCAL DEFAULT 41 .dynsym + 42: 0000000001a9c678 0 SECTION LOCAL DEFAULT 42 .rela.dyn ... 54: 0000000000000000 0 FILE LOCAL DEFAULT ABS main.c Simple matching of ".dynsym" in the line buffer is not enough anymore, because it hits not just Symbol table '.dynsym' contains 1541 entries: line, but also 41: 0000000001a93600 0 SECTION LOCAL DEFAULT 41 .dynsym skipping the rest of the file and leading to an error: create-diff-object: ERROR: *.o: find_local_syms: 189: couldn't find matching *.c local symbols in vmlinux symbol table Limit matching only to lines containing "Symbol table" header. This works with readelf from the binutils, as well as readelf from elfutils (its output looks slightly different). Symbol table [41] '.dynsym' contains 1541 entries: Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
1. -mno-pic-data-is-text-relative prevents relative addressing between code and data. This is needed to avoid relocation error when klp text and data are too far apart 2. Avoid generation of LANCHOR symbols through -fno-section-anchors. kpatch-build does not handle it well. Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
* Add s390 specific checks * Identify patchable functions. * Dont mark expolines as dynrelas. These expolines are always included in final kernel module. This ensures that expoline functions and the kernel itself are not too far apart and avoids out of range relocation. However, this isnt a problem for other functions, as these relocations are performed via R_390_PLT32DBL using gcc option -mno-pic-data-is-text-relative. * s390 maintains expoline tables to locate the expoline thunks. If needed, the module loader could later replace these expoline thunks with normal indirect branch. Each element in the expoline table is of 4 bytes. If there is a changed function in rela.s390_return*, then mark that specific rela symbol as included. This is already performed in the processing of special sections. Hence include it. Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Add object exclusion for s390 Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
1. static const struct inet_sock fake_sk = { /* makes ip6_route_output set RT6_LOOKUP_F_IFACE: */ .sk.sk_bound_dev_if = 1, .pinet6 = (struct ipv6_pinfo *) &fake_pinfo, }; gcc can place fake_sk in .data.rel.ro.local: ndx 38, data 0x3ffb3280a58, size 960, name .data.rel.ro.local.fake_sk.1 ndx 39, data 0x3ffb32be5e8, size 24, name .rela.data.rel.ro.local.fake_sk.1 2. static LIST_HEAD(patch_objects); gcc can place patch_objects relocation in .data.rel.local: sym 56, type 1, bind 0, ndx 34, name patch_objects -> .data.rel.local Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
dd8a49a
to
65a8f92
Compare
Thank you for the review. Rebased the changes and tested. |
- gcc-mirror/gcc@8753b13 IBM Z: fix section type conflict with -mindirect-branch-table | ||
|
||
**Prerequisite kernel patches:** | ||
- 69505e3d9a39 bug: Use normal relative pointers in 'struct bug_entry' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most likely this will end up in 5.19, so it's probably safe to add a v5.19 header for this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
* Add s390 as supported. * Add backporting information for the distros. Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
65a8f92
to
48d997f
Compare
@sumanthkorikkar Thanks for your patience! This looks good to me. |
I know there are still some upstream kernel patches that need to make their way down to distro kernels, but shall we merge this now before the main branch drifts any further? @sumanthkorikkar @jpoimboe |
@joe-lawrence do we have the integration testing CI setup for testing s390 with a new enough kernel/toolchain? I think that would be helpful, maybe a separate issue can be opened to track that (for whatever "tracking" means considering our unfortunately large number of neglected open issues). But we at least have unit tests, and these patches are very nicely self contained, so yeah, I think it would be a good idea to go ahead and merge this now before it starts getting conflicts again. |
Yes, Looking forward to get this into the main branch. Thanks |
Compiler support should be available, we're still waiting on the external expolines MR and a subsequent ftrace backport (probably targeting 9.2 at this point). Since CI is run internally, we can always add it to one of our TODO lists. |
Great, thanks for all your work on this functionality. Regarding kernel backports, I posted this question [1] the other day. It popped up as we noticed a missing expolines.o when building the tools/testing/selftests/bpf/bpf_testmod.ko (which builds as a sort of OOT module). [1] https://lore.kernel.org/linux-s390/8569429d-57f8-a0cf-8b17-1291f6973d32@redhat.com/T/#u |
Noted. Thanks. We are looking into that. |
The CI could be on a v5.18 kernel which I think has all the prereqs. |
@sm00th : have we ever setup a CI instance running w/upstream kernel? |
The beaker jobs have |
Any update on this pull request? Is anything blocking this from being merged? |
Nothing more to do for this PR, however we would like to add s390x support to our internal integration tests -- these automatically run whenever a PR is posted or updated and when the master branch is updated. Unfortunately the rhel kernel has not yet backported all the needed kernel patches (internal build/packaging is blocked on [1]), so that means we need to target an upstream v5.18 kernel. For that, the integration tests needed to be rebased [2] and tested for all three arches (x86_64, ppc64le, s390x). While doing so, I regenerated the unit test object files [3] so that they now include the external expolines. There is a bit more to it (bumping the unit test submodule reference, updating our internal integration test jobs, etc.) that are independent from this PR, but I was hoping to line all that up to run when we merge it. If the stars don't align by next week, I'll just merge the PR and manually kick off those internal jobs. [1] https://lore.kernel.org/linux-s390/8569429d-57f8-a0cf-8b17-1291f6973d32@redhat.com/T/#u |
5.18 integration tests merged, let's get this one in. I'll figure out internal testing later. |
Merge pull request dynup#1203 from sumanthkorikkar/s390x-kpatch-support
Upstream-status: RHEL-only Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2072713 s390x support in upstream kpatch tooling will have a requirement on thunk-extern [1] as "external" expoline code will simplify binary ELF comparison and generation of kpatch kernel modules. Turning on CONFIG_EXPOLINE_EXTERN will place expolines in arch/s390/lib/expoline.o, which must be available to any out-of-tree module build. Ship the file in the -devel package if it is found. [1] dynup/kpatch#1203 (comment) Signed-off-by: C. Erastus Toe <ctoe@redhat.com>
Hi All,
Add s390x support for kpatch
Prerequisite gcc patches:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a1c1b7a888ade6f21bc7c7f05a2cbff290273fcc
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0990d93dd8a4268bff5bbe48aa26748cf63201c7
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=935b5226c385e34088c314374cbbe9e4995b9e44
Prerequisite kernel patches:
https://lore.kernel.org/live-patching/YObPhPkzRSqnzgK3@alley/T/#me6c2fcdbd4f582d5cc5c594bafb083ba814acdf9 - Yet to be upstream.
https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=for-next&id=7561c14d8a4d1a24a40b1839d927d488e2d6345a
Recommended Patches:
torvalds/linux@de5012b
torvalds/linux@67ccddf
Tested on kernel 5.14:
KPATCHBUILD_OPTS="-v /home/extend/vmlinux -s /home/extend/linux-devel/linux/ -d" ./kpatch-test -d linux-5.14.0/ -q
build: combined module
load test: combined module
SUCCESS
Let me know your suggestions.
Thank you.
--
Sumanth