Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bolt] fix a wrong relocation update issue with weak references #69136

Merged
merged 1 commit into from
Aug 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 10 additions & 1 deletion bolt/lib/Rewrite/RewriteInstance.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2067,6 +2067,14 @@ bool RewriteInstance::analyzeRelocation(
if (!Relocation::isSupported(RType))
return false;

auto IsWeakReference = [](const SymbolRef &Symbol) {
Expected<uint32_t> SymFlagsOrErr = Symbol.getFlags();
if (!SymFlagsOrErr)
return false;
return (*SymFlagsOrErr & SymbolRef::SF_Undefined) &&
(*SymFlagsOrErr & SymbolRef::SF_Weak);
};

const bool IsAArch64 = BC->isAArch64();

const size_t RelSize = Relocation::getSizeForType(RType);
Expand Down Expand Up @@ -2098,7 +2106,8 @@ bool RewriteInstance::analyzeRelocation(
// Section symbols are marked as ST_Debug.
IsSectionRelocation = (cantFail(Symbol.getType()) == SymbolRef::ST_Debug);
// Check for PLT entry registered with symbol name
if (!SymbolAddress && (IsAArch64 || BC->isRISCV())) {
if (!SymbolAddress && !IsWeakReference(Symbol) &&
(IsAArch64 || BC->isRISCV())) {
const BinaryData *BD = BC->getPLTBinaryDataByName(SymbolName);
SymbolAddress = BD ? BD->getAddress() : 0;
}
Expand Down
34 changes: 34 additions & 0 deletions bolt/test/AArch64/update-weak-reference-symbol.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
// This test checks whether BOLT can correctly handle relocations against weak symbols.

// RUN: %clang %cflags -Wl,-z,notext -shared -Wl,-q %s -o %t.so
// RUN: llvm-bolt %t.so -o %t.so.bolt
// RUN: llvm-nm -n %t.so.bolt > %t.out.txt
// RUN: llvm-objdump -dj .rodata %t.so.bolt >> %t.out.txt
// RUN: FileCheck %s --input-file=%t.out.txt

# CHECK: w func_1
# CHECK: {{0+}}[[#%x,ADDR:]] W func_2

# CHECK: {{.*}} <.rodata>:
# CHECK-NEXT: {{.*}} .word 0x00000000
# CHECK-NEXT: {{.*}} .word 0x00000000
# CHECK-NEXT: {{.*}} .word 0x{{[0]+}}[[#ADDR]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a discussion, another thing Maksim pointed out is that this actually needs to be zero too, because this part is resolved by dynamic linker at runtime. I've checked and the pre-bolt binary is indeed zeroed here. Currently, BOLT makes this non-zero because we're processing dynamic relocs for aarch64, for the reasons I discussed above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that before it is 0 here. But it is up to the linker to set value here or not. and for example to support RELR we need to set values here. I don't think it is a problem and would rather set the value, than not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that before it is 0 here. But it is up to the linker to set value here or not. and for example to support RELR we need to set values here. I don't think it is a problem and would rather set the value, than not.

Can we rely on the linker's decision in such cases? I.e., if the linker decided to put 0, keep it at that. If it was the symbol value, then we can update it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that before it is 0 here. But it is up to the linker to set value here or not. and for example to support RELR we need to set values here. I don't think it is a problem and would rather set the value, than not.

Can we rely on the linker's decision in such cases? I.e., if the linker decided to put 0, keep it at that. If it was the symbol value, then we can update it.

Theoretically we can. But is there a reason behind it? It doesn't affect runtime. And to be honest I prefer to set the values, we can easily see the value during objdump, sometimes it is useful.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yota9 why should BOLT change this? Isn't leaving it to linker the right thing to do and also less risky in terms of correctness?

Copy link
Member

@yota9 yota9 Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@s-dag Currently I don't see any risks. If there would be some troubles I would try to handle such situations in separate patch, but right now I didn't see binaries where we should worry about these changes. Anyway this question is out of scope of this patch.

# CHECK-NEXT: {{.*}} .word 0x00000000

.text
.weak func_2
.weak func_1
.global wow
.type wow, %function
wow:
bl func_1
bl func_2
ret
.type func_2, %function
func_2:
ret
.section .rodata
.LC0:
.xword func_1
.LC1:
.xword func_2