-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ThinLTO] Missing vtable COMDAT symbol after r350948 #39760
Comments
Do you have a reproducer you can share? Any chance it is using a mix of old and new bitcode? Note that the title of this bug and the description are referencing the wrong revisions - they should be r35094[89] and not r34094[89]. Let me adjust the title... |
Are you referring to a linkrepro? Or would you need to start from scratch with cpp files? (Not sure whether this happens in the compiler or linker) I used clobber builds for my bisection so there shouldn't be any mix of bitcodes.
You're right: r35094[89]. Though the title still has a typo :) Adjusting again. |
Just the link should be sufficient.
Ugh, thanks. |
Just a guess, looking at the source code. The vtable will be local since it is defined in an anonymous namespace. If ThinLTO ends up exporting a reference, it will need to promote/rename this symbol. And COFF apparently requires that each COMDAT retains a symbol of the same name. With splitting enabled, we would hit the code in the ThinLTOBitcodeWriter that handles this for COFF when promotion/renaming is required by the module splitting (see https://reviews.llvm.org/D31963 which fixed it there). My guess is that we need to do something similar even when we aren't splitting. |
Sent a Google Drive link to a linkrepro archive. |
Thanks, reproduced it. Will try to root cause it today. |
In the meantime, can you give your build a try with -fsplit-lto-unit and confirm that it succeeds? That should restore the previous behavior, and you can use it as a workaround for now. |
This theory is borne out by the save temp files in the LTO backend. Going into the backend for ucol_res.obj: $"??_7KeywordsSink@?A0x23271001@@6b@" = comdat any @anon.1f1d01ebea411e7039392aebde7ed70e.5 = private unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* bitcast (%rtti.CompleteObjectLocator* @"??_R4KeywordsSink@?A0x23271001@@6b@" to i8*), i8* bitcast (i8* (%"struct.(anonymous namespace)::KeywordsSink", i32) @"??_GKeywordsSink@?A0x23271001@@UEAAPEAXI@Z" to i8*), i8* bitcast (i8* (%"class.icu_63::UObject") @"?getDynamicClassID@UObject@icu_63@@UEBAPEAXXZ" to i8*), i8* bitcast (void (%"struct.(anonymous namespace)::KeywordsSink", i8, %"class.icu_63::ResourceValue", i8, i32)* @"?put@KeywordsSink@?A0x23271001@@UEAAXPEBDAEAVResourceValue@icu_63@@CAEAW4UErrorCode@@@z" to i8*)] }, comdat($"??_7KeywordsSink@?A0x23271001@@6b@"), !type !4043, !type !4048, !type !4049 @"??_7KeywordsSink@?A0x23271001@@6b@" = internal unnamed_addr alias i8*, getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @anon.1f1d01ebea411e7039392aebde7ed70e.5, i32 0, i32 0, i32 1) And after promotion (because a reference to it is exported), the latter becomes: @"??_7KeywordsSink@?A0x23271001@@6b@.llvm.11791635078693174447" = hidden unnamed_addr alias i8*, getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @anon.1f1d01ebea411e7039392aebde7ed70e.5, i32 0, i32 0, i32 1) So now there is no value that matches the COMDAT name, resulting in an error for COFF. What we need to do to fix should be pretty straightforward - in the LTO backend when we promote/rename, if it is the name of the COMDAT we also rename the COMDAT accordingly. This would be similar to what D31963 was doing when promoting during LTO Unit splitting. This was an existing bug for COFF with ThinLTO. Interestingly, it would have been hit before except that note the above value has type metadata. This resulted in the splitting kicking in. I'm not sure the full set of conditions where we get type metadata, the places I was aware of in the past were when compiling with either -fsanitize or -fwhole-program-vtables, which comment #1 says is not the case here. |
By the way, I'm still testing our matrix of supported build configs, but -fsplit-lto-unit got me a successful build at least in the first configuration that I tried. Thanks for the tip. |
Just as another data point: by coincidence a colleague was attempting to enable -fsanitize=cfi and ran into a similar error message even with 7.0.1. That would seem to support your diagnosis. |
Actually with CFI enabled I would expect it to be more likely to take the path that hides the bug, so I'm not sure what this means. But in any case, I have a fix for the original reproducer. It is pretty simple but I didn't have a chance to tackle this until late yesterday. Including it below so you can try it out when you have a chance. I just need to add some comments and a test case before I send for review early next week. diff --git a/include/llvm/Transforms/Utils/FunctionImportUtils.h b/include/llvm/Transforms/Utils/FunctionImportUtils.h
diff --git a/lib/Transforms/Utils/FunctionImportUtils.cpp b/lib/Transforms/Utils/FunctionImportUtils.cpp
@@ -280,6 +285,14 @@ void FunctionImportGlobalProcessing::processGlobalsForThinLTO() {
} bool FunctionImportGlobalProcessing::run() { |
I applied the patch to 8.0.0rc1, and it looks good with the latest mozilla-central tree, at least with the original build configuration that prompted this bug. Should I try the CFI build too? |
Great!
Sure, let me know what happens. I sent the fix for review just now at D57395. |
obj file from CFI build LLVM ERROR: Associative COMDAT symbol '??$endl@DU?$char_traits@D@std@@@std@@YAAEAV?$basic_ostream@DU?$char_traits@D@std@@@0@AEAV10@@z' does not exist. The flags included: -fsanitize=cfi-icall This happened on a small helper binary so it can be reproduced with just the attached .obj file. Link with: lld-link -dll -nodefaultlib |
This turns out to be a separate instance of renaming provoking this issue. In this case, the module was split due to CFI, and no renaming/promotion was required since the comdat and the leader are in the same module. However, in LowerTypeTests.cpp a ".cfi" suffix is added to the function definition which is the comdat leader, provoking the same error. This is an existing issue unrelated to the one provoked by changing the splitting default. Probably needs a separate bug to track. You could cc peter@pcc.me.uk and myself. |
Sounds good; I opened bug 40540 for the CFI piece. |
Fixed in r352763. |
Reverted due to bot failures and recommitted with fix in r352770. Looks like it will stick this time. |
I've merged that to the llvm 8 branch in r352856. Please let me know if there are any follow-ups. |
Extended Description
After r350948, a Win64 Firefox ThinLTO build fails with:
LLVM ERROR: Associative COMDAT symbol '??_7KeywordsSink@?A0x23271001@@6b@' does not exist
(aka const
anonymous namespace'::KeywordsSink::
vftable')Notably it started failing at r350948 and not r340949. I would have thought the first patch of that pair was pretty innocuous.
Our flags do not include any of: -f[no]split-lto-unit, -fsanitize=cfi, or -fwhole-program-vtables.
In case it's a clue, the KeywordsSink class is only used within a single translation unit: https://searchfox.org/mozilla-central/search?q=KeywordsSink
I'm requesting this to be an 8.0.0 blocker as this build config worked for us in 7.0.1.
I can provide a linkrepro upon request though I'm not sure if that would be sufficient to investigate this. Let me know if I can help with the debug somehow. Thanks!
The text was updated successfully, but these errors were encountered: