You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
LLVM has an IR Mangler class, which has two responsibilities:
Adding underscore prefixes to symbols on some platforms (MachO & x86 Windows)
Mangling wacky Windows calling conventions (stdcall, vectorcall, fastcall)
I believe the first behavior is useful, because it makes simple LLVM IR more portable. It papers over a common portability issue for frontends. They can easily call the platform C library (main, malloc, free, printf, puts) with the same IR on all platforms. It makes writing generic IR tests for all platforms easier, since you don't have to change IR function names based on the target.
The second behavior, however, is very specific to Windows APIs, and doesn't help make target-neutral IR portable. I think it was a mistake to implement this behavior in LLVM. This should have always been a frontend responsibility. I think it became the IR mangler's responsibility to do it because, for fastcall, the leading Windows underscore changes to an @ symbol.
The calling convention mangling has a history of causing issues, the latest of which is D120382. For this reason, Clang implements the calling convention mangling in the frontend.
For reference, here are the three Windows calling convention manglings that get byte suffixes:
Assuming folks agree that this is a desirable direction, we should come up with a migration plan. I think the simplest thing to do is to add a static method to the Mangler class that accepts a calling convention, base name, and llvm::FunctionType and returns a string mangled name. Frontends can call this API directly if they need mangling when constructing LLVM IR. Clang handles this itself to handle corner cases involving incomplete types, which probably aren't a concern for non-C frontends.
The most straightforward way to prevent the IR mangler from prefixing these fastcall and vectorcall symbols with underscores would be to prefix them with the existing IR mangler escape, \01. This would presumably make IR from such a frontend incompatible with old IR for LTO purposes, but it matches clang's behavior today, so in some ways it is more compatible with the dominant existing IR producer.
Another way to handle this would be to extend the current mangler check for question mark prefixes to look for @ anywhere in the symbol. This is a behavior change for frontends that create symbols with @ in them, and any language that uses that character for its symbols will need to prefix such symbols with _ to avoid having an ABI break at the object file level. This would also break LTO with old bitcode since the IR symbols would change. However, we went through the same IR ABI break when we eliminated the \01 escape from all C++ symbols emitted by Clang, and I don't recall any user-reported issues, just challenges with updating IR test case expectations.
I think the first approach, \01 prefixing, is a better transition path. We can add the API (Mangler::mangleForCallingConvention(StringRef, DataLayout, CallingConv::ID, FunctionType)), announce the deprecation plan, migrate frontends, and remove the old mangler behavior one release later.
The text was updated successfully, but these errors were encountered:
Yes, that's right. I think it would be sufficient to also pass the AttributeList for the function prototype, which frontends have to provide eventually anyway. The other option is to provide an API that takes a fully formed llvm::Function, but then the frontend has to rename functions after they are created.
LLVM has an IR Mangler class, which has two responsibilities:
I believe the first behavior is useful, because it makes simple LLVM IR more portable. It papers over a common portability issue for frontends. They can easily call the platform C library (
main
,malloc
,free
,printf
,puts
) with the same IR on all platforms. It makes writing generic IR tests for all platforms easier, since you don't have to change IR function names based on the target.The second behavior, however, is very specific to Windows APIs, and doesn't help make target-neutral IR portable. I think it was a mistake to implement this behavior in LLVM. This should have always been a frontend responsibility. I think it became the IR mangler's responsibility to do it because, for fastcall, the leading Windows underscore changes to an
@
symbol.The calling convention mangling has a history of causing issues, the latest of which is D120382. For this reason, Clang implements the calling convention mangling in the frontend.
For reference, here are the three Windows calling convention manglings that get byte suffixes:
Assuming folks agree that this is a desirable direction, we should come up with a migration plan. I think the simplest thing to do is to add a static method to the Mangler class that accepts a calling convention, base name, and
llvm::FunctionType
and returns a string mangled name. Frontends can call this API directly if they need mangling when constructing LLVM IR. Clang handles this itself to handle corner cases involving incomplete types, which probably aren't a concern for non-C frontends.The most straightforward way to prevent the IR mangler from prefixing these fastcall and vectorcall symbols with underscores would be to prefix them with the existing IR mangler escape,
\01
. This would presumably make IR from such a frontend incompatible with old IR for LTO purposes, but it matches clang's behavior today, so in some ways it is more compatible with the dominant existing IR producer.Another way to handle this would be to extend the current mangler check for question mark prefixes to look for
@
anywhere in the symbol. This is a behavior change for frontends that create symbols with@
in them, and any language that uses that character for its symbols will need to prefix such symbols with_
to avoid having an ABI break at the object file level. This would also break LTO with old bitcode since the IR symbols would change. However, we went through the same IR ABI break when we eliminated the\01
escape from all C++ symbols emitted by Clang, and I don't recall any user-reported issues, just challenges with updating IR test case expectations.I think the first approach,
\01
prefixing, is a better transition path. We can add the API (Mangler::mangleForCallingConvention(StringRef, DataLayout, CallingConv::ID, FunctionType)
), announce the deprecation plan, migrate frontends, and remove the old mangler behavior one release later.The text was updated successfully, but these errors were encountered: