Move more of rustc_llvm to upstream LLVM #46437

Open
alexcrichton opened this issue Dec 1, 2017 · 9 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. S-tracking-impl-incomplete Status: The implementation is incomplete. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Member

alexcrichton commented Dec 1, 2017

In general we try to use the LLVM C API whenever we can as it's generally nice and stable. It also has the great benefit of being maintained by LLVM so it tends to never be a pain point when upgrading LLVM! Unfortunately though LLVM's C API isn't 100% comprehensive and we often need functionality above and beyond what you can do with just C.

For this custom functionality we typically use the C++ API of LLVM and compile our own shims, which in turn expose a C API. At the time of this writing all of the C++ to C shims are located in the src/rustllvm directory across three main files: ArchiveWrapper.cpp, PassWrapper.cpp, and RustWrapper.cpp. These files are all compiled via build.rs around here, where we basically use llvm-config to guide us in how to compile those files.
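As a rough illustration of what such a C++ to C shim looks like, here is a minimal sketch. It does not use LLVM at all: `demo::SymbolTable` and every `Demo*` name are made up for this example and only stand in for a C++-only class and the C functions wrapped around it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a class that only has a C++ API (hypothetical, not an LLVM type).
namespace demo {
class SymbolTable {
public:
  void add(std::string Name) { Names.push_back(std::move(Name)); }
  std::size_t size() const { return Names.size(); }
  const std::string &at(std::size_t I) const { return Names.at(I); }

private:
  std::vector<std::string> Names;
};
} // namespace demo

// Opaque handle: the only thing a C (or Rust FFI) caller ever sees.
typedef struct DemoOpaqueSymbolTable *DemoSymbolTableRef;

// Converts the opaque handle back into the C++ object.
static demo::SymbolTable *unwrap(DemoSymbolTableRef T) {
  assert(T && "null handle passed to shim");
  return reinterpret_cast<demo::SymbolTable *>(T);
}

// The shim itself: plain functions with C linkage that forward to C++.
extern "C" DemoSymbolTableRef DemoSymbolTableCreate() {
  return reinterpret_cast<DemoSymbolTableRef>(new demo::SymbolTable());
}

extern "C" void DemoSymbolTableAdd(DemoSymbolTableRef T, const char *Name) {
  unwrap(T)->add(Name);
}

extern "C" std::size_t DemoSymbolTableSize(DemoSymbolTableRef T) {
  return unwrap(T)->size();
}

// The returned pointer stays valid until the table is modified or disposed.
extern "C" const char *DemoSymbolTableGet(DemoSymbolTableRef T, std::size_t I) {
  return unwrap(T)->at(I).c_str();
}

extern "C" void DemoSymbolTableDispose(DemoSymbolTableRef T) {
  delete unwrap(T);
}
```

Every time the wrapped C++ class changes its interface, the forwarding functions have to be fixed up by hand, which is the maintenance cost this issue is about.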

The downside of these shims, however, is that they're difficult for us to maintain over time. They cause problems whenever we upgrade LLVM (we have to get them compiling again, as the C++ APIs change quite regularly). They also make it harder for consumers of Rust to use custom LLVM versions. For example, right now our shims compile on LLVM 5 but probably not on LLVM trunk, and for users who like to follow LLVM trunk, keeping up with the breakage of our shims can be quite difficult!

To help solve this problem, the ideal solution seems to be to upstream at least a big chunk of the C++ APIs that we're using. This way we can stick much more closely to LLVM's C API, which is far more stable. That makes it much easier for us to eventually upgrade LLVM, and it means users with a custom LLVM don't need to worry about using an LLVM newer than the one we're using (aka LLVM trunk).

I'll try to have a checklist here we can maintain over time which also is a good listing of what each of the APIs does!

ArchiveWrapper.cpp

In general this is functionality for reading archive *.a files in the Rust compiler. This makes reading rlibs (which are archive files) extra speedy. The functions here are:

  • LLVMRustOpenArchive
  • LLVMRustDestroyArchive
  • LLVMRustArchiveIteratorNew
  • LLVMRustArchiveIteratorFree
  • LLVMRustArchiveIteratorNext
  • LLVMRustArchiveChildName
  • LLVMRustArchiveChildData
  • LLVMRustArchiveChildFree
  • LLVMRustArchiveMemberNew
  • LLVMRustArchiveMemberFree
  • LLVMRustWriteArchive

These functions are basically just reading and writing archives, using iterators for reading and providing a list of structs for writing.
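The shape described above (an iterator for reading, a list of structs for writing) can be sketched roughly as follows. This is an in-memory toy, not the real archive code: all `Demo*` names are hypothetical, and `DemoBuildArchive` builds the archive in memory where the real writer would produce a file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy archive: an ordered list of (name, contents) members.
struct DemoArchive {
  std::vector<std::pair<std::string, std::string>> Members;
};

// Read side: an iterator plus a "child" view of one member.
struct DemoArchiveIterator {
  const DemoArchive *Archive;
  std::size_t Pos;
};
struct DemoArchiveChild {
  const std::string *Name;
  const std::string *Data;
};

// Write side: the caller hands over a plain array of these structs.
struct DemoArchiveMember {
  const char *Name;
  const char *Data;
  std::size_t Len;
};

extern "C" DemoArchive *DemoBuildArchive(const DemoArchiveMember *Members,
                                         std::size_t Count) {
  assert((Members || Count == 0) && "member list must not be null");
  auto *A = new DemoArchive();
  for (std::size_t I = 0; I < Count; ++I)
    A->Members.emplace_back(Members[I].Name,
                            std::string(Members[I].Data, Members[I].Len));
  return A;
}

extern "C" void DemoDestroyArchive(DemoArchive *A) { delete A; }

extern "C" DemoArchiveIterator *DemoArchiveIteratorNew(const DemoArchive *A) {
  assert(A && "archive must not be null");
  return new DemoArchiveIterator{A, 0};
}

// Returns the next member, or nullptr once the archive is exhausted.
// The caller releases each child with DemoArchiveChildFree.
extern "C" DemoArchiveChild *DemoArchiveIteratorNext(DemoArchiveIterator *It) {
  if (It->Pos >= It->Archive->Members.size())
    return nullptr;
  const auto &M = It->Archive->Members[It->Pos++];
  return new DemoArchiveChild{&M.first, &M.second};
}

// Names and data are returned as pointer + length, not as C strings.
extern "C" const char *DemoArchiveChildName(const DemoArchiveChild *C,
                                            std::size_t *Len) {
  *Len = C->Name->size();
  return C->Name->data();
}

extern "C" const char *DemoArchiveChildData(const DemoArchiveChild *C,
                                            std::size_t *Len) {
  *Len = C->Data->size();
  return C->Data->data();
}

extern "C" void DemoArchiveChildFree(DemoArchiveChild *C) { delete C; }
extern "C" void DemoArchiveIteratorFree(DemoArchiveIterator *It) { delete It; }
```

Each object handed across the boundary gets its own explicit free function, since the caller on the other side cannot run C++ destructors.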

PassWrapper.cpp

This file is where we get into more of a smorgasbord of random functions rather than a consistent theme, so I'll comment on more of them inline below.

A general theme here I've found as I wrote these down is that it's not critical that all of these are implemented. I could imagine that it would be possible to have a mode where we as rustc still compile shims sometimes (like the ones below) but many of the shims are stubbed out to not actually use LLVM at all if we're in "non-Rust-LLVM mode" (aka custom LLVM mode). In other words, we don't necessarily need to upstream 100% of these functions.
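One way such a "stubbed out" mode could look is sketched below. This is only an assumption about the approach: the `DEMO_CUSTOM_LLVM` macro, the `Demo*` functions, and the feature and CPU lists are all invented for illustration, and the non-stub branches merely stand in for calls into the bundled LLVM.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical build-time switch: 1 when building against a custom LLVM,
// 0 when building against the bundled fork. Not a real rustc build flag.
#ifndef DEMO_CUSTOM_LLVM
#define DEMO_CUSTOM_LLVM 1
#endif

// Lets the caller find out which mode the shims were compiled in.
extern "C" bool DemoShimsAreStubbed() { return DEMO_CUSTOM_LLVM != 0; }

// Feature query. In stub mode it gives a conservative answer instead of
// touching any fork-only API.
extern "C" bool DemoHasFeature(const char *Feature) {
  assert(Feature && "feature name must not be null");
#if DEMO_CUSTOM_LLVM
  (void)Feature;
  return false; // stub: claim no features rather than guess
#else
  // Made-up list standing in for a query against the bundled LLVM.
  static const char *const Known[] = {"sse2", "avx"};
  for (const char *K : Known)
    if (std::strcmp(K, Feature) == 0)
      return true;
  return false;
#endif
}

// Debugging helper. Returns how many CPU names were printed.
extern "C" int DemoPrintTargetCPUs() {
#if DEMO_CUSTOM_LLVM
  std::printf("Target CPU help is not available with a custom LLVM.\n");
  return 0; // stub: nothing listed
#else
  std::printf("generic\nnative\n");
  return 2;
#endif
}
```

The exported symbols stay the same in both modes, so the Rust side of the FFI boundary would not need to change; only the bodies differ.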

  • LLVMInitializePasses - not entirely sure why we can't use the upstream versions. Someone more knowledgeable with LLVM may know how to replace this!
  • LLVMRustFindAndCreatePass - this is how we add custom passes to a pass manager by their string name
  • LLVMRustPassKind - categorizes whether a pass is a function or module pass
  • LLVMRustAddPass - add a custom pass to a pass manager
  • LLVMRustPassManagerBuilderPopulateThinLTOPassManager - thin wrapper around the C++ API to populate a ThinLTO pass manager
  • LLVMRustHasFeature - this is actually a pretty tricky one. It has to do with "target_feature requires embedded LLVM copy to be usable" (#46181) and is, I think, the only function which actually only works with our fork. I can provide more information for this if necessary.
  • LLVMRustPrintTargetCPUs - mostly just a debugging helper we could stub out in the custom LLVM case.
  • LLVMRustPrintTargetFeatures - same as above
  • LLVMRustCreateTargetMachine - here we create a TargetMachineRef ourselves, which gives us full access to all the fields. Upstreaming this would probably just involve exposing more field accessors and setters and such.
  • LLVMRustDisposeTargetMachine - complement to the above
  • LLVMRustAddAnalysisPasses - I think this is just adding "standard" passes to the pass manager IIRC, we're just trying to mirror what clang is doing here.
  • LLVMRustConfigurePassManagerBuilder - just configuring some fields, again also aimed at mirroring clang.
  • LLVMRustAddBuilderLibraryInfo - again, attempting to mirror clang by configuring all the fields
  • LLVMRustAddLibraryInfo - mirroring clang
  • LLVMRustRunFunctionPassManager - seems ripe to add upstream!
  • LLVMRustSetLLVMOptions - I think this is for one-time configuration of LLVM at startup
  • LLVMRustWriteOutputFile - there's a whole bunch of ways to write output files with LLVM, if we had something that just wrote it out to memory or a file that'd be good enough for us
  • LLVMRustPrintModule - I'm pretty sure this is mainly just generating IR, but I'm not personally too familiar with the need for a custom class here
  • LLVMRustPrintPasses - AFAIK a debugging helper, could be stubbed out with a custom LLVM
  • LLVMRustAddAlwaysInlinePass - may just be missing upstream?
  • LLVMRustRunRestrictionPass - I think this is part of our LTO bindings, internalizing lots of stuff
  • LLVMRustMarkAllFunctionsNounwind - definitely part of our LTO bindings, for when you're compiling with -C lto and -C panic=abort
  • LLVMRustSetDataLayoutFromTargetMachine - not entirely sure what this is...
  • LLVMRustGetModuleDataLayout - also not entirely sure what this is...
  • LLVMRustSetModulePIELevel - I think just configuring more properties
  • LLVMRustThinLTOAvailable - for us just testing the LLVM version right now
  • LLVMRustWriteThinBitcodeToFile - mostly just what it says on the tin
  • LLVMRustThinLTOBufferCreate - same as above but in memory
  • LLVMRustThinLTOBufferFree - freeing the above
  • LLVMRustThinLTOBufferPtr - reading the above
  • LLVMRustThinLTOBufferLen - reading the above
  • LLVMRustParseBitcodeForThinLTO - mostly what it says on the tin
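The Create / Ptr / Len / Free quartet in the list above is a common way to hand an in-memory buffer across a C boundary. A minimal sketch of that pattern follows, assuming a toy "serialization" (function names joined by newlines) in place of real bitcode; the `Demo*` names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Owns the serialized bytes. Callers only ever hold a pointer to it.
struct DemoBuffer {
  std::string Data;
};

// Toy serializer: joins the given names with newlines. A real shim would
// write bitcode for a module into the buffer instead.
extern "C" DemoBuffer *DemoBufferCreate(const char *const *Names,
                                        std::size_t Count) {
  assert((Names || Count == 0) && "name list must not be null");
  auto *B = new DemoBuffer();
  for (std::size_t I = 0; I < Count; ++I) {
    B->Data += Names[I];
    B->Data += '\n';
  }
  return B;
}

// Borrowed view of the bytes, valid until DemoBufferFree is called.
extern "C" const void *DemoBufferPtr(const DemoBuffer *B) {
  return B->Data.data();
}

extern "C" std::size_t DemoBufferLen(const DemoBuffer *B) {
  return B->Data.size();
}

extern "C" void DemoBufferFree(DemoBuffer *B) { delete B; }
```

On the Rust side, the pointer and length pair can be turned into a byte slice whose lifetime is tied to the buffer handle, which is why the accessors are kept separate from creation and destruction.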

These APIs are all related to ThinLTO and are still somewhat in flux, so there may not be a great C API just yet.

  • LLVMRustCreateThinLTOData
  • LLVMRustFreeThinLTOData
  • LLVMRustPrepareThinLTORename
  • LLVMRustPrepareThinLTOResolveWeak
  • LLVMRustPrepareThinLTOInternalize
  • LLVMRustPrepareThinLTOImport

RustWrapper.cpp

Sort of an even bigger smorgasbord than PassWrapper.cpp! Note that many of these functions are very old and may have actually made their way into the C API of LLVM by now, in which case that'd be awesome!

  • LLVMRustCreateMemoryBufferWithContentsOfFile - this is something we can and probably should write ourselves rather than relying on LLVM
  • LLVMRustGetLastError - this is a Rust-specific API for getting out an error message, I'd imagine that whenever it's set we'd have something analogous in LLVM.
  • LLVMRustSetLastError - used by the C++ code to set the error that rustc will retrieve later
  • LLVMRustSetNormalizedTarget - I think this is just exposing something that wasn't already there.
  • LLVMRustPrintPassTimings - debugging on our end.
  • LLVMRustGetNamedValue - I think this is just fun dealing with metadata
  • LLVMRustGetOrInsertFunction - needed that C++ function most likely.
  • LLVMRustGetOrInsertGlobal - again, probably just needed the function
  • LLVMRustMetadataTypeInContext - more constructors for more types
  • LLVMRustAddCallSiteAttribute - just a "fluff" thing we needed to do that wasn't possible in C IIRC
  • LLVMRustAddAlignmentCallSiteAttr - same as above
  • LLVMRustAddDereferenceableCallSiteAttr - same as above
  • LLVMRustAddDereferenceableOrNullCallSiteAttr - same as above
  • LLVMRustAddFunctionAttribute - same as above
  • LLVMRustAddAlignmentAttr - same as above
  • LLVMRustAddDereferenceableAttr - same as above
  • LLVMRustAddDereferenceableOrNullAttr - same as above
  • LLVMRustAddFunctionAttrStringValue - same as above
  • LLVMRustRemoveFunctionAttributes - same as above
  • LLVMRustSetHasUnsafeAlgebra - not entirely sure what this is doing...
  • LLVMRustBuildAtomicLoad - I think at the time the C API didn't exist?
  • LLVMRustBuildAtomicStore - same as above
  • LLVMRustBuildAtomicCmpXchg - same as above
  • LLVMRustBuildAtomicFence - same as above
  • LLVMRustSetDebug - I think one-time configuration of LLVM
  • LLVMRustInlineAsm - I think the C API didn't exist (or wasn't full-featured enough)
  • LLVMRustAppendModuleInlineAsm - that function probably wasn't exposed in C
  • LLVMRustVersionMinor - just exposing a constant
  • LLVMRustVersionMajor - same as above
  • LLVMRustDebugMetadataVersion - this and most debug functions below I think just aren't in the C API
  • LLVMRustAddModuleFlag - same as above
  • LLVMRustMetadataAsValue - same as above
  • LLVMRustDI* - same as above (there's a whole bunch of these)
  • LLVMRustWriteValueToString - IIRC this is mostly debugging
  • LLVMRustLinkInExternalBitcode - used during normal LTO
  • LLVMRustLinkInParsedExternalBitcode - used during normal LTO
  • LLVMRustGetSectionName - not sure where this came from...
  • LLVMRustArrayType - missing C API?
  • LLVMRustWriteTwineToString - I think more debugging/diagnostics
  • LLVMRustUnpackOptimizationDiagnostic - diagnostics
  • LLVMRustUnpackInlineAsmDiagnostic - diagnostics
  • LLVMRustWriteDiagnosticInfoToString - diagnostics
  • LLVMRustGetDiagInfoKind - custom for us I think?
  • LLVMRustGetTypeKind - missing C API?
  • LLVMRustWriteDebugLocToString - debugging API I think
  • LLVMRustSetInlineAsmDiagnosticHandler - dealing with inline asm diagnostics
  • LLVMRustWriteSMDiagnosticToString - diagnostics
  • LLVMRustBuildLandingPad - missing C API?
  • LLVMRustBuildCleanupPad - same as above
  • LLVMRustBuildCleanupRet - same as above
  • LLVMRustBuildCatchPad - same as above
  • LLVMRustBuildCatchRet - same as above
  • LLVMRustBuildCatchSwitch - same as above
  • LLVMRustAddHandler - same as above
  • LLVMRustBuildOperandBundleDef - same as above
  • LLVMRustBuildCall - same as above
  • LLVMRustBuildInvoke - same as above
  • LLVMRustPositionBuilderAtStart - same as above I think?
  • LLVMRustSetComdat - same as above
  • LLVMRustUnsetComdat - same as above
  • LLVMRustGetLinkage - same as above
  • LLVMRustSetLinkage - same as above
  • LLVMRustConstInt128Get - same as above
  • LLVMRustGetValueContext - same as above
  • LLVMRustGetVisibility - same as above
  • LLVMRustSetVisibility - same as above
  • LLVMRustModuleBufferCreate - serializing a module to memory
  • LLVMRustModuleBufferFree - freeing above
  • LLVMRustModuleBufferPtr - reading above
  • LLVMRustModuleBufferLen - reading above
  • LLVMRustModuleCost - mostly a debugging helper
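The LLVMRustSetLastError / LLVMRustGetLastError pair in the list above follows a familiar "last error" convention: the C++ side records a message, and the caller retrieves it after a failing call. A minimal sketch of that convention, not the actual rustc implementation, is below. The `Demo*` names and the `DemoParseNumber` operation are invented for the example, and having the getter transfer ownership of the string is a design choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// One pending error message per thread, owned by the shim until retrieved.
static thread_local char *LastError = nullptr;

// Called by C++ code to record why an operation failed.
extern "C" void DemoSetLastError(const char *Msg) {
  assert(Msg && "error message must not be null");
  std::free(LastError); // drop any message that was never retrieved
  std::size_t Len = std::strlen(Msg) + 1;
  LastError = static_cast<char *>(std::malloc(Len));
  if (LastError)
    std::memcpy(LastError, Msg, Len);
}

// Hands the pending message to the caller and clears it. Returns nullptr
// if there is none. The caller releases the string with free().
extern "C" char *DemoGetLastError() {
  char *Ret = LastError;
  LastError = nullptr;
  return Ret;
}

// Example fallible operation: returns false and records an error unless
// Text is entirely a decimal integer.
extern "C" bool DemoParseNumber(const char *Text, long *Out) {
  char *End = nullptr;
  long Value = std::strtol(Text, &End, 10);
  if (End == Text || *End != '\0') {
    DemoSetLastError("not a decimal integer");
    return false;
  }
  *Out = Value;
  return true;
}
```

A caller checks the boolean result first and only then asks for the message, so successful calls never pay for error handling.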
@alexcrichton alexcrichton added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Dec 1, 2017
@alexcrichton
Member Author

For ArchiveWrapper.cpp a fun project could also be trying to use a crate on crates.io instead!

dotdash added a commit to dotdash/rust that referenced this issue Jan 4, 2018
The function was added as a wrapper to handle compatibility with older
LLVM versions that we no longer support, so it can be removed.

Refs rust-lang#46437
dotdash added a commit to dotdash/rust that referenced this issue Jan 4, 2018
steveklabnik added a commit to steveklabnik/rust that referenced this issue Jan 4, 2018
Remove some outdated LLVM-related code

Ticks two boxes on rust-lang#46437
dotdash added a commit to dotdash/rust that referenced this issue Jan 7, 2018
The same effect can be achieved using -Cllvm-args=-debug

Refs rust-lang#46437 as it removes LLVMRustSetDebug()
dotdash added a commit to dotdash/rust that referenced this issue Jan 7, 2018
dotdash added a commit to dotdash/rust that referenced this issue Jan 7, 2018
Refs rust-lang#46437 as it also removes LLVMRustWriteDebugLocToString()
kennytm added a commit to kennytm/rust that referenced this issue Jan 8, 2018
Remove unused LLVM related code

Ticks a few more boxes on rust-lang#46437
@XAMPPRocky XAMPPRocky added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Feb 26, 2018
@djc
Contributor

djc commented Oct 3, 2018

What's the status with LLVM upstream these days? How compatible is Rust with upstream LLVM?

@alexcrichton
Member Author

We track upstream pretty closely, but all of the above bindings occasionally require updates to make sure we compile against upstream. To that end it'd still be best to upstream this!

@alexcrichton
Member Author

cc @petrhosek

We were talking about this at RustConf, and while dated, it still has some good information I think!

@inglorion
Contributor

I'm interested in working on this. Besides the benefit to Rust, I also think LLVM benefits from having a more complete C API. Since this has been open for a while: Are there any parts of this that are no longer relevant and I can avoid spending time on?

@arrowd

arrowd commented May 6, 2022

Any progress on this? Downstream Rust packagers would also benefit greatly from this through much shorter build times.

@bjorn3
Member

bjorn3 commented May 6, 2022

Why would it result in much shorter build times? These shims are small compared to the rest of rustc and LLVM, and they work with precompiled LLVM distributions too, not just the Rust fork of LLVM.

@arrowd

arrowd commented May 6, 2022

Oh, sorry, I thought it was a blocker for using precompiled LLVM.

@pnkfelix
Member

pnkfelix commented May 6, 2022

Visiting for T-compiler backlog bonanza.

Seems like a good list (if long) of relatively simple tasks.

@rustbot label: S-tracking-impl-incomplete

@rustbot rustbot added the S-tracking-impl-incomplete Status: The implementation is incomplete. label May 6, 2022
@wesleywiser wesleywiser added C-cleanup Category: PRs that clean code up or issues documenting cleanup. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. labels Jan 5, 2024
@workingjubilee workingjubilee changed the title Move more of src/rustllvm to upstream LLVM Move more of rustc_llvm to upstream LLVM Oct 18, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Oct 19, 2024
…r=Zalathar

compiler: Use LLVM's Comdat support

Acting on these long-ago issues:
- rust-lang#46437
- rust-lang#68955
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Oct 20, 2024
Rollup merge of rust-lang#131876 - workingjubilee:llvm-c-c-c-comdat, r=Zalathar

compiler: Use LLVM's Comdat support

Acting on these long-ago issues:
- rust-lang#46437
- rust-lang#68955
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Oct 22, 2024
compiler: Use LLVM's Comdat support

Acting on these long-ago issues:
- rust-lang/rust#46437
- rust-lang/rust#68955

10 participants