Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove literal_static_pointer_val from literal_pointer_val implementation #50632

Merged
merged 3 commits into from
Jul 25, 2023

Conversation

pchintalapudi
Copy link
Member

This eliminates most of the literal pointers outside of ccalls and directly JIT-ted code, except instead of adding names we instead just use imaging mode and fix up the visualization in jl_get_llvmf_defn.

Depends on #50631

@pchintalapudi pchintalapudi added compiler:codegen Generation of LLVM IR and native code compiler:llvm For issues that relate to LLVM labels Jul 22, 2023
Base automatically changed from pc/dce-first to master July 23, 2023 17:27
src/ccall.cpp Outdated Show resolved Hide resolved
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
@pchintalapudi pchintalapudi added the merge me PR is reviewed. Merge when all tests are passing label Jul 24, 2023
@vchuravy
Copy link
Member

cc: @maleadt

Have you tested GPUCompiler against this?

@pchintalapudi
Copy link
Member Author

Have not tested GPUCompiler, though whatever method GPUCompiler is currently using to handle global_targets should still apply here.

@gbaraldi
Copy link
Member

It's failing the literal string comparison tests but it seems to work .

@vtjnash vtjnash merged commit 7ca0f0d into master Jul 25, 2023
@vtjnash vtjnash deleted the pc/jit-literals branch July 25, 2023 19:54
@pchintalapudi pchintalapudi removed the merge me PR is reviewed. Merge when all tests are passing label Jul 25, 2023
@maleadt
Copy link
Member

maleadt commented Jul 27, 2023

This broke RuntimeGeneratedFunctions.jl: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_date/2023-07/26/RuntimeGeneratedFunctions.primary.log

Program aborted due to an unhandled Error:
Symbols not found: [ *Core.getfield#1 ]

[9] signal (6.-6): Aborted
in expression starting at /home/pkgeval/.julia/packages/RuntimeGeneratedFunctions/Avj6S/test/runtests.jl:152
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_ZNK4llvm5Error19fatalUncheckedErrorEv at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
assertIsChecked at /source/usr/include/llvm/Support/Error.h:264 [inlined]
~Error at /source/usr/include/llvm/Support/Error.h:226
~JITSymbol at /source/usr/include/llvm/ExecutionEngine/JITSymbol.h:322 [inlined]
operator() at /source/src/jitlayers.cpp:1510 [inlined]
withModuleDo<JuliaOJIT::addModule(llvm::orc::ThreadSafeModule)::<lambda(llvm::Module&)> > at /source/usr/include/llvm/ExecutionEngine/Orc/ThreadSafeModule.h:136 [inlined]
addModule at /source/src/jitlayers.cpp:1492
jl_add_to_ee at /source/src/jitlayers.cpp:1964

Maybe related to opaque closures? I'll see if I can come up with a minimal reproducer.

EDIT: Diffractor.jl has the same issue (cc @oxinabox).

@maleadt
Copy link
Member

maleadt commented Jul 27, 2023

MWE:

@generated function foo()
    quote
        $(Symbol("wat"))
        $(Expr(:opaque_closure, :(() -> bar)))
    end
end
foo()

Requires Julia#master with LLVM assertions.

@maleadt
Copy link
Member

maleadt commented Jul 27, 2023

Doesn't reproduce on very latest master though, I guess #50650 may have fixed this?
EDIT: RGF works now, but Diffractor still doesn't.

@pchintalapudi
Copy link
Member Author

pchintalapudi commented Jul 28, 2023

This is unfortunately not a convenient missing symbol, but is a memory corruption somewhere. We seem to be allocating two llvm::GlobalVariable objects at the same address (testing on master before #50650 with the MWE), and I can't debug further because GlobalVariable overrides operator new and has its own custom allocator, which removes the normal forms of memory debugging.

@gbaraldi
Copy link
Member

Are we freeing the GlobalVariable objects somewhere?

@pchintalapudi
Copy link
Member Author

No, hardware watch doesn't detect an intervening destructor call.

@Keno
Copy link
Member

Keno commented Jul 28, 2023

Possibly the same issue as #48742?

@pchintalapudi
Copy link
Member Author

That does seem likely, since we'd be freeing the associated module and potentially the globalvariable there too. @vtjnash the WeakVH is too expensive, but can we just make global_targets a map of void* -> std::string? Alternatively we could generate a new private global every time we call julia_pgv, and then mark it up with information in the GV attributes that allows us to merge duplicate globals post-codegen.

@vtjnash
Copy link
Member

vtjnash commented Jul 28, 2023

That code really just should not call jl_merge_module, since we don't support that. The output of codegen is supposed to be the result of doing emission of one thing into a single module, so that we can handle errors and incremental work.

@maleadt
Copy link
Member

maleadt commented Jul 31, 2023

I'm also getting the following error on master with Metal.jl:

error: Linking globals named '*Core.Intrinsics.bitcast#1': symbol multiply defined!

Reverting to before this PR fixes that, although another PR may be the actual cause of course.

@pchintalapudi
Copy link
Member Author

Does it still fail on #50724 ?

@maleadt
Copy link
Member

maleadt commented Jul 31, 2023

Does it still fail on #50724 ?

Yeah.

EDIT: this is when GPUCompiler invokes the linker; I'll have a closer look myself first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code compiler:llvm For issues that relate to LLVM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants