-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lookup ccall
symbols in internal libraries first
#49010
Conversation
93580e2
to
02a7ec1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the if
statements everywhere make me think perhaps we should just have jl_dlsym()
take a const char * version
argument and use version == NULL
as the typical jl_dlsym()
case, and version != NULL
as the jl_dlvsym()
case. Taking a brief look at the glibc code, I don't think it's legal to pass version == NULL
, so we should be okay to use that as a sentinel value for the non-versioned case.
That should clean up a lot of the usage here. If we want to avoid ABI breakage, we could just renamed jl_dlsym
to jl_dlvsym
add the version
parameter, then add a new jl_dlsym
that just calls jl_dlvsym
with version
set to NULL
.
Then in codegen and everywhere else, we only emit calls to jl_dlvsym
with version
set to NULL
in all non-versioned cases.
@topolarity @vtjnash thoughts?
Sounds like a decent plan to me! I'll make that change tomorrow unless any objections are raised before then. |
Would it make sense to expose the newly created |
f32da50
to
d1cf142
Compare
Heads up that this change is now ABI-breaking for the (internal) symbols Public ABI has new symbol
Sounds like a good idea to me, but I'd like to keep this PR minimal for now - Could you file an issue for your use case? We can discuss the details about how to expose versioned ccalls there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but I'd like to get @vtjnash's input
I spoke about this with Jameson, and he is curious to see if we can avoid surfacing Lines 63 to 74 in ceafd6c
ccall() or dlsym() lookups in Julia.
|
It seems more breaking to me not to correctly support resolving Julia-exported symbols in the "local" runtime without a handle. I'd prefer expanding the existing code where we intercept the symbol resolution to support this, which is exactly where #49012 currently adds a version-based lookup. I think this in principle would work with just Unfortunately, when I tested yesterday that doesn't work, and I don't understand why not. I'll have to dedicate some time to investigate with |
Well, I kind of agree with Jameson that the "no library ccall" form is kind of a mistake. I know we can't outright remove it from the language, but I'm more okay with saying that piece of functionality is not supported when running with multiple system images, since that's a somewhat new way of running Julia. |
I guess.. But it's bad sign to me that only 3 of the 24 examples in the ccall docs would meet that requirement, since "library name as a string" wouldn't count either. My two cents on the matter (which admittedly doesn't mean much) is that we should take an opportunity to support our usual feature-set for this use-case if we can. Anything less means dividing the ecosystem into "running Julia as a library works, except..." and fragmenting the Julia ecosystem into packages that have/been tested/optimized for a particular use case and those that haven't. I know that's already the reality of our statically-compiled ecosystem, but I think it's important to make positive progress towards provide better (simpler) guarantees about supported features, not splitting support across more use cases (even if they are arguably "niche"). |
Sorry if I'm being dense, but why wouldn't providing the library name work? Are we expecting to have multiple libraries with the same name but different symbol versions loaded? |
I think it depends on what you mean by name. In the case of Julia-in-Julia, the SONAME of the libjulia's that we're loading simultaneously is unique, but each version still has a symlink "libjulia-internal.so" that in principle could be found by I'm not sure whether that works out-of-the-box or requires some extra resolution shims to handle correctly, but otherwise yeah I think we could make deprecate only If we're ready to deprecate that functionality across the board, then I'm happy to leave it out. My main point is to support the feature at least until we're ready to take that step, since at the end of the day it's a similar resolution shim. |
I think intercepting the “no library” case and injecting libjulia first is good. It is what we require on Windows and for |
Okay, good news! I was able to get the Windows-style lookup-interception working. The new change looks like:
One caveat is that relying on the (non-)existence of a symbol for, e.g., version detection does not work. Any look-up that fails against the "local" libjulia can still succeed when continuing to the global symbol resolution. If that approach sounds good to you @vtjnash I can update this PR. #49012 would be essentially unchanged, since the symbol versioning is required anyway to get the |
d1cf142
to
7c64549
Compare
dlvsym
support for ccall'sccall
symbols in internal libraries first
7c64549
to
ef65ab5
Compare
ef65ab5
to
5d7cb70
Compare
This is similar to what we do on Windows, where handles to the exact library are required anyway. This is enough to make sure that our `dlsym` lookups are directed to the correct libjulia, even when loading a Julia run-time within Julia. The second change needed to get things working (not included in this commit) is to add symbol versioning, so that the runtime linker does not mix up symbols between the two libraries.
5d7cb70
to
95f4376
Compare
This prevents the library from being upgraded to the global namespace for symbol resolution. We were already making this mistake before this function was introduced, but probably did not notice it because RTLD_LOCAL is the default behavior on Linux. On the other hand, macOS does need this fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RSLGTM at Elliot's request
It would probably have been good to run PkgEval here? The PR has broken CBinding.jl, which is used by several packages (BPFNative.jl, System.jl, XXhash.jl, PLCTag.jl, ...):
CBinding is calling |
Okay, but "this is a risky hack" does not inspire much confidence. We do not support that because it is not correct to specify that on Windows, so it is not cross-platform compliant. |
@@ -255,26 +275,6 @@ JL_DLLEXPORT void *jl_load_dynamic_library(const char *modname, unsigned flags, | |||
int n_extensions = endswith_extension(modname) ? 1 : N_EXTENSIONS; | |||
int ret; | |||
|
|||
/* | |||
this branch returns handle of libjulia-internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was actually trying to trigger this case however, which is not as much of a trick and not quite what it claimed to be doing. I assume we could put this back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it, yes. Putting it back makes CBinding work again, as expected. @topolarity what was the reason to remove this functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a mistake and assumed that downstream code would not end up here after all of the runtime-internal uses were re-factored into jl_find_library_by_addr
.
PR open to fix: #49611
Can I get more info on why this should go into 1.9? Is there something that is blocked waiting on it? |
This change lays the necessary groundwork to support performing versioned symbol lookups usingdlvsym
.This is a pre-requisite for symbol-versioning Julia's internal libraries, and the new code paths here won't actually be exercised until a follow-up is landed to turn on symbol-versioning.This is similar to what we do on Windows, where handles to the exact library are required anyway. This is enough to make sure that our
dlsym
lookups are directed to the correct libjulia, even when loading a Julia run-time within Julia.