Add load_libjulia in libjulialoader #37779

tkf · 2020-09-28T03:46:59Z

From the discussion in #36588 (comment), I'd imagine we need something like void * load_libjulia(const char *) for loading libjulia in a cross-platform manner.

This is a dead-simple implementation that just extracts out the first lines of load_repl. @staticfloat If you have a better idea of doing this, please feel free to close this and implement it in a new PR.

Here is a simple Python session with this PR:

In [1]: import ctypes
   ...: libjulialoader = ctypes.CDLL("usr/lib/libjulialoader.so")
   ...: libjulialoader.load_libjulia.restype = ctypes.c_void_p
   ...: libjulialoader.load_libjulia(b"usr/bin")
Out[1]: 94362114802096

In [2]: libjulia = ctypes.PyDLL("usr/lib/libjulia.so", ctypes.RTLD_GLOBAL)

In [3]: libjulia._handle
Out[3]: 94362114802096

In [4]: Out[1] == Out[3]
Out[4]: True
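
A side note on the `restype` line in the session above: ctypes assumes a foreign function returns a C `int` unless told otherwise, so a pointer-sized handle is only comparable with `libjulia._handle` once `restype` is set to `ctypes.c_void_p`. The sketch below illustrates the effect without needing libjulia. It assumes a 64-bit platform with a 32-bit C `int`, and `as_default_restype` is a hypothetical helper name, not part of this PR.

```python
import ctypes


def as_default_restype(pointer_value: int) -> int:
    """Value ctypes would report if the pointer were read back as a C int."""
    # ctypes integer types do no overflow checking; the high bits are dropped.
    return ctypes.c_int(pointer_value).value


handle = 94362114802096  # the handle printed as Out[1] in the session above

as_pointer = ctypes.c_void_p(handle).value  # what restype = c_void_p preserves
as_int = as_default_restype(handle)         # what the default restype would keep

print(as_pointer == handle)
print(as_int == handle)
```

This would also be consistent with the negative values printed in a later session in this thread where `restype` is not set, although that is an inference rather than something the thread states.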

Question: Should it be renamed to jl_load_libjulia or something?

cc @davidanthoff @GunnarFarneback

yuyichao

This API is still fundamentally broken. It should not be required for the user to pass in the path to the julia executable since that's extremely error-prone and may be impossible.

Ref #36588 (comment)

staticfloat

This is exactly how I would have done it.

tkf · 2020-09-28T04:43:14Z

I noticed that calling load_libjulia the second time returns a different handle (but third and later handles are identical to the second one):

In [1]: import ctypes
   ...: libjulialoader = ctypes.CDLL("usr/lib/libjulialoader.so")
   ...: libjulialoader.load_libjulia.restype = ctypes.c_void_p
   ...: libjulialoader.load_libjulia(b"usr/bin")
Out[1]: 94264976507248

In [2]: libjulialoader.load_libjulia(b"usr/bin")
Out[2]: 94264976504784

In [3]: libjulialoader.load_libjulia(b"usr/bin")
Out[3]: 94264976504784

In [4]: libjulialoader.load_libjulia(b"usr/bin")
Out[4]: 94264976504784

In [5]: libjulia = ctypes.PyDLL("usr/lib/libjulia.so", ctypes.RTLD_GLOBAL)

In [6]: libjulia._handle  # same as Out[1]
Out[6]: 94264976507248

I'm testing this on Linux. I thought dlopen would return the same handle? Is it expected?

tkf · 2020-09-28T04:44:23Z

Ah, I guess this is because

julia/cli/loader_lib.c, lines 146 to 147 in da2935d:

    // Chop the string at the colon, load this library.
    *colon = '\0';

mutates the global dep_libs in-place?

Edit: Yes, it looks like it. With this patch

diff --git a/cli/loader_lib.c b/cli/loader_lib.c
index ed989481b6..1b291bfb6c 100644
--- a/cli/loader_lib.c
+++ b/cli/loader_lib.c
@@ -40,6 +40,7 @@ static void * load_library(const char * rel_path, const char * src_dir) {
     strncat(path, src_dir, sizeof(path) - 1);
     strncat(path, PATHSEPSTRING, sizeof(path) - 1);
     strncat(path, rel_path, sizeof(path) - 1);
+    print_stderr3("`dlopen`ing: ", path, "\n");

     void * handle = NULL;
 #if defined(_OS_WINDOWS_)

I get

In [1]: import ctypes
   ...: libjulialoader = ctypes.CDLL("usr/lib/libjulialoader.so")

In [2]: libjulialoader.load_libjulia(b"usr/bin")
`dlopen`ing: usr/bin/../lib/libgcc_s.so.1
`dlopen`ing: usr/bin/../lib/libopenlibm.so.3
`dlopen`ing: usr/bin/../lib/libjulia.so.1.6
Out[2]: -1558041792

In [3]: libjulialoader.load_libjulia(b"usr/bin")
`dlopen`ing: usr/bin/../lib/libgcc_s.so.1
Out[3]: -1558044256

In [4]: libjulialoader.load_libjulia(b"usr/bin")
`dlopen`ing: usr/bin/../lib/libgcc_s.so.1
Out[4]: -1558044256

In [5]: libjulialoader.load_libjulia(b"usr/bin")
`dlopen`ing: usr/bin/../lib/libgcc_s.so.1
Out[5]: -1558044256
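
The behaviour in this log can be reproduced without libjulia. The following is a minimal simulation, assuming the loop around the quoted lines walks a colon-separated `dep_libs` string, overwrites each colon with a NUL byte, and treats whatever is left after the last colon as libjulia. The helper names (`strchr`, `c_string`, `load_all`) and the library list are illustrative, taken from the log output rather than from the actual source.

```python
def strchr(buf: bytearray, start: int, ch: bytes) -> int:
    """Index of ch in the NUL-terminated string starting at start, or -1."""
    end = buf.find(b"\0", start)
    if end < 0:
        end = len(buf)
    return buf.find(ch, start, end)


def c_string(buf: bytearray, start: int) -> str:
    """Read the NUL-terminated string that begins at start."""
    end = buf.find(b"\0", start)
    if end < 0:
        end = len(buf)
    return bytes(buf[start:end]).decode()


def load_all(dep_libs: bytearray) -> list:
    """Return the names that would be dlopen'ed, chopping dep_libs in place."""
    opened = []
    curr = 0
    while True:
        colon = strchr(dep_libs, curr, b":")
        if colon < 0:
            break
        dep_libs[colon] = 0  # the in-place `*colon = '\0'`
        opened.append(c_string(dep_libs, curr))
        curr = colon + 1
    # Whatever remains is treated as libjulia.
    opened.append(c_string(dep_libs, curr))
    return opened


# Shared global buffer, as in the C loader (contents assumed from the log).
DEP_LIBS = bytearray(b"libgcc_s.so.1:libopenlibm.so.3:libjulia.so.1.6\0")

first = load_all(DEP_LIBS)
second = load_all(DEP_LIBS)

# One possible remedy: tokenize a private copy so the global stays intact.
PRISTINE = bytearray(b"libgcc_s.so.1:libopenlibm.so.3:libjulia.so.1.6\0")
copy_first = load_all(bytearray(PRISTINE))
copy_second = load_all(bytearray(PRISTINE))

print(first)
print(second)
```

In this model the second call sees only the first entry, because the buffer now ends at the first NUL, so the handle it returns would belong to libgcc_s rather than libjulia. Tokenizing a copy is just one way to avoid that; the thread does not say which fix, if any, was applied.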

yuyichao · 2020-09-28T04:55:49Z

> This is exactly how I would have done it.

And you still have not answered the question of how the user is supposed to find the binary reliably, nor given the motivation for making embedding much harder than necessary.

vtjnash · 2020-09-28T05:08:46Z

@yuyichao Yes, you've made your concerns clear, but realize that the work done here was arrived at after much analysis (and even a couple failed PRs). It was just getting too awkward for libLLVM to be loaded one way, while all our other dependent libraries get lazy loaded if possible.

While we realize some of this work is still incomplete, the majority of the foundational work is being finished with this PR to permit making it easier to link against libjulia, while yet enhancing our ability to select, load, and upgrade the dependent libraries in situ. Merging that PR was necessary to help unstick some other pending work, so we opted to merge it early to give an opportunity that all of the pieces get testing and fixes now (thus further in advance of the actual release branch date) instead of being blocked on getting testing of all pieces on all platforms.

yuyichao · 2020-09-28T05:26:49Z

> arrived at after much analysis (and even a couple failed PRs)

Where are they? As I already mentioned, the one failed PR about adding an executable wrapper on Windows does not seem to have any applicable objection on it. (All the objections are related to the change of PATH, which is AFAICT unnecessary.)

> It was just getting too awkward for libLLVM to be loaded one way, while all our other dependent libraries get lazy loaded if possible.

Not sure how this is related. Here I'm not even talking about lazy vs not. Everything done here is eagerly loaded (and for libjulia dependencies that's totally fine), and it was eager before as well, so nothing has changed. AFAICT the libraries that currently get lazily loaded will remain so, so nothing should have changed about the laziness either. There are and always will be two different ways to load things: one that must happen before or at the same time as libjulia is loaded, and one done lazily at runtime. I don't see anything here or elsewhere that changes this aspect. The current version is by design/intentionally using two versions of code to open libraries as well, and I don't see that as being any less awkward in this regard. If anything, it is much more awkward, since now there are basically three mechanisms to load/link libraries: one is the unavoidable system dynamic linker, one is the dlopen in libjulialoader, and one is the runtime dlopen in libjulia. So if the awkwardness of multiple different ways to load things is a concern, which I kind of agree with, this (i.e. #36588) should not be done.

> While we realize some of this work is still incomplete, the majority of the foundational work is being finished with this PR to permit making it easier to link against libjulia, while yet enhancing our ability to select, load, and upgrade the dependent libraries in situ. Merging that PR was necessary to help unstick some other pending work, so we opted to merge it early to give an opportunity that all of the pieces get testing and fixes now (thus further in advance of the actual release branch date) instead of being blocked on getting testing of all pieces on all platforms.

Exactly, the implementation has many other problems that I was not even commenting much on. Having other stuff pending on this does not mean the change is good to go. The implementation can be bad but the design has to be sound. Most of what I was focusing on, and the only point I mentioned here, are AFAICT fundamental problems tied to how this is designed, the API, and not about the implementation.

Here, I'm only asking about the public API change for embedding. Unless future progress will completely remove the exe_dir from the API, this is not at all an implementation detail issue and not something that can be fixed by adding more stuff to rely on this. And if exe_dir is going away, then this PR should not be merged since it'll only increase the API breakage.

staticfloat · 2020-09-28T06:10:47Z

> Here, I'm only asking about the public API change for embedding. Unless future progress will completely remove the exe_dir from the API, this is not at all an implementation detail issue and not something that can be fixed by adding more stuff to rely on this. And if exe_dir is going away, then this PR should not be merged since it'll only increase the API breakage.

Let's focus in on exe_dir then; the design constraints are that we need a cross-platform way to load binaries that are located at an arbitrary location that is constant relative to the installation directory of Julia. In this case, the paths will be something like ${julia_install_root}/share/julia/stdlib/v1.6/artifacts/<hash>/lib/libLLVM.dll. What API would you suggest for allowing the library (libjulia or libjulialoader, or whatever entry point you prefer) to access these libraries?
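
For readers following the `exe_dir` discussion, here is a small sketch of what resolving a library "relative to an anchor directory" amounts to. It mirrors the path shape printed in the debug log earlier in this thread (`usr/bin/../lib/libgcc_s.so.1`); the function name `resolve_dependency` is hypothetical, and `posixpath` is used only so the result is the same on every platform.

```python
import posixpath


def resolve_dependency(exe_dir: str, rel_path: str) -> str:
    """Join the anchor directory and a relative library path."""
    return posixpath.join(exe_dir, rel_path)


# Same inputs as in the logged session: anchor "usr/bin", library under ../lib.
raw = resolve_dependency("usr/bin", "../lib/libgcc_s.so.1")

# Purely lexical clean-up; it does not follow symlinks on disk.
normalized = posixpath.normpath(raw)

print(raw)
print(normalized)
```

The point of contention in the thread is not this join itself but who supplies the anchor: the caller through the API, or the library discovering its own location.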

yuyichao · 2020-09-28T06:18:48Z

> What API would you suggest for allowing the library (libjulia or libjulialoader, or whatever entry point you prefer) to access these libraries?

Well, anything that does not require the user to specify an additional path in the API, i.e. equivalent to specifying $ORIGIN/julia on libjulia now. This is the basic requirement of no functional regression, without even talking about breakage. As I said, I'm totally fine with adding an API that is a no-op on Linux, where things work correctly. That will break the embedding API but doesn't raise the requirements on embedding users (i.e. they need to call more functions but don't need to acquire more information than before).

Also, I don't see why libraries linked to libjulia have to have a complicated path. They can be copied/linked to a simpler directory at build time. (i.e. I don't see why

> the design constraints are that we need a cross-platform way to load binaries that are located at an arbitrary location that is constant relative to the installation directory of Julia. In this case

is a design constraint)

staticfloat · 2020-09-28T17:42:44Z

> Well, anything that does not require the user to specify additional path on the API, i.e. equivalent to specifying $ORIGIN/julia on libjulia now. This is the basic requirement of no functional regression without even talking about breakage. As I said, I'm totally fine with adding an API that is no-op on linux where things works correctly. That will break embedding API but isn't raising the requirement on embedding users (i.e. need to call more functions but no need to acquire more information than before).

One of the fundamental design patterns I want to avoid is platform differences. We build these platform abstraction libraries precisely so that users don't have to worry about what platform they're running on.

In the end, I don't see requiring passing in an anchoring path as a large burden, and since I don't see a feasible way around it that will still allow us to give the same guarantees of loading precisely the libraries that we want to, it seems like the best solution still.

> Also I don't see why libraries linked to libjulia has to have a complicated path. It can be copied/linked to a simpler directory at build time.

This is best summarized in this comment. The benefits to this system are:

- All libraries can be serviced by downloading JLLs, and instead of unpacking them into the lib directory of the Julia prefix, they instead get unpacked into an artifacts/<content hash> directory, just like the package manager would do them.
- Access of libraries, even the ones that ship with Julia, will be possible through JLL APIs.
- The resolver will know that certain JLLs are already shipped with Julia, allowing us to express proper version constraints upon the libraries included with Julia.

yuyichao · 2020-09-28T18:17:57Z

> One of the fundamental design patterns I want to avoid is platform differences. We build these platform abstraction libraries precisely so that users don't have to worry about what platform they're running on.

Yes, that is a good goal, however,

- It's an abstraction for the user, and it should not matter what platform-dependent mechanism is used.
- It must not be prioritized over keeping existing use cases working. That's exactly why I mentioned "Julia 1.4 fails on startup (AMD Phenom on Linux)" #35215. The series of changes there first put the "modernness" of the setup over performance and a change of compilation environment, and the follow-up change put performance over breaking code for users. That was a complete priority inversion.

> In the end, I don't see requiring passing in an anchoring path as a large burden, and since I don't see a feasible way around it that will still allow us to give the same guarantees of loading precisely the libraries that we want to, it seems like the best solution still.

I did give concrete arguments about this. Since that's what you ask for, please address how each use case can be solved specifically, rather than just saying you don't see something as a large burden because it isn't a problem for the julia executable. As I said, the binary may or may not exist, and the user may or may not be able to find the correct one. One more thing is that code like

int main()
{
    julia_init(...);
    /* call many jl_* functions */
}

is virtually impossible to port without a very major rewrite in which the user has to match how libjulia is loaded, leaking the abstraction. The symbols are not going to be available to the compile-time linker anymore.
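
To make the porting cost concrete: once a library is opened at runtime instead of being linked, every entry point has to be looked up by name and given its signature by hand. The sketch below shows that pattern with ctypes, using libc's `strlen` as a stand-in for the `jl_*` functions, since libjulia is not assumed to be available. It assumes a POSIX system, where `ctypes.CDLL(None)` returns a handle to the running process.

```python
import ctypes

# Assumption: a POSIX system, where CDLL(None) gives a handle to the running
# process and its already-loaded libraries (libc stands in for libjulia here).
process = ctypes.CDLL(None)

# Nothing was resolved at link time: each entry point is looked up by name
# and its C signature has to be declared manually.
strlen = process.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"julia")
print(length)
```

An embedding program with many `jl_*` calls would need the equivalent of these lookup and declaration lines for each function it uses, which is the rewrite being objected to here.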

And since abstracting out the platform is only necessary for the user, using RPATH when it exists is a perfectly good solution.

> All libraries can be serviced by downloading JLLs, and instead of unpacking them into the lib directory of the Julia prefix, they instead get unpacked into an artifacts/ directory, just like the package manager would do them.

Whatever unpacks them should be able to unpack them into either just fine.
A special case is already needed to generate the list in libjulialoader. That logic can simply be used to generate the additional copying rules.

> Access of libraries, even the ones that ship with Julia will be possible through JLL APIs
> The resolver will know that certain JLLs are already shipped with Julia, allowing us to express proper version constraints upon the libraries included with Julia.

The same applies here: the JLL package, especially since the library path doesn't have to be constant anymore, does not have to point to the artifact path. Again, it needs the same set of special rules that already exist.

In other words, these are only "benefits" because you want to put the special logic in a particular place. They are in reality not benefits at all when that logic can move. None of these avoid the logic anyway.

Moreover, the artifact path can itself be encoded in the RPATH, and you don't even have to move the file.

vtjnash · 2022-03-15T20:50:05Z

is this still needed? IIUC, we have done something similar now

tkf · 2022-03-16T03:59:24Z

Yes, it looks like #38160 handles this more automatically.

Add load_libjulia in libjulialoader

da2935d

tkf requested a review from staticfloat September 28, 2020 03:47

yuyichao requested changes Sep 28, 2020

staticfloat approved these changes Sep 28, 2020

tkf closed this Mar 16, 2022
