-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better diagnostic for native .dll loading errors (pass through dlerror) #7737
Comments
i think dll import looks for .so or dylab in the os default location. for example i did not have to provide anything but just the name of libxcb for xwindows on linux and for cairo on linux. the .so libs are in the their default location where the package manager installs by default. |
I believe the code paths involved in loading your dylib should just be using If that doesn't help, please provide a minimal sample and we can help you figure out what's going wrong. |
@adityamandaleeka I looked at the documentation (thanks for that) but I couldn't make it work. According to the documentation, it will look in the current directory or an absolute path. I have tried both but none work. Please find the sample here: https://github.com/alexanderuv/LLVMWrap |
CC @yizhang82 |
@alexanderuv I tried out your sample and I believe the reason it's not able to load libLLVM.dylib is that libLLVM.dylib is also referencing libLTO.dylib, which isn't found. |
This load failure isn't a CoreCLR bug, but it would be great if we could provide a good way for users to be able to diagnose this type of failure without having to debug or use external tools. The error message you saw ( |
Great! Thanks for the help! |
Today you either have to monitor file system access by the program, or have a debug build of CoreCLR with |
I think I have suggested a long time ago to change the error message to "The specified module or one of its dependencies could not be found." That way it would hint the user to verify that the reported .dylib / .so has all the dependencies it needs. On Linux, "ldd -r" is the best way. On OSX, the "otool -L" command should work. |
I don't think the runtime should add OS specific diagnose support for dll loading. How about adding a fwlink that points to a doc that shows the tips/techniques/tools to diagnose such issues? The tools mentioned here (procmon, windbg+loader snap, ldd, otool) is a good start point. We can easily append that in the error message. |
@janvorli I think that would be much more helpful than what it says right now. Also that is not platform specific so if that's as far as one can go, it would be fine |
Hi, I think that if you can get the error message, you can add it. |
We encountered
When running a program on a newly mounted disk. The application worked fine on the main disk, but it would fail as above when the application was copied to the mounted disk. Using
From reading the ‘man mmap’ pages:
Which was the problem - the disk was mounted no-exec. Remounting it with the exec option allowed the application to start running again. Better diagnostics info in the exception would have helped speed up this investigation and narrow the problem sooner. |
@jkotas @janvorli short of literally running tools like strace, ldd etc for the user is there more information available to us that we can provide? @adityamandaleeka mentions WRN messages in debug builds -- are these the ones he has in mind? If so could we usefully include the
|
Perhaps there are other places we could pass
|
Yes. |
Looks like it would need a little wiring as currently errors are passed back through GetLastError, ie DWORD
It might also be useful to have LoadLibErrorTracker include all errors, not just what it thinks is the highest priority |
I guess I add something like this that wraps dlerror() PALIMPORT
BOOL
PALAPI
PAL_GetLoadLibraryError(
OUT LPWSTR lpBuffer,
IN DWORD nSize
) then have this code call it instead of GetLastError() HMODULE hmod = PAL_LoadLibraryDirect(name);
if (hmod == NULL)
{
pErrorTracker->TrackErrorCode(GetLastError());
} then update LoadLibErrorTracker to accept and throw a string. |
@danmosemsft I'm trying to get a handle on what needs to be done here - were you seeing that it would be sufficient to create the |
PAL_GetLoadLibraryError can just go in module.cpp or somewhere relevant. My reading of the doc for dlerror(), https://linux.die.net/man/3/dlerror , calling dlerror() clears it out and also subsequent successful use of dlopen() does not clear it out. So what I suggest is storing a string in the PAL containing the last dlerror. There would be an internal function like ClearLoadLibraryError() to clear it. PAL_GetLoadLibraryError() would return any existing value, or if it was empty, call dlerror() to get and store the value then return that. Then modifying all places in PAL that call dlopen(), dlsym() or dlclose(), to add ClearLoadLibraryError() before the call. Then change If TrackErrorCode() called GetLoadLibraryError() then it should should store the string from that and put that string into the DllNotFoundException when it throws it. @jkotas does this sound like a reasonable plan? Some future change could possibly do something similar for EntrypointNotFoundException to include any error from dlsym. But this bug is about DllNotFoundException. |
Why wouldn't PAL_GetLoadLibraryError just call dlerror()? We do not need to be in the business of caching it.
This just needs to be Otherwise sounds good to me. |
To be consistent with GetLastError, which you can call repeatedly. The man page says that dlerror() resets when called. I'm happy to not cache it though. |
@danmosemsft at https://github.com/dotnet/coreclr/blob/master/src/vm/dllimport.cpp#L5730, if we replace dwLastError with the result of PAL_GetLoadLibraryError on non-windows, then won't that switch statement not work? I.E. it's only set up to understand an int result. Or should we call GetLastError on every OS, but only (additionally) call PAL_GetLoadLibraryError on non-windows? |
@wtgodbe in the Unix path, I do not think you need this "priority" scheme in that class. I would just simply store the string and return. When you are done with the work you could verify that TrackErrorCode() is not getting called more than once. |
@danmosemsft what do you suggest I do about https://github.com/dotnet/coreclr/blob/master/src/vm/dllimport.cpp#L5819-L5823 ? I could add a priority value for ERROR_INVALID_PARAMETER? Or should I assume that case falls under https://github.com/dotnet/coreclr/blob/master/src/vm/dllimport.cpp#L5745-L5748? |
Note it's != ERROR_INVALID_PARAMETER, so that will not call TrackErrorCode..? If your question is about the Linux case, I think we're assuming that priority isn't relevant there and there will be just one call to TRackErrorCode. |
I meant that we need to eliminate calls to GetLastError() outside of TrackErrorCode(), so the call at https://github.com/dotnet/coreclr/blob/master/src/vm/dllimport.cpp#L5819 needs to be removed, meaning we won't have the value of dwLastError to check against ERROR_INVALID_PARAMETER |
Ah right. It should be OK to call GetLastError() both there and again inside TrackErrorCode(). It should not reset. |
Resolved by dotnet/coreclr#15831
|
Hi
I am trying to write a .NET Core wrapper for the LLVM C Frontend. However I am having trouble with the runtime finding the dylib.
I've tried the project paths, the bin/debug/target/ path and nothing. Any clues about what I might be missing? or is this a bug?
The text was updated successfully, but these errors were encountered: