Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove pointless rpaths on macOS; make sage-env polite when run on systems with no toolchain #37886

Merged
merged 17 commits into from
May 25, 2024

Conversation

culler
Copy link
Contributor

@culler culler commented Apr 27, 2024

This PR does the following (after splitting out the Python upgrade to #37914):

  1. Fix a comment in the Makefile about how to use the DEBUG flag so it will actually work.
  2. Edit sage-env so that it does not print errors to stderr and/or open a dialog whenever it is run on a system which has neither XCode nor command line tools installed.
  3. Remove (almost) all uses of rpath on macOS (see below).
  4. Fix the primecountpy spkg, which was the only one which actually needed an rpath in its python extension module.

About rpaths on macOS
Sage has been misusing rpaths on macOS for a very long time. As far as I can tell, this was an error based on the incorrect assumption that MACH binaries use rpath in the same way as ELF binaries.

The rpath in an ELF binary provides a search path, beyond the default search path, where the loader looks for shared libraries needed by the binary.

A MACH binary specifies the dynamic libraries it needs with load commands of type LC_LOAD_DYLIB. The value of a LC_LOAD_DYLIB load command is a path to a specific library. There is one such load command for each library that the binary needs. The path is an absolute path by default. However, the path is allowed to begin with the substring @rpath. When the binary is loaded that substring is replaced by the value of each LC_RPATH load command in the binary to produce a list of absolute paths (or relative paths if the value of the LC_RPATH begins with @loader_path). Each of the paths on that list is tried as a possible location of the needed dynamic library.

An obvious consequence of this design is that if none of the LC_LOAD_DYLIB paths contains the string @rpath then there is no need for any LC_RPATH commands, since they won't be used. With one exception (primecountpy) every MACH binary produced when building Sage had absolute paths for its LC_LOAD_DYLIB load commands. Nonetheless Sage was setting the rpath to $SAGE_LOCAL/lib in all of these binaries. In fact, it was doing that multiple times, producing many duplicate rpaths all of which were useless. (And duplicate rpaths are now deprecated by Apple's linker.)

This PR removes all of those bogus rpaths.

About xcode-select
The sage-env script was calling gcc to find the names of its linker and archiver in the unlikely case that either of those had a non-standard name, and it was also calling xcode-select as part of a check to see whether the selected XCode toolchain had an ld-classic executable. That executable was added in XCode 15 as a new name for the old linker when a new linker was added. Apple made it possible to use the old linker by passing the linker option -ld_classic to the gcc command. The new linker was buggy in the beginning and was unable to link openblas. So numpy and scipy use the old linker. Sage also wanted to use it. While the new linker is probably working well enough now as a linker, it generates a warning when it encounters duplicate libraries in a link command. It also appears that the new linker, or XCode 15 itself, somehow creates duplicate libraries. While the warnings should be innocuous, they were causing Sage doctest failures. So this PR provides code which adds -Wl,-ld_classic to LDFLAGS in sage-env to replace the analogous code which was already there. But the new code is careful to redirect the error messages to /dev/mull so they do not get printed on the users terminal whenever sage starts up on a machine which does not have any XCode toolchain at all. It also avoids calling gcc in that case since doing so posts a dialog on the user's screen asking if they would like to install command line tools. The dialog is a nice touch, but should only appear if the user is actually using gcc, not as part of every invocation of Sage, especially since XCode uses the standard names.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

Copy link

github-actions bot commented Apr 28, 2024

Documentation preview for this PR (built with commit 3f1894e; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

src/bin/sage-env Outdated Show resolved Hide resolved
src/bin/sage-env Outdated Show resolved Hide resolved
@mkoeppe
Copy link
Contributor

mkoeppe commented Apr 28, 2024

NOTE: the underscore after @ above should be ignored. It was added to avoid a link to a github user page

Note you could also just type `@rpath`

@mkoeppe
Copy link
Contributor

mkoeppe commented Apr 28, 2024

I'm skeptical that only primecountpy would need the rpath on macOS, but I haven't read the explanation in the PR description very carefully yet.

@culler
Copy link
Contributor Author

culler commented Apr 28, 2024

You are wise to be skeptical, but that was the only one that turned up when I ran the tests. Perhaps there are optional spkgs that need it. I could search for those, since the macOS app includes all optional spkgs that I have been able to build, including some that I needed to patch. But that would take some work.

As explained in the PR, normally an absolute path is used in the LC_LOAD_DYLIB command. I think primecountpy is going out of its way to use an rpath instead.

@culler
Copy link
Contributor Author

culler commented Apr 28, 2024

Any package which has been run through delocate will use @_rpath. But I don't think that those packages would be influenced by sage-env.

@mkoeppe
Copy link
Contributor

mkoeppe commented Apr 28, 2024

I don't think there's any special code in primecountpy - https://github.com/dimpase/primecountpy/blob/master/setup.py
If there's anything special, it would have to come from primecount, which is built using cmake.

(Also note we don't use delocate in our build process. It only makes an appearance in the tests for meson-python, and in make list-broken-packages.)

@culler
Copy link
Contributor Author

culler commented Apr 28, 2024

Yes, you are right. I found another one: libsymengine.

@mkoeppe
Copy link
Contributor

mkoeppe commented Apr 28, 2024

Would it be OK with you if I cherry-pick the python3 upgrade (+ patch removal) onto a new PR?

@culler
Copy link
Contributor Author

culler commented Apr 28, 2024

That was easier than I thought. It looks like there are 4 others:
(These were found by searching through the 10.3 build that I used for the macOS app -- I will redo it for 10.4 and see what turns up.)

grep '@rpath' `find . -name '*.so'`
Binary file ./site-packages/symengine/lib/symengine_wrapper.cpython-311-darwin.so matches
Binary file ./site-packages/rpds/rpds.cpython-311-darwin.so matches
Binary file ./site-packages/primecountpy/primecount.cpython-311-darwin.so matches
Binary file ./site-packages/sage/libs/ecl.cpython-311-darwin.so matches
Binary file ./site-packages/sage/graphs/bliss.cpython-311-darwin.so matches

I'm surprised that I didn't see test failures for those, but who knows?

By the way, ecl.cpython-311-darwin.so provides a good example of how sage has been doing things in the past:

LC_LOAD_DYLIB: @rpath/libecl.23.9.dylib
LC_LOAD_DYLIB: /private/var/tmp/sage-10.3-current/local/lib/libgmp.10.dylib
LC_LOAD_DYLIB: /usr/lib/libSystem.B.dylib
LC_LOAD_DYLIB: /private/var/tmp/sage-10.3-current/local/lib/libgc.1.dylib
LC_LOAD_DYLIB: /private/var/tmp/sage-10.3-current/local/lib/libpari-gmp-tls.dylib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/var/lib/sage/venv-python3.11.8/lib
LC_RPATH: .
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib
LC_RPATH: /private/var/tmp/sage-10.3-current/local/lib

It has one ridiculous rpath which is '.' (meaningless to the loader) and 7 copies of the one rpath it needs (remember duplicate rpaths are now deprecated by Apple) plus one other irrelevant rpath.

@culler
Copy link
Contributor Author

culler commented Apr 28, 2024

In my build of 10.4.beta4 based on this PR (with no optional packages) the only python extension modules which legitimately use @rpath are ecl.cpython-311-darwin.so primecount.cpython-311-darwin.so. But I should also fix symengine_py by adding the appropriate rpath options to its LDFLAGS as I did with primecountpy.

The rpds package was a red herring. It only uses @rpath in its LC_ID_DYLIB load command, which is not used by the loader and makes no difference anyway. Also, rpds is only installed as a dependency of jupyter and hence does not need Sage's LDFLAGS.

@culler
Copy link
Contributor Author

culler commented Apr 28, 2024

Would it be OK with you if I cherry-pick the python3 upgrade (+ patch removal) onto a new PR?

Sure. Have at it.

@culler
Copy link
Contributor Author

culler commented Apr 29, 2024

I found out why the ecl python extension did not cause tests to fail even though it has a LC_LOAD_DYLIB load command containing @rpath. The reason is that it has the correct LC_RPATH. In fact, it has two of them.:

LC_LOAD_DYLIB: @rpath/libecl.23.9.dylib
LC_LOAD_DYLIB: /Users/culler/sage_pr/sage/local/lib/libgc.1.dylib
LC_LOAD_DYLIB: /usr/lib/libSystem.B.dylib
LC_LOAD_DYLIB: /Users/culler/sage_pr/sage/local/lib/libgmp.10.dylib
LC_LOAD_DYLIB: /Users/culler/sage_pr/sage/local/lib/libpari-gmp-tls.dylib
LC_RPATH: /Users/culler/sage_pr/sage/local/lib
LC_RPATH: /Users/culler/sage_pr/sage/local/lib

That might help explain why it had the record-setting 7 identical rpaths before this PR. I don't know where those two come from, but I guess it is not a problem for this PR. It may become a problem when duplicate rpaths become errors, instead of producing deprecation warnings.

So that leave just the two optional packages symengine_py and bliss that will need to have rpaths set as wias already done with primecountpy.

@culler
Copy link
Contributor Author

culler commented Apr 29, 2024

I have fixed the bliss and symengine packages by adding rpath commands to LDFLAGS. Both packages load without errors.

I think this PR is now ready to go.

@dimpase
Copy link
Member

dimpase commented Apr 29, 2024

Are you saying that primecountpy needed a fix of some sort? IMHO it works without this addition.

@culler
Copy link
Contributor Author

culler commented Apr 29, 2024 via email

@dimpase
Copy link
Member

dimpase commented Apr 29, 2024

this is strange. primecountpy is a straight PyPI package. No woodoo is needed, just pip-install it.

I am under impression that anything path-specific for primecountpy is done at its build-time, and is all baked in there.

@culler
Copy link
Contributor Author

culler commented Apr 29, 2024 via email

if [ "$UNAME" = "Darwin" ] && [ -z "$LD" ] && [ -x "$(xcode-select -p)/usr/bin/ld-classic" -o -x "$(xcode-select -p)/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld-classic" ] ; then
LDFLAGS="-L$SAGE_LOCAL/lib -Wl,-ld_classic,-rpath,$SAGE_LOCAL/lib $LDFLAGS"
else
LDFLAGS="-L$SAGE_LOCAL/lib -Wl,-rpath,$SAGE_LOCAL/lib $LDFLAGS"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replacing -rpath...

Copy link
Contributor Author

@culler culler May 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained above, the rpath option is different for MACH binaries than for ELF binaries and there is no purpose in setting an rpath for a MACH binary if it does not have the @rpath string in any of its load paths. This option causes Sage to build MACH binaries with multiple identical unused LC_RPATH load commands. And duplicate LC_RPATH load commands are deprecated by Apple's linker.

Would you please explain why you are adding that rpath option back?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading it right, the change at #37982 proposes restoring -rpath in the Linux case, not the Darwin case: essentially restoring line 386. Better to review #37982 than continue the discussion here, though.

Copy link
Contributor Author

@culler culler May 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. I thought the suggestion proposed removing the entire red passage, which leaves the current rpaths for linux unchanged but removes the rpaths for Darwin.

I didn't really understand how this PR, which has not been merged, could have caused breakage that another PR would need to correct. And certainly this PR, which makes no changes to linux, could not cause a linux breakage even if it had been merged. Maybe the cross-reference to this PR was a typo.

In any case, I will be happy as long as the global rpath option is removed for macOS (and macOS only) and replaced by adding local rpath options in the few packages that actually need them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I see this on GitHub, I see vbraun, as part of his review process, quoting a change that you made. He is not proposing any changes here. It looks like your original change to sage-env removed an entire "if ... else" block (the old lines 383-386), and he is saying that the "else" part needs to be kept when using Linux: your changes remove these lines:

    # However, if LD is set explicitly, as it is within conda on macOS,
    # do not not do this.
    if [ "$UNAME" = "Darwin" ] && [ -z "$LD" ] && [ -x "$(xcode-select -p)/usr/bin/ld-classic" -o -x "$(xcode-select -p)/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld-classic" ] ; then
        LDFLAGS="-L$SAGE_LOCAL/lib -Wl,-ld_classic,-rpath,$SAGE_LOCAL/lib $LDFLAGS"
    else
        LDFLAGS="-L$SAGE_LOCAL/lib -Wl,-rpath,$SAGE_LOCAL/lib $LDFLAGS"

and he essentially wants to restore the last line when using Linux.

As far as how it could cause breakage, my guess is that since it had a positive review, he tried to merge it for the next release and saw that broke things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, John. That makes sense. I am certainly capable of having made a mistake like that. I will look for the mistake and commit a fixed version of sage-env.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that I definitely did screw up the logic in sage-env. I committed a fix in which I attempted to separate the linux and Darwin LDFLAGS constructions extremely clearly, so the next person who looks at that file won't get confused the way that I did. I also found that an rpath was needed in the primecount spkg in addition to the one which had already added to primecountpy.

I then built the PR on Ubuntu 22.04 and macOS 14.1.2. The linux build finished cleanly. The macOS build almost worked but failed near the end on the fflas_ffpack-2.5.0 spkg with a linker error reporting one undefined symbol:

[spkg-install] Undefined symbols for architecture x86_64:
[spkg-install]   "Givaro::Integer::operator%(unsigned long long) const", referenced from:
[spkg-install]       Givaro::Modular<double, double, void>::init(double&, Givaro::Integer const&) const in fflas_lvl1.o
[spkg-install]       Givaro::Modular<double, double, void>::init(double&, Givaro::Integer const&) const in fflas_lvl2.o
[spkg-install]       Givaro::Modular<double, double, void>::init(double&, Givaro::Integer const&) const in fflas_lvl3.o

I don't see how that error could be related to this PR, so I think it is ready to go (again).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing givaro symbols is #38002

vbraun pushed a commit to vbraun/sage that referenced this pull request May 12, 2024
…in patches, remove pointless rpaths on macOS

    
<!-- ^ Please provide a concise and informative title. -->
<!-- ^ Don't put issue numbers in the title, do this in the PR
description below. -->
<!-- ^ For example, instead of "Fixes sagemath#12345" use "Introduce new method
to calculate 1 + 2". -->
<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->

Author: @culler

Split out from:
- sagemath#37886

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [ ] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37914
Reported by: Matthias Köppe
Reviewer(s):
vbraun pushed a commit to vbraun/sage that referenced this pull request May 12, 2024
…in patches, remove pointless rpaths on macOS

    
<!-- ^ Please provide a concise and informative title. -->
<!-- ^ Don't put issue numbers in the title, do this in the PR
description below. -->
<!-- ^ For example, instead of "Fixes sagemath#12345" use "Introduce new method
to calculate 1 + 2". -->
<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->

Author: @culler

Split out from:
- sagemath#37886

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [ ] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37914
Reported by: Matthias Köppe
Reviewer(s):
@mkoeppe
Copy link
Contributor

mkoeppe commented May 12, 2024

@culler Rebased as requested.

@mkoeppe mkoeppe changed the title Update python; remove pointless rpaths on macOS; make sage-env polite when run on systems with no toolchain Remove pointless rpaths on macOS; make sage-env polite when run on systems with no toolchain May 12, 2024
@culler
Copy link
Contributor Author

culler commented May 12, 2024

@culler Rebased as requested.

Thanks!!!

@culler
Copy link
Contributor Author

culler commented May 13, 2024

Very similar linker errors with fflas_ffpack have been independently reported for 10.4.beta6 on sage-release, confirming that those failures are unrelated to this PR.

Since I have done the needed work, and tested on both linux and macOS, I am going to remove the "needs work" label and restore the "positive review" label so this change can eventually make it into 10.4. If I am not allowed to do that, feel free to undo my changes to the labels.

vbraun pushed a commit to vbraun/sage that referenced this pull request May 15, 2024
…ite when run on systems with no toolchain

    
<!-- ^ Please provide a concise and informative title. -->

This PR does the following (after splitting out the Python upgrade to
[sagemath#37914](sagemath#37914)):
1. Fix a comment in the Makefile about how to use the DEBUG flag so it
will actually work.
3. Edit sage-env so that it does not print errors to stderr and/or open
a dialog whenever it is run on a system which has neither XCode nor
command line tools installed.
4. Remove (almost) all uses of rpath on macOS (see below).
5. Fix the primecountpy spkg, which was the only one which actually
needed an rpath in its python extension module.

**About rpaths on macOS**
Sage has been misusing rpaths on macOS for a very long time.  As far as
I can tell, this was an error based on the incorrect assumption that
MACH binaries use rpath in the same way as ELF binaries.

The rpath in an ELF binary provides a search path, beyond the default
search path, where the loader looks for shared libraries needed by the
binary.

A MACH binary specifies the dynamic libraries it needs with load
commands of type LC_LOAD_DYLIB.  The value of a LC_LOAD_DYLIB load
command is a path to a specific library.  There is one such load command
for each library that the binary needs.  The path is an absolute path by
default.  However, the path is allowed to begin with the substring
`@rpath`.  When the binary is loaded that substring is replaced by the
value of each LC_RPATH load command in the binary to produce a list of
absolute paths (or relative paths if the value of the LC_RPATH begins
with `@loader_path`).  Each of the paths on that list is tried as a
possible location of the needed dynamic library.

An obvious consequence of this design is that if none of the
LC_LOAD_DYLIB paths contains the string `@rpath` then there is no need
for any LC_RPATH commands, since they won't be used.  With one exception
(primecountpy) every MACH binary produced when building Sage had
absolute paths for its LC_LOAD_DYLIB load commands.  Nonetheless Sage
was setting the rpath to $SAGE_LOCAL/lib in all of these binaries.  In
fact, it was doing that multiple times, producing many duplicate rpaths
all of which were useless.  (And duplicate rpaths are now deprecated by
Apple's linker.)

This PR removes all of those bogus rpaths.

**About xcode-select**
The sage-env script was calling gcc to find the names of its linker and
archiver in the unlikely case that either of those had a non-standard
name, and it was also calling xcode-select as part of a check to see
whether the selected XCode toolchain had an ld-classic executable.  That
executable was added in XCode 15 as a new name for the old linker when a
new linker was added.  Apple made it possible to use the old linker by
passing the linker option -ld_classic to the gcc command.  The new
linker was buggy in the beginning and was unable to link openblas.  So
numpy and scipy use the old linker.  Sage also wanted to use it.  While
the new linker is probably working well enough now as a linker, it
generates a warning when it encounters duplicate libraries in a link
command.  It also appears that the new linker, or XCode 15 itself,
somehow *creates* duplicate libraries.  While the warnings should be
innocuous, they were causing Sage doctest failures.  So this PR provides
code which adds -Wl,-ld_classic to LDFLAGS in sage-env to replace the
analogous code which was already there.  But the new code is careful to
redirect the error messages to /dev/mull so they do not get printed on
the users terminal whenever sage starts up on a machine which does not
have any XCode toolchain at all.  It also avoids calling gcc in that
case since doing so posts a dialog on the user's screen asking if they
would like to install command line tools.  The dialog is a nice touch,
but should only appear if the user is actually *using* gcc, not as part
of every invocation of Sage, especially since XCode uses the standard
names.

<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->



### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [ ] The title is concise and informative.
- [ ] The description explains in detail what this PR is about.
- [ ] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37886
Reported by: Marc Culler
Reviewer(s): John H. Palmieri, Marc Culler, Matthias Köppe, Volker Braun
vbraun pushed a commit to vbraun/sage that referenced this pull request May 18, 2024
…ite when run on systems with no toolchain

    
<!-- ^ Please provide a concise and informative title. -->

This PR does the following (after splitting out the Python upgrade to
[sagemath#37914](sagemath#37914)):
1. Fix a comment in the Makefile about how to use the DEBUG flag so it
will actually work.
3. Edit sage-env so that it does not print errors to stderr and/or open
a dialog whenever it is run on a system which has neither XCode nor
command line tools installed.
4. Remove (almost) all uses of rpath on macOS (see below).
5. Fix the primecountpy spkg, which was the only one which actually
needed an rpath in its python extension module.

**About rpaths on macOS**
Sage has been misusing rpaths on macOS for a very long time.  As far as
I can tell, this was an error based on the incorrect assumption that
MACH binaries use rpath in the same way as ELF binaries.

The rpath in an ELF binary provides a search path, beyond the default
search path, where the loader looks for shared libraries needed by the
binary.

A MACH binary specifies the dynamic libraries it needs with load
commands of type LC_LOAD_DYLIB.  The value of a LC_LOAD_DYLIB load
command is a path to a specific library.  There is one such load command
for each library that the binary needs.  The path is an absolute path by
default.  However, the path is allowed to begin with the substring
`@rpath`.  When the binary is loaded that substring is replaced by the
value of each LC_RPATH load command in the binary to produce a list of
absolute paths (or relative paths if the value of the LC_RPATH begins
with `@loader_path`).  Each of the paths on that list is tried as a
possible location of the needed dynamic library.

An obvious consequence of this design is that if none of the
LC_LOAD_DYLIB paths contains the string `@rpath` then there is no need
for any LC_RPATH commands, since they won't be used.  With one exception
(primecountpy) every MACH binary produced when building Sage had
absolute paths for its LC_LOAD_DYLIB load commands.  Nonetheless Sage
was setting the rpath to $SAGE_LOCAL/lib in all of these binaries.  In
fact, it was doing that multiple times, producing many duplicate rpaths
all of which were useless.  (And duplicate rpaths are now deprecated by
Apple's linker.)

This PR removes all of those bogus rpaths.

**About xcode-select**
The sage-env script was calling gcc to find the names of its linker and
archiver in the unlikely case that either of those had a non-standard
name, and it was also calling xcode-select as part of a check to see
whether the selected XCode toolchain had an ld-classic executable.  That
executable was added in XCode 15 as a new name for the old linker when a
new linker was added.  Apple made it possible to use the old linker by
passing the linker option -ld_classic to the gcc command.  The new
linker was buggy in the beginning and was unable to link openblas.  So
numpy and scipy use the old linker.  Sage also wanted to use it.  While
the new linker is probably working well enough now as a linker, it
generates a warning when it encounters duplicate libraries in a link
command.  It also appears that the new linker, or XCode 15 itself,
somehow *creates* duplicate libraries.  While the warnings should be
innocuous, they were causing Sage doctest failures.  So this PR provides
code which adds -Wl,-ld_classic to LDFLAGS in sage-env to replace the
analogous code which was already there.  But the new code is careful to
redirect the error messages to /dev/mull so they do not get printed on
the users terminal whenever sage starts up on a machine which does not
have any XCode toolchain at all.  It also avoids calling gcc in that
case since doing so posts a dialog on the user's screen asking if they
would like to install command line tools.  The dialog is a nice touch,
but should only appear if the user is actually *using* gcc, not as part
of every invocation of Sage, especially since XCode uses the standard
names.

<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->



### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [ ] The title is concise and informative.
- [ ] The description explains in detail what this PR is about.
- [ ] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37886
Reported by: Marc Culler
Reviewer(s): John H. Palmieri, Marc Culler, Matthias Köppe, Volker Braun
vbraun pushed a commit to vbraun/sage that referenced this pull request May 18, 2024
…ite when run on systems with no toolchain

    
<!-- ^ Please provide a concise and informative title. -->

This PR does the following (after splitting out the Python upgrade to
[sagemath#37914](sagemath#37914)):
1. Fix a comment in the Makefile about how to use the DEBUG flag so it
will actually work.
3. Edit sage-env so that it does not print errors to stderr and/or open
a dialog whenever it is run on a system which has neither XCode nor
command line tools installed.
4. Remove (almost) all uses of rpath on macOS (see below).
5. Fix the primecountpy spkg, which was the only one which actually
needed an rpath in its python extension module.

**About rpaths on macOS**
Sage has been misusing rpaths on macOS for a very long time.  As far as
I can tell, this was an error based on the incorrect assumption that
MACH binaries use rpath in the same way as ELF binaries.

The rpath in an ELF binary provides a search path, beyond the default
search path, where the loader looks for shared libraries needed by the
binary.

A MACH binary specifies the dynamic libraries it needs with load
commands of type LC_LOAD_DYLIB.  The value of a LC_LOAD_DYLIB load
command is a path to a specific library.  There is one such load command
for each library that the binary needs.  The path is an absolute path by
default.  However, the path is allowed to begin with the substring
`@rpath`.  When the binary is loaded that substring is replaced by the
value of each LC_RPATH load command in the binary to produce a list of
absolute paths (or relative paths if the value of the LC_RPATH begins
with `@loader_path`).  Each of the paths on that list is tried as a
possible location of the needed dynamic library.

An obvious consequence of this design is that if none of the
LC_LOAD_DYLIB paths contains the string `@rpath` then there is no need
for any LC_RPATH commands, since they won't be used.  With one exception
(primecountpy) every MACH binary produced when building Sage had
absolute paths for its LC_LOAD_DYLIB load commands.  Nonetheless Sage
was setting the rpath to $SAGE_LOCAL/lib in all of these binaries.  In
fact, it was doing that multiple times, producing many duplicate rpaths
all of which were useless.  (And duplicate rpaths are now deprecated by
Apple's linker.)

This PR removes all of those bogus rpaths.

**About xcode-select**
The sage-env script was calling gcc to find the names of its linker and
archiver in the unlikely case that either of those had a non-standard
name, and it was also calling xcode-select as part of a check to see
whether the selected XCode toolchain had an ld-classic executable.  That
executable was added in XCode 15 as a new name for the old linker when a
new linker was added.  Apple made it possible to use the old linker by
passing the linker option -ld_classic to the gcc command.  The new
linker was buggy in the beginning and was unable to link openblas.  So
numpy and scipy use the old linker.  Sage also wanted to use it.  While
the new linker is probably working well enough now as a linker, it
generates a warning when it encounters duplicate libraries in a link
command.  It also appears that the new linker, or XCode 15 itself,
somehow *creates* duplicate libraries.  While the warnings should be
innocuous, they were causing Sage doctest failures.  So this PR provides
code which adds -Wl,-ld_classic to LDFLAGS in sage-env to replace the
analogous code which was already there.  But the new code is careful to
redirect the error messages to /dev/mull so they do not get printed on
the users terminal whenever sage starts up on a machine which does not
have any XCode toolchain at all.  It also avoids calling gcc in that
case since doing so posts a dialog on the user's screen asking if they
would like to install command line tools.  The dialog is a nice touch,
but should only appear if the user is actually *using* gcc, not as part
of every invocation of Sage, especially since XCode uses the standard
names.

<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->



### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [ ] The title is concise and informative.
- [ ] The description explains in detail what this PR is about.
- [ ] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37886
Reported by: Marc Culler
Reviewer(s): John H. Palmieri, Marc Culler, Matthias Köppe, Volker Braun
@vbraun vbraun merged commit c3719ba into sagemath:develop May 25, 2024
15 of 37 checks passed
@mkoeppe mkoeppe added this to the sage-10.4 milestone May 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants