Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not require AVX when building with SAGE_FAT_BINARY #32434

Open
vbraun opened this issue Aug 29, 2021 · 24 comments
Open

Do not require AVX when building with SAGE_FAT_BINARY #32434

vbraun opened this issue Aug 29, 2021 · 24 comments

Comments

@vbraun
Copy link
Member

vbraun commented Aug 29, 2021

Setting CFLAGS to -mno-avx -mno-avx2 -mno-bmi2 should help the produced binaries work on older processors.

Move from sagemath/binary-pkg#31

CC: @culler @dimpase @embray @kiwifb @kliem @orlitzky @mkoeppe @slel @kwankyu

Component: build

Author: Samuel Lelièvre

Branch/Commit: u/soehms/adapt_fat_binary_32434 @ 3c61bf2

Reviewer: Sebastian Oehms, ...

Issue created by migration from https://trac.sagemath.org/ticket/32434

@vbraun vbraun added this to the sage-9.5 milestone Aug 29, 2021
@slel
Copy link
Member

slel commented Aug 31, 2021

comment:1

Thanks for opening this ticket. What files need changing?

Ran git grep SAGE_FAT_BINARY
and git grep fat-binary; still clueless.

Probably a subset of these files:

$ git grep SAGE_FAT_BINARY | cut -f 1 -d : | sort | uniq
build/bin/sage-build-env-config.in
build/pkgs/argon2_cffi/spkg-install.in
build/pkgs/atlas/configuration.py
build/pkgs/atlas/patches/atlas-config
build/pkgs/curl/spkg-install.in
build/pkgs/ecm/SPKG.rst
build/pkgs/ecm/spkg-install.in
build/pkgs/fflas_ffpack/spkg-install.in
build/pkgs/gcc/spkg-configure.m4
build/pkgs/gf2x/spkg-install.in
build/pkgs/givaro/spkg-install.in
build/pkgs/gmp/spkg-install.in
build/pkgs/libffi/spkg-install.in
build/pkgs/linbox/spkg-install.in
build/pkgs/m4ri/spkg-install.in
build/pkgs/mpir/spkg-install.in
build/pkgs/ntl/spkg-install.in
build/pkgs/numpy/spkg-install.in
build/pkgs/openblas/spkg-install.in
build/pkgs/primecount/spkg-install.in
build/pkgs/r/spkg-install.in
configure.ac
docker/Dockerfile
src/doc/en/installation/source.rst
src/sage_setup/command/sage_build_ext.py

@slel

This comment has been minimized.

@orlitzky
Copy link
Contributor

comment:2

Warning: we shouldn't add anything to CFLAGS without first checking that the user's compiler supports it.

IMO a better long-term solution is to ensure that all of our SPKGs respect the user's CFLAGS, CXXFLAGS, et cetera, and don't add unnecessary flags to them on-the-fly. All Gentoo packages do that, so you can likely steal patches/workarounds from our package repository or from the sage-on-gentoo overlay.

@slel
Copy link
Member

slel commented Aug 31, 2021

comment:3

Would this change in src/sage_setup/command/sage_build_ext.py
help in any way?

 def check_flags(self):
     """
     Sanity check the compiler flags used to build the extensions
     """
     forbidden = None
     if os.environ.get("SAGE_FAT_BINARY") == "yes":
         # When building with SAGE_FAT_BINARY=yes, we should not
         # enable CPU features which do not exist on every CPU.
         # Such flags usually come from other libraries adding the
         # flags to the pkgconfig configuration.  So if you hit these
         # errors, the problem is most likely with some external
         # library and not with Sage.
         import re
-        forbidden = re.compile(r"-march=|-mpcu=|-msse3|-msse4|-mpopcnt|-mavx")
+        forbidden = re.compile(r"-march=|-mpcu=|-msse3|-msse4|-mpopcnt"
+                               r"|-mavx|-mavx2|-mbmi2")

     if forbidden is not None:
         errors = 0
         for ext in self.extensions:
             flags = ext.extra_compile_args
             for flag in flags:
                 if forbidden.match(flag):
                     log.error("%s uses forbidden flag '%s'", ext.name, flag)
                     errors += 1
         if errors:
             raise RuntimeError("forbidden flags used")

@orlitzky
Copy link
Contributor

orlitzky commented Sep 1, 2021

comment:4

Replying to @slel:

Would this change in src/sage_setup/command/sage_build_ext.py
help in any way?

I think it strips those flags when building the sage library (which is the right thing to do in any case), but probably doesn't solve the issue in the original bug report because it's most likely coming from some SPKG and not the sage library itself.

OpenBLAS is a prime suspect, since 0.3.13 will use certain machine-specific instructions when no TARGET is specified: OpenMathLib/OpenBLAS#3056

On sage-support, Nathan said that the mac app has to set TARGET=CORE2 to support macs from 2013. I think to fix the same problem, we need to decide how old of a computer should be able to run the sage binaries, and set a TARGET appropriately when SAGE_FAT_BINARY is used. (Of course, this is all based on the guess that OpenBLAS is to blame. Can anyone reproduce the problem?) If no one else has an opinion, TARGET=CORE2 sounds fine to me. Building from source is always an option.

@fchapoton
Copy link
Contributor

comment:5

bump to 9.6

@fchapoton fchapoton modified the milestones: sage-9.5, sage-9.6 Jan 29, 2022
@mkoeppe mkoeppe modified the milestones: sage-9.6, sage-9.7 May 3, 2022
@williamstein
Copy link
Contributor

comment:7

I would just like this ticket to get prioritized more highly. It does impact users and is probably easy to fix. That's all.

@mkoeppe
Copy link
Contributor

mkoeppe commented May 17, 2022

comment:8

Won't happen without people who need SAGE_FAT_BINARY doing the work and setting up tests.

@williamstein
Copy link
Contributor

comment:9

It occurs to me that suggestions like use "CFLAGS to -mno-avx -mno-avx2 -mno-bmi2" aren't really in the spirit of "SAGE_FAT_BINARY". They are in the spirit of "make a binary that is very slow but works on a maximum amount of hardware". Maybe we can't have it both ways, but simply disabling AVX, etc. is probably going to result in a much slower Sage binary.

@mkoeppe
Copy link
Contributor

mkoeppe commented May 17, 2022

comment:10

See #32363 "Rename configure option --enable-fat-binary to --enable-portable-binary"

@mkoeppe
Copy link
Contributor

mkoeppe commented May 17, 2022

comment:11

openblas has dynamic_arch, numpy has --cpu-baseline. I'm not aware of a more general mechanism to do magic runtime specialization to arch variants.

@mkoeppe
Copy link
Contributor

mkoeppe commented May 17, 2022

comment:12

Short of compiler/linker support for such a thing, I'd think the best that a binary vendor could do is to make several builds with separate --prefix (SAGE_LOCAL) and then dispatch at runtime. Comes at a cost of longer build times and a few gigabytes of space.

@mkoeppe
Copy link
Contributor

mkoeppe commented May 17, 2022

comment:13

Said vendor could also throw in a "debug" build while at it.

@soehms
Copy link
Member

soehms commented May 30, 2022

comment:14

Replying to @orlitzky:

Replying to @slel:

Would this change in src/sage_setup/command/sage_build_ext.py
help in any way?

I think it strips those flags when building the sage library (which is the right thing to do in any case), but probably doesn't solve the issue in the original bug report because it's most likely coming from some SPKG and not the sage library itself.

I can't say anything about that, but I confirm that Samuel's suggestion fixes an at least related issue that I had on the device I described in this sage-devel post (output of lshw):

     *-cpu
          Beschreibung: CPU
          Produkt: Intel(R) Atom(TM) CPU N455   @ 1.66GHz
          Hersteller: Intel Corp.
          Physische ID: 1d
          Bus-Informationen: cpu@0
          Version: 6.12.10
          Seriennummer: 0001-06CA-0000-0000-0000-0000
          Steckplatz: CPU
          Größe: 1662MHz
          Kapazität: 1666MHz
          Breite: 64 bits
          Takt: 667MHz
          Fähigkeiten: x86-64 fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon pebs bts cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm dtherm cpufreq
          Konfiguration: cores=1 enabledcores=1 id=0 threads=2

Here my observation:

  1. I still could build one of the first beta of 9.6 (beta3 IIRC) on this computer without giving any options to configure.
  2. The build of 9.7.beta0 breaks with SIGILL in sagemath_doc_html-none and on startup even after setting --enable-fat-binary.
  3. The build of 9.7.beta0 runs without any problems after doing the changes suggested by Samuel.

So, my opinion is that we should do this fix since it is surely an improvement. If it really would not help for the original issue then that could be a follow up ticket (or the other way around).

@soehms
Copy link
Member

soehms commented Jun 8, 2022

Branch: u/soehms/adapt_fat_binary_32434

@soehms
Copy link
Member

soehms commented Jun 8, 2022

comment:16

According to my previous comment I push a branch containing Samuel's suggestion.


New commits:

3c61bf232434: initial

@soehms
Copy link
Member

soehms commented Jun 8, 2022

Reviewer: Sebastian Oehms, ...

@soehms
Copy link
Member

soehms commented Jun 8, 2022

Author: Samuel Lelievre

@soehms
Copy link
Member

soehms commented Jun 8, 2022

Commit: 3c61bf2

@soehms
Copy link
Member

soehms commented Jun 8, 2022

Changed author from Samuel Lelievre to Samuel Lelièvre

@mkoeppe
Copy link
Contributor

mkoeppe commented Aug 18, 2022

comment:18

Another affected CPU: see #33671 comment:243

@mkoeppe
Copy link
Contributor

mkoeppe commented Aug 18, 2022

Changed reviewer from Sebastian Oehms, ... to Sebastian Oehms, ...

@mkoeppe
Copy link
Contributor

mkoeppe commented Aug 18, 2022

comment:19

Note that the module that is modified here, sage_setup.command.sage_build_ext, is not used when configure --enable-editable is in use (the default since #32406)

Also, even with configure --disable-editable, it's not clear to me that it would enough to disable these flags in Sage library or whether it needs to be done on the level of the whole distribution. (For example by setting CFLAGS etc. at configure time as the ticket description says.)

@soehms
Copy link
Member

soehms commented Nov 16, 2022

comment:21

FWIW: I made another try on the device I described in comment14 after installing LinuxMint 21 (vanessa) which is build on Ubuntu 22.04 (jammy). I did that on a fresh git clone of Sage 9.8.beta3 merged with the branch of the ticket. On a first step, I wanted to test with --enable-fat-binary but without --disable-editable. Accidentally, I typed --enable-sage-fat-binary instead of --enable-fat-binary. I guess that's the same as giving no option. Unexpectedly, it ends up with a pretty well working Sage.

Furthermore I tried two binary installations of stable 9.7. Using the docker image sagemath/sagemath:9.7 everything works fine, too, while if I use conda I'm getting an illegal instruction error on startup of Sage, as it was according to comment 14 with 9.7.beta0 after building from source under LinuxMint 19.3.

Apart from the latter fact it seems that the issue has been fixed for my device. I have no idea what could have caused that: The upgrade of the OS or changes in Sage?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants