Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document / detect required compiler versions per configuration #17

Closed
tkelman opened this issue Jul 17, 2014 · 20 comments
Closed

Document / detect required compiler versions per configuration #17

tkelman opened this issue Jul 17, 2014 · 20 comments

Comments

@tkelman
Copy link
Contributor

tkelman commented Jul 17, 2014

Title was: Build failure on piledriver

BLIS commit a41e68e
OS: Ubuntu 12.04 (on Travis CI)
CPU: AMD Opteron 6376
Compiler: GCC 4.6.3

Short version of the error message is:

Compiling frame/0/setsc/bli_setsc_check.c
frame/0/getsc/bli_getsc_check.c:1:0: error: bad value (bdver2) for -march= switch

(repeated several times for other files in frame/0)

Full log is at https://travis-ci.org/tkelman/BLIS.jl/jobs/30217818

@Maratyszcza
Copy link
Contributor

-march=bdver2 is supported starting with gcc 4.7. For gcc 4.6 the best
alternative is likely -march=bdver1 -mfma

@tkelman
Copy link
Contributor Author

tkelman commented Jul 17, 2014

Ah yes, that makes sense. Guess this is just a documentation issue, or possibly the build system could be improved to probe for whether these flags are functional during configure.

@jeffhammond
Copy link
Member

jeffhammond commented Jul 17, 2014

This is something that could be readily addressed with autotools ;-)

@tkelman tkelman changed the title Build failure on piledriver Document / detect required compiler versions per configuration Jul 17, 2014
@tkelman
Copy link
Contributor Author

tkelman commented Jul 17, 2014

Using GCC 4.8.1 from a PPA fixed the build failure, but gives a ** On entry to SGEMV , parameter number 6 had an illegal value error when running tests (my Julia binding tests, BLIS' own tests run okay). Unfortunately I don't have access to a Piledriver machine locally, but will see if I can come up with a standalone repro case.

@fgvanzee
Copy link
Member

This is something that could be readily addressed with autotools ;-)

What I think you really mean is autoconf in particular (automake and libtool would not come into play). Even still, this would require a total redesign of BLIS's build system (which is a discussion for another time). When I was using autoconf to build libflame's build system, I didn't find it all that user-friendly. Sure, you could probably write a macro that handles this particular case, but it still requires extensive case handling and quickly becomes hackish. In the spirit of your recent exchange with Carter, I wouldn't have said anything if you had simply said, "this is possible with autoconf/autotools." But readily addressable? That's upper-level management speak! :)

@jeffhammond
Copy link
Member

On Jul 18, 2014, at 10:30 AM, "Field G. Van Zee" notifications@github.com wrote:

This is something that could be readily addressed with autotools ;-)

What I think you really mean is autoconf in particular (automake and libtool would be come into play). Even still, this would require a total redesign of BLIS's build system (which is a discussion for another time). When I was using autoconf to build libflame's build system, I didn't find it all that user-friendly. Sure, you could probably write a macro that handles this particular case, but it still requires extensive case handling and quickly becomes hackish. In the spirit of your recent exchange with Carter, I wouldn't have said anything if you had simply said, "this is possible with autoconf/autotools." But readily addressable? That's upper-level management speak! :)

:-)

I will take a crack at this some time. I've been down the painful path of autotools before. I don't expect you to divert effort for it.

Jeff


Reply to this email directly or view it on GitHub.

@tkelman
Copy link
Contributor Author

tkelman commented Jul 19, 2014

And I should've guessed parameter number 6 to SGEMV means an integer size problem. bli_info_get_int_type_size_str works nicely for that. Not sure how things were able to work with the reference configuration and the wrong integer size. Also interesting to note that Travis' Linux machines are a mix of Bulldozer and Piledriver, with Piledriver apparently more common right now.

@isuruf
Copy link
Contributor

isuruf commented Mar 20, 2019

Need to document clang versions needed as well. (clang 4.0.1 was enough for BLIS 0.5.1, but not for 0.5.2 due to the use of -mno-lwp.)

cc @fgvanzee

@fgvanzee
Copy link
Member

Perhaps it would be sufficient to add blacklisting capabilities to configure, as is currently implemented for gcc? See the check_compiler() function in configure starting around line 1347 for how this is currently done for gcc. I imagine the logic for clang would be much simpler, for now at least.

@isuruf Since I don't use clang, I'm not very familiar with what versions are typically available. How old is 4.0.1? What is the next version of clang that works for 0.5.2 (if you know)?

Another route we might take is to actually filter out certain flags if clang is being used so that older versions still work. The -mno-lwp flag isn't that important, and it only matters for AMD sub-configs, and even then, it's not critical that LWP be disabled (same for TBM and XOP). In fact, if clang doesn't support -mno-lwp, it probably doesn't support LWP at all, in which case the flag is 100% unnecessary.

@devinamatthews
Copy link
Member

Couldn't we just add the -mno-{lwp,tbm,xop} flags to the specific flags for the respective AMD configurations? Then, whatever compiler version is necessary for a particular configuration should also understand those flags.

@fgvanzee
Copy link
Member

Couldn't we just add the -mno-{lwp,tbm,xop} flags to the specific flags for the respective AMD configurations?

Those flags are already part of the make_defs.mk for the AMD subconfigs. @isuruf can correct me if I'm wrong, but I think the question is: do we want this to be the reason that we require a newer version of clang? Or do we try to get by with older versions when those flags are not important? (Notice that I don't know how reasonable or unreasonable it is to require "new" versions of clang since I have so little history with it.)

@devinamatthews
Copy link
Member

Hmmm, I had assumed that if the -march flag being used was capable of triggering the generation of these instructions, then the flags to turn them off would necessarily be recognized by that compiler version. If this is not the case then perhaps it is a limited problem that has a more specific solution? @isuruf can you provide some details on the problem you ran into?

@isuruf
Copy link
Contributor

isuruf commented Mar 24, 2019

Here's the error,

clang-4.0 -O3 -mavx2 -mfpmath=sse -mfma -march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp -Wall -Wno-unused-function -Wfatal-errors -Wno-tautological-compare -fPIC -fvisibility=hidden -std=c99 -D_POSIX_C_SOURCE=200112L -Iinclude/x86_64 -I./frame/3/ -I./frame/ind/ukernels/ -I./frame/1m/ -I./frame/1f/ -I./frame/1/ -I./frame/include -DBLIS_VERSION_STRING=\"0.4.1-230\" -DBLIS_IS_BUILDING_LIBRARY -c kernels/zen/1/bli_amaxv_zen_int.c -o obj/x86_64/kernels/zen/1/bli_amaxv_zen_int.o

Even though the compiler recognizes -march=bdver4 it doesn't recognize -mno-lwp

@isuruf
Copy link
Contributor

isuruf commented Mar 24, 2019

clang-3.9 (Released Sep 2016) works if I remove -mno-lwp. (Blacklists knl, skx)
clang-4.0 (Released March 2017) works if I remove -mno-lwp. (Incorrectly blacklists knl, skx)
clang-5.0 (Released Sep 2017) works. (Incorrectly blacklists knl, skx)
clang-6.0 (Released March 2018) works.

Blacklisting of knl, skx can be fixed with a small patch.

@devinamatthews
Copy link
Member

@fgvanzee Instead of using the compiler version to blacklist things, set flags, etc. maybe we should just do a test compile and check the status. e.g.:

# 
# Check for -mno-lwp
# 

cat > conftest.c <<EOF
<test program>
EOF

if $CC -c -mno-lwp conftest.c; then
    MNO_LWP_FLAG=-mno-lwp
else
    MNO_LWP_FLAG=
fi

Of course then we are reimplementing yet another function of autoconf but that is a conversation for another day...

@fgvanzee
Copy link
Member

@devinamatthews You make a fair point about autoconf. However, I'm not ready to take that plunge yet, mostly because things are basically working as-is and there wouldn't be a lot of immediate payoff.

For now, I think we should patch Isuru's issue and move on. @isuruf Does #305 indirectly fix this issue for you? It wasn't immediately clear to me.

@isuruf
Copy link
Contributor

isuruf commented Mar 25, 2019

@fgvanzee, no. -mno-lwp is still an issue for clang<5.

@fgvanzee
Copy link
Member

What if we simply nixed the flag for the clang branch of the make_defs.mk files affected? i.e. here is the one for zen:

ifeq ($(CC_VENDOR),gcc)
# gcc 6.0 (clang 4.0) or later:
#CKVECFLAGS     := -mavx2 -mfpmath=sse -mfma -march=znver1
# gcc 4.9 (clang 3.5) or later:
# possibly add zen-specific instructions: -mclzero -madx -mrdseed -mmwaitx -msha -mxsavec -mxsaves -mclflushopt -mpopcnt
CKVECFLAGS     := -mavx2 -mfpmath=sse -mfma -march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp
else
ifeq ($(CC_VENDOR),clang)
# We omit -mno-lwp because some versions of clang don't recognize the flag.
CKVECFLAGS     := -mavx2 -mfpmath=sse -mfma -march=bdver4 -mno-fma4 -mno-tbm -mno-xop
else
$(error gcc or clang are required for this configuration.)
endif
endif

@fgvanzee
Copy link
Member

BTW, I suggest this blanket removal of the flag for all versions of clang because I don't think the compiler naturally emits LWP instructions into BLIS. (It's not super-easy to find documentation on those instructions, but they appear to be related to "lightweight profiling".)

@jeffhammond
Copy link
Member

I believe this issue has been addressed by the compiler version blacklisting in configure. Please reopen this issue if necessary.

niyas-sait pushed a commit to niyas-sait/blis that referenced this issue Feb 25, 2022
Add CI for Python 3.8
Aaron-Hutchinson added a commit to sifive/sifive-blis that referenced this issue Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants