Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

activate LLVM37 #9336

Closed
19 tasks done
vtjnash opened this issue Dec 13, 2014 · 143 comments · Fixed by #14623
Closed
19 tasks done

activate LLVM37 #9336

vtjnash opened this issue Dec 13, 2014 · 143 comments · Fixed by #14623
Labels
needs decision A decision on this change is needed

Comments

@vtjnash
Copy link
Member

vtjnash commented Dec 13, 2014

this is an umbrella issue for tracking known issues with switching to LLVM35 as default. please edit this post or add a comment below. close when someone edits deps/Versions.make to 3.6.0 3.7.0

current known issues:

related:

@Keno

@vtjnash vtjnash added needs decision A decision on this change is needed help wanted Indicates that a maintainer wants help on an issue or pull request fixed in LLVM 3.5 labels Dec 13, 2014
@tkelman
Copy link
Contributor

tkelman commented Dec 13, 2014

Also good to note that 3.5.1 just entered rc phase: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-December/079673.html

[EDIT: Removed checklist as it was moved to the list above - @Keno]

@Keno
Copy link
Member

Keno commented Dec 13, 2014

We won't have backtraces on 3.5, isn't that pretty much a nogo?

@vtjnash
Copy link
Member Author

vtjnash commented Dec 14, 2014

ok, with my new commits, that pretty much covers tkelman's issues

@timholy
Copy link
Member

timholy commented Dec 14, 2014

Really cool

@tkelman
Copy link
Contributor

tkelman commented Dec 14, 2014

I am still getting the numbers test failure on latest master, win64. #7728 (comment)

@vtjnash
Copy link
Member Author

vtjnash commented Dec 14, 2014

that might be platform dependent. also, have you done a full rebuild of the sysimg? (fwiw, the numbers test seems to just hang for me)

@tkelman
Copy link
Contributor

tkelman commented Dec 14, 2014

Was on a sandy bridge, and I did make cleanall beforehand. I'm recompiling LLVM 3.5.0 from scratch now in case our compile flags have changed in some way.

@vtjnash
Copy link
Member Author

vtjnash commented Dec 14, 2014

they haven't (afaics), i just wanted to make sure you weren't pulling in any cached sys.dll code that might have been affected by the llvm copyprop bug

@tkelman
Copy link
Contributor

tkelman commented Dec 25, 2014

I'm going to guess 04893a1 or something similar may have fixed the numbers test failure with LLVM 3.5.0, but I'm getting linalg failures now - https://gist.github.com/tkelman/8c409a7083531765027c

@vtjnash
Copy link
Member Author

vtjnash commented Dec 29, 2014

i messed up the alignment in ca15a28, and some other stuff too. it was probably the cause of that failure. fixed in 8cea71e

@tkelman
Copy link
Contributor

tkelman commented Dec 29, 2014

I was wondering why the numbers test looked broken again, thought I was going a little nuts.

@ihnorton
Copy link
Member

We should probably also take a look at performance metrics before flipping the switch. On my llvm-svn build, building the sys image (touch base/sysimg.jl && time make -j2) takes 4 minutes as compared to 2 minutes on LLVM3.3

@Keno
Copy link
Member

Keno commented Jan 24, 2015

That's because we run all the passes twice ;). But yes, that's a TODO item.

@ihnorton
Copy link
Member

Derp. Well that makes sense then!

@vtjnash
Copy link
Member Author

vtjnash commented Jan 30, 2015

@Keno is that just applicable building the sysimg?

@Keno
Copy link
Member

Keno commented Jan 30, 2015

No, MCJIT does not yet have a way to alter the passes that it runs as far as I'm aware (though I was promised this would be possible in the near future), this means that e.g. our SIMD lowering pass would not be run if we didn't run all the passes ourselves (I suppose we could just set MCJIT's opt level to None, but I think that actually also disabled optimizations in the backend).

@vtjnash
Copy link
Member Author

vtjnash commented Jan 30, 2015

ah, i see that now. It appears we would need to create a TargetMachine wrapper class and overload addPassesToEmitMC in order to set the passes list. Not very direct, but also not very difficult.

@Keno
Copy link
Member

Keno commented Jan 30, 2015

Yes, that would probably work.

@vtjnash
Copy link
Member Author

vtjnash commented Jan 30, 2015

so if someone patches that, and we merge your patches for #7910 into the https://github.com/JuliaLang/llvm branch, i think we might be ready to switch to llvm35

@Keno
Copy link
Member

Keno commented Jan 30, 2015

Sounds right to me.

@tkelman
Copy link
Contributor

tkelman commented Feb 28, 2015

Should we be shooting for 3.6.0 now that it's out? http://llvm.org/releases/

@vtjnash vtjnash changed the title activate LLVM35 activate LLVM36 Feb 28, 2015
@vtjnash
Copy link
Member Author

vtjnash commented Mar 1, 2015

it looks like we could create a micro-branch to trivially address the remaining item on the above list and make the switch

@tkelman
Copy link
Contributor

tkelman commented Mar 1, 2015

Anybody else see a failure on the complex test with 3.6.0?

exception on 3: ERROR: LoadError: test failed: isequal(sqrt(complex(0.0,-0.0)),complex(0.0,-0.0))
 in expression: isequal(sqrt(complex(0.0,-0.0)),complex(0.0,-0.0))
 in error at ./error.jl:19
 in default_handler at ./test.jl:27
 in do_test at ./test.jl:50
 in runtests at /home/tkelman/Julia/julia/test/testdefs.jl:79
 in anonymous at ./multi.jl:833
 in run_work_thunk at ./multi.jl:584
 in anonymous at ./multi.jl:833
while loading complex.jl, in expression starting on line 7
        From worker 2:       * broadcast            in  27.41 seconds
ERROR: LoadError: LoadError: test failed: isequal(sqrt(complex(0.0,-0.0)),complex(0.0,-0.0))
 in expression: isequal(sqrt(complex(0.0,-0.0)),complex(0.0,-0.0))
 in anonymous at ./task.jl:1370
while loading complex.jl, in expression starting on line 7
while loading /home/tkelman/Julia/julia/test/runtests.jl, in expression starting on line 3

        From worker 3:       * complex             make[1]: *** [all] Error 1
make: *** [test] Error 2

tkelman@ygdesk:~/Julia/julia$ ./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+3637 (2015-03-01 18:30 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 74c71fb* (0 days old master)
|__/                   |  x86_64-linux-gnu

julia> versioninfo()
Julia Version 0.4.0-dev+3637
Commit 74c71fb* (2015-03-01 18:30 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT NO_AFFINITY PENRYN)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.6.0

julia> sqrt(complex(0.0,-0.0))
0.0 + 0.0im

julia> isequal(sqrt(complex(0.0,-0.0)), complex(0.0,-0.0))
false

@yuyichao
Copy link
Contributor

As long as you can reproduce this with a single (cpp) file, it's probably fine to just use that in the bug report report to GCC?

@Keno
Copy link
Member

Keno commented Nov 28, 2015

Yeah, I will

@Keno
Copy link
Member

Keno commented Nov 28, 2015

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68601. In the meantime I'll try my luck with clang. At least there I can fix things if they break

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

May also be worth trying a cmake build rather than autotools, it's possible they pay closer attention to having flags right in cmake since autotools is now officially deprecated upstream.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

Perhaps. I think for now I'll wait for GCC 5.3 and see if that fixes it. Also it seems like Clang works fine, so maybe we should just switch.

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

Much easier said than done in terms of getting that working on the buildbots and ensuring binaries are compatible with winrpm packages. We could put cross-built llvm and clang up on opensuse without too much work, but using that would require switching the buildbots from cygwin-cross to msys2 which went poorly the last time we tested it.

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

I guess we could build https://github.com/tpoechtrager/wclang for cygwin-to-mingw-clang cross right after building our own llvm and clang, but then we'd have to do a second or third stage rebuilding llvm with the just built clang.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

The version of clang doesn't have to be the same as we're building. We can build it once.

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

Once per machine where Julia needs to be built from source. Would rather minimize manual error-prone and time consuming steps of building toolchains, and use/create binary packages wherever we can. Cygwin has a build of clang and llvm but it's usually out of date.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

There's prebuilt binaries also.

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

those are built against msvc though, wouldn't we need clang built against mingw-w64? and those binaries don't include the llvm libraries (which would be needed for a cygwin-build cross compile driver) last I checked - we've asked before about including them but doesn't seem upstream cares about binaries of anything other than clang-cl.exe on windows.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

They can be built against any compiler you want as long as you're using it only as a compiler.

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

Using clang for the windows build should probably move to a separate issue at this point. What I don't think will work is building a cygwin cross compiler wrapper of wclang (so the buildbots will continue to work) against an msvc binary of llvm. I'll have to try it though.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

Yes, you are probably right, but for a linux buildbot, you can just use the linux clang. Clang doesn't have different compilers based on the target triple.

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

Right, which makes it not very useful as a cross compiler driver due to not handling the runtime headers and libraries of the host system, and wclang fixes that so we could use it in our build system for linux or cygwin cross.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

What is needed that isn't fixed by -target FOO -isysroot BAR?

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

I don't think -isysroot is enough to make windows.h and all the linker stubs for win32 dlls work? If it were that easy, wclang probably wouldn't need to exist? If you can get a clang cross build to work in our build system and all our deps with just those flags, I'd be pleasantly surprised.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

I'm not sure. It was enough on msys2 to get clang running, but that's obviously different.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

Update on the GCC issue for those not following. It's already a known bug, and is causing by compiling LLVM at -O2, julia at -O0 and not using LLVM shared libraries, so if we want to work around this issue we just have to avoid doing any of those three things.

@nalimilan
Copy link
Member

Does using LLVM shared libraries cause any issue these days? I set USE_LLVM_SHLIB=1 for RPM packages, and it seems to work fine.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

Works fine on Mac/Linux, I think it still needs a patch on windows (which is the relevant platform for this problem).

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2015

Can we try again to bump that patch and get it upstreamed by 3.8 or are we running out of time? We only build julia-releasedebug at -O0 right? Using llvm shlibs would reduce a lot of size duplication in libjulia and libjulia-debug.

@Keno
Copy link
Member

Keno commented Nov 29, 2015

LLVM 3.8 is still wide open, but since autotools is being deprecated, we should probably check what the status is with CMake.

@vtjnash
Copy link
Member Author

vtjnash commented Nov 29, 2015

XP support will also be gone

@StefanKarpinski
Copy link
Member

We can also switch to LLVM 3.7 on platforms where it's doable and keep using LLVM 3.4 on others. We need to pull the trigger on this (at least partially) soon – this is holding up so much progress.

@petercolberg
Copy link
Contributor

Please see #14191, neither LLVM 3.4 (segfault) nor LLVM 3.7 (10 times slower) work on Linux x86.

Switching from 3.3 to 3.8 seems the only way to go for Linux x86.

@tkelman
Copy link
Contributor

tkelman commented Dec 1, 2015

It should be noted that based on download statistics of the tarball/dmg/exe binaries, Linux x86 is by far (1 or 2 orders of magnitude IIRC) the least used version of Julia out of the 5 platforms we build binaries for and test on CI. So a partial switch on just linux x86_64, and maybe mac, to start with might make sense. #13569 should allow us to build a newer LLVM on Linux Travis without having to package it in a PPA, as long as a first source build can go through within the time limit (from then on it'll be cached). Travis OSX will need to wait for packaging the newer LLVM as a bottle, either by staticfloat or homebrew proper (https://github.com/Homebrew/homebrew-versions/pull/955).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs decision A decision on this change is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.