use LinBox as native matrix representation for dense matrices over GF(p) #4260

Closed
malb opened this issue Oct 10, 2008 · 80 comments

@malb
Member

malb commented Oct 10, 2008

Copying to and from LinBox uses up precious RAM, and the point of fast linear algebra is to deal with large matrices. We should consider switching to LinBox as the native representation of dense matrices over GF(p).

Without Patch

sage: A = random_matrix(GF(97),2000,2000)
sage: %time A*A
CPU times: user 9.66 s, sys: 0.12 s, total: 9.77 s
Wall time: 9.82 s

With Patch

sage: A = random_matrix(GF(97),2000,2000)
sage: %time A*A
CPU times: user 1.32 s, sys: 0.00 s, total: 1.32 s
Wall time: 1.35 s

Magma

> A:=RandomMatrix(GF(97),2000,2000);
> time C:=A*A;                      
Time: 1.560
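
For reference, the timings above work out to the following ratios. This is plain arithmetic on the numbers quoted in this comment; Magma only reports a single `Time:` figure, which is assumed here to be comparable to Sage's wall time, and the variable names are illustrative.

```python
# Timings (seconds) for A*A, A a random 2000x2000 matrix over GF(97),
# copied from the sessions above.
sage_without_patch = 9.82  # Sage wall time, without patch
sage_with_patch = 1.35     # Sage wall time, with patch
magma = 1.560              # Magma "Time:" figure (assumed comparable to wall time)

speedup = sage_without_patch / sage_with_patch
vs_magma = magma / sage_with_patch

print(f"patch speedup: {speedup:.2f}x")
print(f"Magma time / patched Sage time: {vs_magma:.2f}")
```

So on this single data point the patch gives roughly a sevenfold speedup and puts Sage slightly ahead of Magma.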

CC: @simon-king-jena @rbeezer @sagetrac-drkirkby

Component: linear algebra

Keywords: linbox, sd32, sd34

Author: Burcin Erocal, Martin Albrecht, Rob Beezer

Reviewer: Burcin Erocal, Simon King, Martin Albrecht, Jeroen Demeyer

Merged: sage-4.8.alpha3

Issue created by migration from https://trac.sagemath.org/ticket/4260

@ClementPernet
Contributor

comment:1

I will work on it as a coding sprint at SD10.

@burcin

burcin commented Aug 2, 2011

Author: Burcin Erocal

@burcin

burcin commented Aug 2, 2011

comment:3

I finally rebased the patch from SD16. The template class in the patch contains the updates made to the modn_dense class since then, like changes to the sig_* functions. Apparently the modn_dense class representation now allows permuting the rows by changing pointers in the _matrix array. We can't allow that if we want to pass the _entries to linbox, so I skipped those changes.
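
Why pointer-based row permutation conflicts with handing the flat buffer to another library can be sketched in a few lines of Python. The names `entries` and `row_ptr` are hypothetical stand-ins for the `_entries` / `_matrix` pair mentioned above, not Sage's actual data structures.

```python
# Sketch only: a flat row-major buffer plus a table of row start offsets.
nrows, ncols = 3, 4
entries = list(range(nrows * ncols))          # contiguous storage, row-major
row_ptr = [i * ncols for i in range(nrows)]   # row i starts at entries[row_ptr[i]]

def row_via_pointers(i):
    """Row i as seen through the pointer table."""
    start = row_ptr[i]
    return entries[start:start + ncols]

def row_via_flat_buffer(i):
    """Row i as seen by code that is handed only the flat buffer."""
    return entries[i * ncols:(i + 1) * ncols]

# "Swap" rows 0 and 2 by exchanging pointers; no entry is moved.
row_ptr[0], row_ptr[2] = row_ptr[2], row_ptr[0]

print(row_via_pointers(0))     # [8, 9, 10, 11]
print(row_via_flat_buffer(0))  # [0, 1, 2, 3]
```

After the swap the two views disagree, so any routine that receives only the flat buffer would operate on the unpermuted matrix.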

Sage builds with the attached patches, and you can construct matrices. However, there are lots of bugs, some linbox wrappers are still stubs, etc. Expect crashes and wrong results.

With the patch applied, I get a crash with the following:

sage: a = matrix(GF(97),3,4,range(12))
sage: a.echelonize()
*** glibc detected *** python: free(): invalid next size (fast): 0x000000000270b370 ***
======= Backtrace: =========
<snip>

AFAICT, the new cython code is an exact copy of the wrapper function in linbox-sage.C. Here is what valgrind says:

==3026== Invalid write of size 8
==3026==    at 0x39E49EF1: T.4552 (ffpack_ludivine.inl:420)
==3026==    by 0x39E49AA0: T.4552 (ffpack_ludivine.inl:486)
==3026==    by 0x39E4ABBF: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_20_echelonize_linbox(_object*, _object*) (ffpack.h:1132)
==3026==    by 0x4E74082: PyObject_Call (abstract.c:2492)
==3026==    by 0x39E2CA8A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_19echelonize(_object*, _object*, _object*) (matrix_modn_dense_double.cpp:8738)
==3026==    by 0x4F17FF9: PyEval_EvalFrameEx (ceval.c:3706)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F19DB1: PyEval_EvalCode (ceval.c:522)
==3026==    by 0x4F19083: PyEval_EvalFrameEx (ceval.c:4401)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
<snip lots more Py* lines>
==3026==  Address 0x6ca16e8 is 0 bytes after a block of size 24 alloc'd
==3026==    at 0x4C267CE: malloc (vg_replace_malloc.c:236)
==3026==    by 0x39E4AA5A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_20_echelonize_linbox(_object*, _object*) (memory.h:32)
==3026==    by 0x4E74082: PyObject_Call (abstract.c:2492)
==3026==    by 0x39E2CA8A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_19echelonize(_object*, _object*, _object*) (matrix_modn_dense_double.cpp:8738)
==3026==    by 0x4F17FF9: PyEval_EvalFrameEx (ceval.c:3706)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F19DB1: PyEval_EvalCode (ceval.c:522)
==3026==    by 0x4F19083: PyEval_EvalFrameEx (ceval.c:4401)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F18074: PyEval_EvalFrameEx (ceval.c:3802)
<snip lots of Py* lines>
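
The numbers in the valgrind report above can be unpacked with a little arithmetic. The sketch assumes the 24-byte block is an array of 8-byte slots (matching the 8-byte write); what the block actually stores is not stated in the report, so any link to the 3 x 4 shape of `a` is only a guess.

```python
block_size = 24    # "block of size 24 alloc'd"
write_size = 8     # "Invalid write of size 8"
offset_after = 0   # "0 bytes after" the block

# Number of 8-byte slots that fit in the block, and the slot being written.
slots = block_size // write_size
bad_index = (block_size + offset_after) // write_size

nrows, ncols = 3, 4  # shape of `a` in the crashing example

print(f"block holds {slots} slots, write targets index {bad_index}")
print(f"slots == nrows: {slots == nrows}; index {bad_index} < ncols: {bad_index < ncols}")
```

Under that assumption the write lands exactly one slot past the end of a three-slot array, which would be consistent with a buffer sized by the number of rows being indexed by column. This is unverified.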

I'd appreciate any pointers about the problem above, though I don't know if I'll have the time to come back to this before the bug days in August (when I presume Martin will take over?).

@rbeezer
Mannequin

rbeezer mannequin commented Aug 22, 2011

comment:4

These are the files in sage/matrix with failures:

        sage -t  devel/sage-main/sage/matrix/matrix_cyclo_dense.pyx # 22 doctests failed
        sage -t  devel/sage-main/sage/matrix/strassen.pyx # 2 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix0.pyx # 2 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_integer_dense.pyx # 5 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_space.py # 1 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_window_modn_dense.pyx # 1 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_modn_sparse.pyx # 1 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_integer_dense_saturation.py # 0 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_rational_dense.pyx # 44 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix2.pyx # Time out
        sage -t  devel/sage-main/sage/matrix/matrix_modn_dense.pyx # Time out
        sage -t  devel/sage-main/sage/matrix/matrix_modn_dense_template.pxi # Time out

@malb
Member Author

malb commented Aug 23, 2011

Attachment: trac_4260-linbox_default.patch.gz

make matrix space constructor use the new classes

@malb
Member Author

malb commented Aug 23, 2011

comment:5

I fixed a few issues and segfaults but the thing is far from done. However, one can probably do higher level stuff now, i.e. it shouldn't crash that much any more.

We need a new LinBox SPKG because Modular<float> didn't have a NonZeroRandIter which is needed by the charpoly code. LinBox 1.1.7 fixes this issue but I tried unsuccessfully to upgrade to 1.1.7 for like 10 hours (cf. #11718).

@malb
Member Author

malb commented Aug 24, 2011

comment:7

Doctest failures with most recent patch on sage.math:

        sage -t  -long -force_lib devel/sage/doc/de/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/doc/en/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/doc/en/bordeaux_2008/modular_forms_and_hecke_operators.rst # 1 doctests failed
        sage -t  -long -force_lib devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
        sage -t  -long -force_lib devel/sage/doc/fr/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/doc/ru/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/heilbronn.pyx # 2 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/tests.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/space.py # 18 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/eisenstein_submodule.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/tests.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/constructor.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/space.py # 8 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/cuspidal_submodule.py # 6 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/ambient.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/element.py # 11 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/element.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/module.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/matrix/matrix_cyclo_dense.pyx # 8 doctests failed
        sage -t  -long -force_lib devel/sage/sage/matrix/matrix2.pyx # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/tests/cmdline.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed

@malb
Member Author

malb commented Aug 24, 2011

Changed author from Burcin Erocal to Burcin Erocal, Martin Albrecht

@malb
Member Author

malb commented Aug 24, 2011

comment:8

I've fixed all the easy stuff which brings the doctest failures down to:

sage -t  -long devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t  -long devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t  -long devel/sage/sage/modular/modsym/space.py # 18 doctests failed
sage -t  -long devel/sage/sage/modular/modform/eisenstein_submodule.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/modform/constructor.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/modform/space.py # 8 doctests failed
sage -t  -long devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t  -long devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t  -long devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed

many of which seem to be caused by a small set of bugs.
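
To track progress between patch versions, lists like the one above can be tallied mechanically. The following is a minimal sketch, assuming the summary lines keep the `sage -t [options] <file> # N doctests failed` shape shown in this thread; the helper name `tally` is made up for illustration.

```python
import re

# Matches a `sage -t` summary line ending in "# N doctests failed".
FAILED = re.compile(
    r"^\s*sage -t\s+(?:-\S+\s+)*(\S+)\s+#\s+(\d+) doctests failed\s*$"
)

def tally(report):
    """Return {file: failure_count} for every matching line in `report`."""
    counts = {}
    for line in report.splitlines():
        m = FAILED.match(line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts

# Three lines copied from the list above.
sample = """\
sage -t  -long devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t  -long devel/sage/sage/modular/modsym/space.py # 18 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
"""

counts = tally(sample)
print(len(counts), "files,", sum(counts.values()), "failures")
```

Lines that report a time out instead of a failure count are skipped by this pattern and would need separate handling.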

@malb
Member Author

malb commented Aug 24, 2011

comment:9

Here's where we are at on sage.math:

sage -t  devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t  devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t  devel/sage/sage/modular/modsym/space.py # 12 doctests failed
sage -t  devel/sage/sage/modular/modform/eisenstein_submodule.py # 1 doctests failed
sage -t  devel/sage/sage/modular/modform/space.py # 7 doctests failed
sage -t  devel/sage/sage/modular/modform/constructor.py # 1 doctests failed
sage -t  devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t  devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t  devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t  devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t  devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t  devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t  devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t  devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t  devel/sage/sage/structure/sage_object.pyx # 1 doctests failed
sage -t  devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t  devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed

@williamstein
Contributor

Work Issues: sd32

@malb
Member Author

malb commented Aug 25, 2011

comment:11

With the updated patch we are down to:

sage -t  -long devel/sage/sage/modular/modsym/heilbronn.pyx # 2 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed

However, there also seems to be a doctest failure in matrix2.pyx which is not that easily reproduced.

@malb
Member Author

malb commented Aug 25, 2011

comment:12

Now all doctests should pass!

@williamstein
Contributor

Changed keywords from linbox to linbox, sd32

@williamstein
Contributor

Changed work issues from sd32 to none

@malb
Member Author

malb commented Aug 25, 2011

Work Issues: extend documentation

@malb
Member Author

malb commented Aug 25, 2011

Changed author from Burcin Erocal, Martin Albrecht to Burcin Erocal, Martin Albrecht, Rob Beezer

@jdemeyer

comment:43

Works on OS X 10.4 PPC, so positive review.

@jdemeyer

comment:44

This crashes all over the place on OpenSolaris 06.2009-32 (hawk). For example:

sage -t -long  -force_lib devel/sage/sage/rings/qqbar.py
**********************************************************************
File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage-main/sage/rings/qqbar.py", line 241:
    sage: r.imag().minpoly() # this takes a long time (143s on my laptop)
Exception raised:
    Traceback (most recent call last):
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[74]>", line 1, in <module>
        r.imag().minpoly() # this takes a long time (143s on my laptop)###line 241:
    sage: r.imag().minpoly() # this takes a long time (143s on my laptop)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/lib/python/site-packages/sage/rings/qqbar.py", line 2873, in minpoly
        self._minimal_polynomial = self._descr.minpoly()
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/lib/python/site-packages/sage/rings/qqbar.py", line 5406, in minpoly
        self._minpoly = self._value.minpoly()
      File "number_field_element.pyx", line 3495, in sage.rings.number_field.number_field_element.NumberFieldElement_absolute.minpoly (sage/rings/number_field/number_field_element.cpp:21939)
      File "number_field_element.pyx", line 3462, in sage.rings.number_field.number_field_element.NumberFieldElement_absolute.charpoly (sage/rings/number_field/number_field_element.cpp:21816)
      File "matrix_rational_dense.pyx", line 936, in sage.matrix.matrix_rational_dense.Matrix_rational_dense.charpoly (sage/matrix/matrix_rational_dense.c:10895)
      File "matrix_integer_dense.pyx", line 1017, in sage.matrix.matrix_integer_dense.Matrix_integer_dense.charpoly (sage/matrix/matrix_integer_dense.c:10961)
      File "matrix_integer_dense.pyx", line 1074, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._charpoly_linbox (sage/matrix/matrix_integer_dense.c:11601)
      File "matrix_integer_dense.pyx", line 1096, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._poly_linbox (sage/matrix/matrix_integer_dense.c:11869)
    RuntimeError: Segmentation fault
**********************************************************************

There are many more like this.

@malb
Member Author

malb commented Nov 17, 2011

comment:45

Mhh, the trouble is in Matrix_integer_dense, which isn't what this ticket is about, so that's curious. How do I log into hawk?

@jdemeyer

comment:46

I still want to investigate some more, for example I have not checked that it is really this ticket which causes the problems (but you do see "linbox" appearing in the backtrace).

Strangely, even building the documentation crashes:

sphinx-build -b html -d /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/doctrees/en/reference   -A hide_pdf_links=1 /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/en/reference /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference
Running Sphinx v1.1.2
loading pickled environment... not yet created
building [html]: targets for 935 source files that are out of date
updating environment: 935 added, 0 changed, 0 removed
reading sources... [  0%] algebras
reading sources... [  0%] arithgroup
reading sources... [  0%] calculus
reading sources... [  0%] categories
reading sources... [  0%] cmd
reading sources... [  0%] coding
reading sources... [  0%] coercion
reading sources... [  0%] combinat/algebra
reading sources... [  0%] combinat/crystals
[...]
writing additional files... genindex py-modindex search
copying images... [ 16%] sage/graphs/../../media/heawood-graph-latex.png
copying images... [ 33%] sage/homology/../../media/homology/simplices.png
copying images... [ 50%] sage/homology/../../media/homology/torus.png
copying images... [ 66%] sage/homology/../../media/homology/klein.png
copying images... [ 83%] sage/homology/../../media/homology/rp2.png
copying images... [100%] sage/homology/../../media/homology/torus_labelled.png

copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

------------------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
Build finished.  The built documents can be found in /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference

@jdemeyer

Changed merged from sage-4.8.alpha2 to none

@jdemeyer jdemeyer reopened this Nov 20, 2011
@jdemeyer

comment:48

Replying to @malb:

Mhh, the trouble is in Matrix_integer_dense, which isn't what this ticket is about, so that's curious. How do I log into hawk?

Hawk is a machine from David Kirkby, so you should ask him.

@malb
Member Author

malb commented Nov 22, 2011

comment:49

Okay, I managed to build 4.8-alpha2 + this ticket on hawk. Just starting and stopping Sage gives:

#0  0xfec9c7fb in _free_unlocked () from /lib/libc.so.1
#1  0xfec9c7af in free () from /lib/libc.so.1
#2  0xfdb81d01 in operator delete (ptr=0x8) at ../../../../gcc-4.5.0/libstdc++-v3/libsupc++/del_op.cc:44
#3  0xfdb81d5d in operator delete[] (ptr=0x8) at ../../../../gcc-4.5.0/libstdc++-v3/libsupc++/del_opv.cc:32
#4  0xfdb72543 in ~ios_base (this=0xf9a3e704) at ../../../../gcc-4.5.0/libstdc++-v3/src/ios.cc:93
#5  0xf9892891 in __static_initialization_and_destruction_0 (__initialize_p=<value optimized out>)
    at /usr/local/gcc-4.5.0/lib/gcc/i386-pc-solaris2.10/4.5.0/../../../../include/c++/4.5.0/bits/basic_ios.h:272
#6  0xf988e3b0 in __do_global_dtors_aux () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
#7  0xf99f7835 in _fini () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
#8  0xfefd15fe in call_fini () from /usr/lib/ld.so.1
#9  0xfefd17b3 in atexit_fini () from /usr/lib/ld.so.1
#10 0xfec8370c in _exithandle () from /lib/libc.so.1
#11 0xfec73f52 in exit () from /lib/libc.so.1
#12 0xfeef3232 in Py_Exit (sts=0) at Python/pythonrun.c:1716
#13 0xfeef3357 in handle_system_exit () at Python/pythonrun.c:1116
#14 0x00000000 in ?? ()

So it tries to clean up LinBox at the end and that's when things go wrong:

_fini () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0

Any ideas why?

@malb
Member Author

malb commented Nov 23, 2011

comment:50

Weird, I rebuilt everything from scratch using these environment variables

SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/lib
PATH=/usr/local/bins-for-sage:/usr/local/bin:/usr/bin:/bin
MAKE=make -j4

and now

All tests passed!
Total time for all tests: 1742.8 seconds

i.e., the segfault is gone. How does the buildbot build Sage?

@jdemeyer

comment:51

Replying to @malb:

i.e., the segfault is gone. How does the buildbot build Sage?

EDITOR=emacs
HISTCONTROL=ignoreboth
HISTSIZE=2000
HOME=/export/home/buildbot
IGNOREEOF=100
LANG=C
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
LESS=iMqR
LESSHISTFILE=-
LOGNAME=buildbot
MAIL=/var/mail/buildbot
MAKE=make -j12
MAKEOPTS=-j12
PAGER=/usr/bin/less
PATH=/export/home/buildbot/bin:/export/home/buildbot/local/hawk/bin:/usr/local/bins-for-sage:/usr/local/gcc-4.5.0/bin:/usr/local/bin:/usr/local/texlive/2010/bin/i386-solaris/:/usr/bin:/usr/sbin
PWD=/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2
SAGE_ATLAS_LIB=/ATLAS32
SAGE_FORTRAN=/usr/local/gcc-4.5.0/bin/gfortran
SAGE_FORTRAN_LIB=/usr/local/gcc-4.5.0/lib/libgfortran.so
SAGE_PARALLEL_SPKG_BUILD=yes
SAGE_PORT=true
SHELL=/bin/bash
SHLVL=1
SSH_CLIENT=128.208.160.197 44994 22
SSH_CONNECTION=128.208.160.197 44994 192.168.1.191 22
SSH_TTY=/dev/pts/2
TERM=screen
TZ=Europe/London
USER=buildbot
VIRTUAL_ENV=/export/home/buildbot/local/hawk
VIRTUAL_ENV_DISABLE_PROMPT=yes
VISUAL=emacs

@malb
Member Author

malb commented Nov 27, 2011

comment:52

Perhaps, it's a GCC 4.5.0 issue?

@jdemeyer

comment:53

Replying to @malb:

Perhaps, it's a GCC 4.5.0 issue?

Could very well be.

What does your gcc --version say? (the gcc you used to compile Linbox successfully)

@malb
Member Author

malb commented Nov 27, 2011

comment:54

$ gcc --version
gcc (GCC) 4.4.3 20100112 (prerelease)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@malb
Member Author

malb commented Nov 29, 2011

comment:55

I can confirm that this bug is at least triggered by GCC 4.5.

Here are the relevant bits of the environment that I used to build Sage + this ticket just now:

SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
PATH=/usr/local/gcc-4.5.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8

and this one crashes with a SIGSEGV, whereas the env I posted above in comment:50 doesn't.

I am not sure what to do about this? Ask Dave to install a newer GCC to test whether it fails with it as well?
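
To make the difference between the two builds explicit, the environments can be compared mechanically. This is a minimal sketch using the values from this comment and from the passing build posted earlier in the thread; `parse_env` and `path_entries` are illustrative helpers, not part of Sage.

```python
def parse_env(text):
    """Turn KEY=VALUE lines into a dict (values may themselves contain '=')."""
    env = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        env[key] = value
    return env

def path_entries(env, key):
    """Split a colon-separated variable, ignoring trailing slashes."""
    return [p.rstrip("/") for p in env.get(key, "").split(":") if p]

# Environment that passed all tests (posted earlier in the thread).
passing = parse_env("""
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/lib
PATH=/usr/local/bins-for-sage:/usr/local/bin:/usr/bin:/bin
MAKE=make -j4
""")

# Environment that crashes with a SIGSEGV (this comment).
crashing = parse_env("""
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
PATH=/usr/local/gcc-4.5.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8
""")

for key in ("PATH", "LD_LIBRARY_PATH"):
    known = path_entries(passing, key)
    extra = [p for p in path_entries(crashing, key) if p not in known]
    print(key, "only in crashing env:", extra)
```

Apart from `/usr/sbin` and the `make` job count, every entry unique to the crashing build points into `/usr/local/gcc-4.5.0`.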

@jdemeyer

comment:56

Replying to @malb:

I am not sure what to do about this? Ask Dave to install a newer GCC to test whether it fails with it as well?

That's not a bad suggestion, asking to install gcc 4.5.3 for example (the latest in the 4.5 series)

@malb
Member Author

malb commented Nov 30, 2011

comment:57

I conclude it's a compiler bug: I just built with:

SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.6.0/lib:/usr/local/gcc-4.6.0/lib/amd64
PATH=/usr/local/gcc-4.6.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8

and

All tests passed!
Total time for all tests: 1786.1 seconds

I suggest avoiding GCC 4.5.0 (at least on OpenSolaris) and changing the status of this ticket back to positive review.
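
A build script could guard against the suspect compiler along these lines. This is a hypothetical sketch: the thread only establishes trouble with 4.5.0 on OpenSolaris, the `SUSPECT` set and `gcc_version` helper are invented for illustration, and the banner string is the `gcc --version` output quoted earlier.

```python
import re

def gcc_version(banner):
    """Extract (major, minor, patch) from the first line of `gcc --version` output."""
    first_line = banner.splitlines()[0]
    m = re.search(r"\b(\d+)\.(\d+)\.(\d+)\b", first_line)
    if m is None:
        raise ValueError("no version number found in: %r" % first_line)
    return tuple(int(part) for part in m.groups())

# Hypothetical blacklist; only 4.5.0 is implicated in this thread.
SUSPECT = {(4, 5, 0)}

banner = "gcc (GCC) 4.4.3 20100112 (prerelease)"
version = gcc_version(banner)
print(version, "suspect" if version in SUSPECT else "ok")
```

Whether later 4.5.x releases are affected was not tested here, so the set deliberately contains a single version.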

@jdemeyer

jdemeyer commented Dec 1, 2011

Merged: sage-4.8.alpha3

@jdemeyer

jdemeyer commented Dec 1, 2011

comment:60

Testing again on hawk...
