Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fedora-30-standard: Doctests using system brial crash #29490

Closed
mkoeppe opened this issue Apr 9, 2020 · 26 comments
Closed

fedora-30-standard: Doctests using system brial crash #29490

mkoeppe opened this issue Apr 9, 2020 · 26 comments

Comments

@mkoeppe
Copy link
Contributor

mkoeppe commented Apr 9, 2020

Follow-up from #29369: I see crashes in running doctests on fedora-30-standard (https://github.com/mkoeppe/sage/runs/572856797), which uses system brial 1.2.5-1

sage -t src/sage/rings/polynomial/multi_polynomial_sequence.py  # Killed due to abort
sage -t src/sage/rings/polynomial/pbori.pyx  # Killed due to abort
sage -t src/sage/rings/polynomial/polynomial_ring_constructor.py  # Killed due to abort
sage -t src/sage/sat/boolean_polynomials.py  # Killed due to abort
sage -t src/sage/sat/solvers/dimacs.py  # Killed due to abort

We remove the spkg-configure for Sage 9.1.

A follow up ticket can restore and repair it for 9.2.

CC: @orlitzky @embray @kiwifb @mkoeppe @dimpase @kliem @SnarkBoojum

Component: porting

Author: Matthias Koeppe

Branch: 8679b65

Reviewer: Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/29490

@mkoeppe mkoeppe added this to the sage-9.1 milestone Apr 9, 2020
@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 17, 2020

comment:3

This ticket needs some attention

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 18, 2020

comment:4

If this cannot be fixed, we can certainly also disable the brial spkg-configure for the 9.1 release...

@kliem
Copy link
Contributor

kliem commented Apr 18, 2020

comment:5

This error is all the same on fedora 31.

The only thing I can find is that debian patches brial since very long with this

Description: Protect CErrorInfo::text() against invalid array access
 * The test suite tests this in PBoRiErrorTest.cc, but the tests
   failed only on the ia64 and alpha architectures.
Author: Tobias Hansen <thansen@debian.org>

--- a/libbrial/src/CErrorInfo.cc
+++ b/libbrial/src/CErrorInfo.cc
@@ -47,7 +47,7 @@
 CErrorInfo::text(errornum_type num) {
 
   PBORI_TRACE_FUNC( "CErrorInfo::text(errornum_type) const" );
-  if PBORI_UNLIKELY(num > CTypes::last_error)
+  if PBORI_UNLIKELY(num < 0 || num >= CTypes::last_error)
     return "Unknown error occured.";
 
   return pErrorText[num];

fedora doesn't patch at all. This patch isn't included into brial until version 1.2.8.

@kliem
Copy link
Contributor

kliem commented Apr 18, 2020

comment:6

Can we check for this at configuration time?

@dimpase
Copy link
Member

dimpase commented Apr 19, 2020

comment:7

There is no direct way known to trigger this, the best I know is

2020-04-18T21:46:00.4537962Z sage -t src/sage/rings/polynomial/multi_polynomial_sequence.py
2020-04-18T21:46:00.4538239Z     Killed due to abort
2020-04-18T21:46:00.4538512Z **********************************************************************
2020-04-18T21:46:00.4538774Z Tests run before process (pid=31088) failed:
2020-04-18T21:46:00.4539023Z sage: sr = mq.SR(2,1,2,4,gf2=True,polybori=True) ## line 26 ##
2020-04-18T21:46:00.4539280Z sage: sr ## line 27 ##
2020-04-18T21:46:00.4539527Z SR(2,1,2,4)
2020-04-18T21:46:00.4539777Z sage: set_random_seed(1) ## line 33 ##
2020-04-18T21:46:00.4540187Z sage: F,s = sr.polynomial_system() ## line 34 ##
2020-04-18T21:46:00.4541037Z terminate called after throwing an instance of 'std::bad_alloc'
2020-04-18T21:46:00.4541926Z   what():  std::bad_alloc
2020-04-18T21:46:00.4542456Z ------------------------------------------------------------------------
2020-04-18T21:46:00.4542936Z /sage/local/lib/python3.7/site-packages/cysignals/signals.cpython-37m-x86_64-linux-gnu.so(+0x9158)[0x7fae23047158]
2020-04-18T21:46:00.4543404Z /sage/local/lib/python3.7/site-packages/cysignals/signals.cpython-37m-x86_64-linux-gnu.so(+0x91f9)[0x7fae230471f9]
2020-04-18T21:46:00.4543853Z /sage/local/lib/python3.7/site-packages/cysignals/signals.cpython-37m-x86_64-linux-gnu.so(+0xcb6d)[0x7fae2304ab6d]
2020-04-18T21:46:00.4544069Z /lib64/libpthread.so.0(+0x12c70)[0x7fae2479fc70]
2020-04-18T21:46:00.4544278Z /lib64/libc.so.6(gsignal+0x145)[0x7fae244abe35]
2020-04-18T21:46:00.4544478Z /lib64/libc.so.6(abort+0x127)[0x7fae24496895]
2020-04-18T21:46:00.4544683Z /lib64/libstdc++.so.6(+0x9e756)[0x7fae1b20a756]
2020-04-18T21:46:00.4545000Z /lib64/libstdc++.so.6(+0xaa6dc)[0x7fae1b2166dc]
2020-04-18T21:46:00.4545187Z /lib64/libstdc++.so.6(+0xaa747)[0x7fae1b216747]
2020-04-18T21:46:00.4545383Z /lib64/libstdc++.so.6(+0xaa9b9)[0x7fae1b2169b9]
2020-04-18T21:46:00.4545582Z /lib64/libstdc++.so.6(+0x9e3c6)[0x7fae1b20a3c6]
2020-04-18T21:46:00.4546100Z /sage/local/lib/python3.7/site-packages/sage/rings/polynomial/pbori.cpython-37m-x86_64-linux-gnu.so(_ZNSt5dequeIN8polybori14CCuddNavigatorESaIS1_EE17_M_reallocate_mapEmb+0xba)[0x7fade0ab8d0a]
2020-04-18T21:46:00.4546654Z /sage/local/lib/python3.7/site-packages/sage/rings/polynomial/pbori.cpython-37m-x86_64-linux-gnu.so(_ZNSt5dequeIN8polybori14CCuddNavigatorESaIS1_EE24_M_new_elements_at_frontEm+0xc8)[0x7fade0ab8fa8]
2020-04-18T21:46:00.4547288Z /sage/local/lib/python3.7/site-packages/sage/rings/polynomial/pbori.cpython-37m-x86_64-linux-gnu.so(_ZNSt5dequeIN8polybori14CCuddNavigatorESaIS1_EE19_M_range_insert_auxISt15_Deque_iteratorIS1_RKS1_PS6_EEEvS5_IS1_RS1_PS1_ET_SD_St20forward_iterator_tag+0x37f)[0x7fade0aba1ff]
2020-04-18T21:46:00.4547572Z /lib64/libbrial.so.3(_ZN8polybori13CDegTermStackINS_14CCuddNavigatorENS_9valid_tagENS_11invalid_tagENS_18CAbstractStackBaseIS1_EEE8findTermEm+0x51d)[0x7fade099f1bd]
2020-04-18T21:46:00.4547822Z /lib64/libbrial.so.3(_ZN8polybori13CDegTermStackINS_14CCuddNavigatorENS_9valid_tagENS_11invalid_tagENS_18CAbstractStackBaseIS1_EEE9incrementEv+0x412)[0x7fade099f832]
2020-04-18T21:46:00.4548072Z 
...

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 19, 2020

comment:8

Could you prepare a test based on this crash please?

@dimpase
Copy link
Member

dimpase commented Apr 20, 2020

comment:9

perhaps a test based on CErrorInfo::text in comment:5 is doable.

@kliem
Copy link
Contributor

kliem commented Apr 20, 2020

comment:10

Sorry. I think I was completely wrong with comment:5.

Fedora claims to use the exact same source files as we do. Maybe some linking didn't work out.

Replying to @dimpase:

perhaps a test based on CErrorInfo::text in comment:5 is doable.

@kliem
Copy link
Contributor

kliem commented Apr 20, 2020

comment:11

A wild guess it that fedora somehow messed up this:

https://src.fedoraproject.org/rpms/brial/blob/f30/f/brial.spec

Maybe here

	
%build
	
export CPPFLAGS="-DPBORI_NDEBUG"
	
%configure --enable-shared --disable-static
	
# Get rid of undesirable hardcoded rpaths, and workaround libtool reordering
	
# -Wl,--as-needed after all the libraries.
	
sed -e 's|^hardcode_libdir_flag_spec=.*|hardcode_libdir_flag_spec=""|g' \
	
    -e 's|^runpath_var=LD_RUN_PATH|runpath_var=DIE_RPATH_DIE|g' \
	
    -e 's|CC="\(g..\)"|CC="\1 -Wl,--as-needed"|' \
	
    -i libtool

But I really have no clue.

@dimpase
Copy link
Member

dimpase commented Apr 20, 2020

comment:12

Replying to @dimpase:

perhaps a test based on CErrorInfo::text in comment:5 is doable.

no, apparently it is not the case, this patch is there.
I tried the following on Fedora 30 (brial 1.2.5)

#include <iostream>
#include <polybori/except/CErrorInfo.h>
int main () {
	polybori::CErrorInfo foo;
    std::cout << polybori::CTypes::last_error << "\n";
	std::cout << foo.text(-1) << "\n";
	return 1;
}

and it prints

clpc171[/tmp]$ g++ bri.cc -lbrial
clpc171[/tmp]$ ./a.out 
12
Unknown error occured.

no crash.

@kliem
Copy link
Contributor

kliem commented Apr 20, 2020

comment:13

I think this is because -1 is interpreted as unsigned. At least my compiler interprets errornum_type as unsigned. This would also explain why this specific patch is needed only on some seldom architectures.

@kliem
Copy link
Contributor

kliem commented Apr 20, 2020

comment:14

And I don't think this patch is present in the current sage either.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 20, 2020

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 20, 2020

Author: Matthias Koeppe

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 20, 2020

Changed branch from u/mkoeppe/fedora_30_standard__doctests_using_system_brial_crash to none

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 20, 2020

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 20, 2020

New commits:

8679b65Back out brial spkg-configure.m4 for 9.1

@mkoeppe

This comment has been minimized.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 20, 2020

Commit: 8679b65

@dimpase
Copy link
Member

dimpase commented Apr 21, 2020

Reviewer: Dima Pasechnik

@dimpase
Copy link
Member

dimpase commented Apr 21, 2020

comment:19

OK, let's skip brial for the time being.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Apr 21, 2020

comment:20

Thanks!

@vbraun
Copy link
Member

vbraun commented Apr 23, 2020

@dimpase
Copy link
Member

dimpase commented Apr 28, 2020

Changed commit from 8679b65 to none

@dimpase
Copy link
Member

dimpase commented Apr 28, 2020

comment:22

See also comment:19 in #28349

@dimpase
Copy link
Member

dimpase commented Jun 4, 2020

comment:23

follow, with a better fix, on ##29792

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants