You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A refactor in #27946 introduced "unprotected" (not surrounded by GAP_Enter/GAP_Leave) GAP_ValueGlobalVariable calls. I believe this might be a GC hazard, because after updating to GAP 4.12.1 I started seeing aarch64 crashes on NixOS infrastructure such as:
#0 0x0000fffff79740e8 in wait4 ()
#1 0x0000fffff5dc6b78 in print_enhanced_backtrace ()
#2 0x0000fffff5dc8190 in sigdie ()
#3 0x0000fffff5dcb1c0 in cysigs_signal_handler ()
#4 0x0000fffff7ffb7cc in __kernel_rt_sigreturn ()
#5 0x0000ffff99a0bf28 in ConvString ()
#6 0x0000000000000000 in ?? ()
#7 0x0000000000000000 in ?? ()
#8 0x0000000000000000 in ?? ()
#9 0x0000ffff99989930 in Pr ()
#10 0x0000ffff9998aa18 in CloseOutput ()
#11 0x0000ffff99884828 in capture_stdout () at /build/sage-src-9.7/pkgs/sagemath-standard/sage/libs/gap/element.pyx:154
...
I also see cases where capture_stdout throws errors such as sage.libs.gap.util.GAPError: Error, Length: <list> must be a list (not the integer 255) and then crashes. Both types of errors are fixed by this ticket.
Note that I am nesting GAP_Enter/GAP_Leave calls because I didn't remove the preexisting calls inside capture_stdout. That's because I feared removing the innermost calls might create a new footgun (and I believe nested GAP_Enter/GAP_Leave calls are explicitly supported), but removing them should cause no problem. Removing them might even be preferable for performance reasons, I don't know.
(Edit: I previously thought that holding onto a char* past the GAP_Enter/GAP_Leave blocks, as is done in GapElement's __str__ and _repr_ methods, could also cause GC hazards. It doesn't seem to be a problem in practice, though, and I am not qualified to tell if it's a problem in theory.)
A refactor in #27946 introduced "unprotected" (not surrounded by
GAP_Enter
/GAP_Leave
)GAP_ValueGlobalVariable
calls. I believe this might be a GC hazard, because after updating to GAP 4.12.1 I started seeing aarch64 crashes on NixOS infrastructure such as:I also see cases where
capture_stdout
throws errors such assage.libs.gap.util.GAPError: Error, Length: <list> must be a list (not the integer 255)
and then crashes. Both types of errors are fixed by this ticket.Note that I am nesting
GAP_Enter
/GAP_Leave
calls because I didn't remove the preexisting calls insidecapture_stdout
. That's because I feared removing the innermost calls might create a new footgun (and I believe nestedGAP_Enter
/GAP_Leave
calls are explicitly supported), but removing them should cause no problem. Removing them might even be preferable for performance reasons, I don't know.This ticket's branch is based on #34391.
(Edit: I previously thought that holding onto a
char*
past theGAP_Enter
/GAP_Leave
blocks, as is done in GapElement's__str__
and_repr_
methods, could also cause GC hazards. It doesn't seem to be a problem in practice, though, and I am not qualified to tell if it's a problem in theory.)Depends on #34391
CC: @embray @videlec @dimpase
Component: interfaces
Author: Mauricio Collares
Issue created by migration from https://trac.sagemath.org/ticket/34701
Edit: Removed branch. The code changes are now at #35114.
The text was updated successfully, but these errors were encountered: