Sporadic malloc errors with CSDP 4.1 #39

Closed
mgreiff opened this issue Mar 8, 2019 · 2 comments · Fixed by #48

@mgreiff
mgreiff commented Mar 8, 2019

I should preface this by saying that I'm a novice when it comes to optimization in Julia and JuMP.

I ran into some problems when using CSDP to solve a set of LMIs related to H-infinity synthesis in robust control. To illustrate the problem, I have created a short script (see CSDP_issue_malloc.txt) which solves the same problem over and over again in a for-loop. It generates the correct solution repeatedly until it eventually fails for one of two reasons: a malloc error (see printout 1), or a failure to find a feasible solution (see printout 2).
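
For readers without the attachment, the following is a minimal sketch of this kind of repeated-solve loop. It is not the attached script: the toy SDP (minimizing `t` subject to `t*I - A` being positive semidefinite, whose optimum is the largest eigenvalue of `A`), the matrix `A`, the helper `solve_once`, and the run count are all assumptions made for illustration, and the code assumes a recent JuMP release with the `CSDP.Optimizer` interface rather than the versions used in this report.

```julia
using JuMP, CSDP, LinearAlgebra, Test

# Hypothetical toy problem, NOT the H-infinity LMIs from the attached script.
# minimize t  subject to  t*I - A being PSD, so the optimum is λ_max(A).
const A = [2.0 1.0; 1.0 2.0]   # eigenvalues are 1 and 3
const γ = 3.0                  # known optimal value for this toy problem

# Build and solve a fresh model on every call (hypothetical helper).
function solve_once()
    model = Model(CSDP.Optimizer)
    @variable(model, t)
    @constraint(model, Symmetric(t * Matrix{Float64}(I, 2, 2) - A) in PSDCone())
    @objective(model, Min, t)
    optimize!(model)
    return objective_value(model)
end

# Solve the same problem repeatedly and check the result each time,
# mirroring the check `abs(G - γ) <= 0.001` that appears in printout 2.
for run in 1:200
    println("~~~ Run $run ~~~")
    G = solve_once()
    @test abs(G - γ) <= 0.001
end
```

A loop like this only shows the structure of the experiment; whether it triggers the same sporadic failures depends on the problem data and the CSDP.jl version.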

Typically these failures occur after 50-100 correctly solved problems. I have solved the same problem using CVX, as well as the interior-point methods by Nesterov used in Matlab's LMI solvers, in both cases yielding a gain of gamma=1.45. So the CSDP solver seems to work nicely, apart from the two sporadically occurring errors. I have run the same experiments using SCS in Julia, and this works without malloc errors, so the problem seems to be related to CSDP.jl and not JuMP.jl.

The first error might be related to your issue #2, with the malloc error leading to a segfault in my case, but I know too little about the internals of your CSDP wrapper to debug it accurately. I thought I'd alert you to this issue, and would be thankful for any ideas on what may be causing the problems.

printout 1

~~~ Run 71 ~~~
CSDP 6.2.0
Iter:  0 Ap: 0.00e+00 Pobj: -5.0377634e+02 Ad: 0.00e+00 Dobj:  0.0000000e+00
Iter:  1 Ap: 1.00e+00 Pobj: -5.9563600e+02 Ad: 7.35e-01 Dobj: -2.4754075e-01
Iter:  2 Ap: 1.00e+00 Pobj: -5.9529382e+02 Ad: 9.48e-01 Dobj:  2.6619412e-03
julia(35362,0x7fff75c76000) malloc: *** error for object 0x7fdbbad12008:
incorrect checksum for freed object - object was probably modified after being
  freed. *** set a breakpoint in malloc_error_break to debug

signal (6): Abort trap: 6
in expression starting at no file:0
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 98455385 (Pool: 98433939; Big: 21446); GC: 291
Abort trap: 6

printout 2

~~~ Run 49 ~~~
CSDP 6.2.0
Iter:  0 Ap: 0.00e+00 Pobj: -5.0377634e+02 Ad: 0.00e+00 Dobj:  0.0000000e+00
Iter:  1 Ap: 1.00e+00 Pobj: -5.9695547e+02 Ad: 7.35e-01 Dobj: -8.6584995e-01
.
.
.
Iter: 50 Ap: 7.86e-09 Pobj: -1.1346790e+18 Ad: 2.32e-08 Dobj: -8.7559876e+03
Stuck at edge of primal feasibility, giving up. 
Test Failed at REPL[10]:6
  Expression: abs(G - γ) <= 0.001
   Evaluated: 595.502673809033 <= 0.001
ERROR: There was an error during testing
@blegat
Member

blegat commented Mar 9, 2019

Thanks for the detailed report. I have also noticed sporadic failures with CSDP but haven't found the root cause yet. See for instance
https://github.com/JuliaOpt/SumOfSquares.jl/issues/52
One idea could be to try to do something similar to jump-dev/Cbc.jl#101 and see if it resolves the issue.

@ericphanson

I am not sure if it is the same bug or not, but I also got a segfault using CSDP partway through an optimization with Pajarito, where CSDP is the continuous solver (using Gurobi as the MIP solver). It ran for a few minutes and I could see CSDP had solved many problems without incident during the course of the optimization, and then on one of the problems it immediately segfaulted. So this also seems to be a sporadic problem. The same overall optimization problem worked fine when I replaced CSDP with Mosek. I'll include the stacktrace below in case it's helpful:

signal (11): Segmentation fault: 11
in expression starting at no file:0
op_a at /Users/eh540/.julia/packages/CSDP/7O621/deps/usr/lib/libcsdp.dylib (unknown line)
pinfeas at /Users/eh540/.julia/packages/CSDP/7O621/deps/usr/lib/libcsdp.dylib (unknown line)
sdp at /Users/eh540/.julia/packages/CSDP/7O621/deps/usr/lib/libcsdp.dylib (unknown line)
sdp at /Users/eh540/.julia/packages/CSDP/7O621/src/declarations.h.jl:294
sdp at /Users/eh540/.julia/packages/CSDP/7O621/src/declarations.jl:62
unknown function (ip: 0x131a5c0da)
sdp at /Users/eh540/.julia/packages/CSDP/7O621/src/declarations.jl:43
optimize! at /Users/eh540/.julia/packages/CSDP/7O621/src/MPB_wrapper.jl:73
optimize! at /Users/eh540/.julia/packages/SemidefiniteModels/31KiO/src/sd_to_conic.jl:313
unknown function (ip: 0x131a582cb)
solve_subp! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1829
solve_subp_add_subp_cuts! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1707
callback_lazy at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1452
lazycallback at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/callbacks.jl:78
#130 at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/callbacks.jl:96
mastercallback at /Users/eh540/.julia/packages/Gurobi/dlJep/src/MPB_wrapper.jl:706
unknown function (ip: 0x131a61b4b)
PRIVATE00000000005cbaca at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE000000000047f5f0 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE0000000000483efe at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000003ef899 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE0000000000443909 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000003c86c5 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000005b9298 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000005b8dcf at /usr/local/lib/libgurobi81.dylib (unknown line)
GRBoptimize at /usr/local/lib/libgurobi81.dylib (unknown line)
optimize! at /Users/eh540/.julia/packages/Gurobi/dlJep/src/grb_solve.jl:5
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
#solve#120 at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/solvers.jl:175
#solve at ./none:0 [inlined]
solve_mip_driven! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1493
optimize! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:659
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
#solve#120 at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/solvers.jl:175
unknown function (ip: 0x125fe4ad9)
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
solve at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/solvers.jl:150
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
do_call at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:323
eval_stmt_value at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:362 [inlined]
eval_body at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:759
jl_interpret_toplevel_thunk_callback at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:885
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x11a51590f)
unknown function (ip: 0xffffffffffffffff)
jl_interpret_toplevel_thunk at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:894
jl_toplevel_eval_flex at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:764
jl_toplevel_eval at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:773 [inlined]
jl_toplevel_eval_in at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:793
eval at ./boot.jl:328
eval_user_input at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:85
run_backend at /Users/eh540/.julia/packages/Revise/SOSpn/src/Revise.jl:842
#68 at ./task.jl:259
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
jl_apply at /Users/osx/buildbot/slave/package_osx64/build/src/./julia.h:1571 [inlined]
start_task at /Users/osx/buildbot/slave/package_osx64/build/src/task.c:572
Allocations: 109787744 (Pool: 109751500; Big: 36244); GC: 219
Segmentation fault: 11
