Running examples in BOUT-2/3 #2608

Closed
alexaeg opened this issue Nov 3, 2022 · 3 comments


alexaeg commented Nov 3, 2022

Hello everyone,

Recently I tried running examples supplied with the previous BOUT versions, 2 and 3.
BOUT and the examples compile without problems, but running any of the simulation codes crashes immediately with a segfault on the call to the physics_run function.

Investigating an issue in one of the older versions of BOUT may seem outdated, but a build that can run previous simulations is useful for benchmarking and code verification.

If the question is relevant, I'll provide more detailed information on the problem.

Best,
Alex

Member

ZedThree commented Nov 7, 2022

Hi @alexaeg, I'm afraid previous versions of BOUT++ are unsupported. The examples did work at the time, to the best of our knowledge, but we're unlikely to make any fixes to them now.

If you post the backtrace, the commit hash of the BOUT++ version you're using, and which example you're trying to run, we might be able to help though.

Author

alexaeg commented Nov 7, 2022

Thank you, @ZedThree.

The gcc and OS versions we are using are as follows:

anikeev@cluster21:~/bout/3/CCI_QOCS_Tup25eV_Rt0.9m-custom$ gcc --version
gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

anikeev@cluster21:~/bout/3/CCI_QOCS_Tup25eV_Rt0.9m-custom$ uname -a
Linux cluster21 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux

BOUT was downloaded from this location:

https://github.com/boutproject/BOUT-dev/archive/refs/tags/v3.0.tar.gz

For the test, our specialist at the HPC department compiled the code with GDB support and the basic configuration:

CXXFLAGS=-ggdb3 ./configure

The resulting backtrace is as follows (cci is one of our custom-made codes, but the examples produce similar output):

Thread 1 "cci" received signal SIGSEGV, Segmentation fault.
0x00005555555b5006 in Datafile::write_int (this=this@entry=0x555555b93e68, name=<error reading variable: Cannot access memory at address 0x9>,
f=0x7fffffffe090, grow=false) at datafile.cxx:605
605 file->write(f, name);
(gdb) p f
$1 = (int *) 0x7fffffffe090
(gdb) p name
$2 = <error reading variable: Cannot access memory at address 0x9>
(gdb) bt
#0 0x00005555555b5006 in Datafile::write_int (this=this@entry=0x555555b93e68,
name=<error reading variable: Cannot access memory at address 0x9>, f=0x7fffffffe090, grow=false) at datafile.cxx:605
#1 0x00005555555b8a87 in Datafile::write (this=this@entry=0x555555b93e68) at datafile.cxx:450
#2 0x00005555555de695 in Solver::call_monitors (this=this@entry=0x555555b93de0, simtime=0, iter=iter@entry=0, NOUT=400) at solver.cxx:734
#3 0x00005555555debcc in Solver::solve (this=this@entry=0x555555b93de0, NOUT=, NOUT@entry=-1, TIMESTEP=,
TIMESTEP@entry=0) at solver.cxx:524
#4 0x00005555555f4583 in bout_run (solver=0x555555b93de0, physics_run=) at bout++.cxx:328
#5 0x000055555558b443 in main (argc=, argv=) at /home/anikeev/bout/3/BOUT-dev-3.0//include/boutmain.hxx:130

A more detailed analysis of the problem with the debugger has revealed the following:

Thread 1 "cci" hit Breakpoint 1, Datafile::write_int (this=0x555555b93e68, name="hist_hi", f=0x555555b94168, grow=false) at datafile.cxx:601
601 bool Datafile::write_int(const string &name, int *f, bool grow) {
(gdb) p int_arr
$1 = std::vector of length 3, capacity 4 = {{ptr = 0x555555b94168, name = "hist_hi", grow = false, covar = 62}, {ptr = 0x555555b93e00,
name = "NPES", grow = false, covar = 62}, {ptr = 0x555555b8ac1c, name = "NXPE", grow = false, covar = 62}}
(gdb) p f
$2 = (int *) 0x555555b94168
(gdb) p *f
$3 = 0
(gdb) c
Continuing.

Thread 1 "cci" received signal SIGSEGV, Segmentation fault.
0x00005555555b5006 in Datafile::write_int (this=, name=<error reading variable: Cannot access memory at address 0x9>,
f=0x7fffffffe090, grow=false) at datafile.cxx:605
605 file->write(f, name);
(gdb) p int_arr
value has been optimized out
(gdb)

BOUT++ compiled without any optimizations (e.g. with the --debug option) produces stable code that runs safely, so it seems that some of the optimizations are interfering with the code. However, enabling the gcc optimization flags manually, step by step, still resulted in a segfault when running any of the examples, so the problem may be deeper than a single optimization flag breaking the code.

The behavior (segfault) is identical for BOUT v2 and v3. However, BOUT v2 obtained from here:

https://github.com/boutproject/BOUT-2.0.git

allows compiling and running the examples with the optimization flag turned on.

The whole question of running BOUT v2/3 arose while porting the simulation code (cci) from version 2 to version 4. Although the simulation outputs are identical between the two codes, I ran into a convergence issue (absent in the v2 simulations) when running the code compiled against BOUT v4. I talked to Benjamin about the problem, and it seems that some of the new BOUT machinery might be causing this behavior. I'll cover the issue in a separate thread.

Author

alexaeg commented Nov 16, 2022

Hello, guys,

I've been able to resolve the issue. Adding a return statement to the write_int and write_real functions preserved the variables that were previously optimized out by the compiler.
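
For readers hitting the same crash, here is a minimal sketch of the pattern this fix addresses. It is not the BOUT++ source: FakeFile and write_int_fixed are hypothetical names made up for illustration, and the "broken" shape is only an assumption about what the original functions looked like, based on the bool return type visible in the debugger output above. In C++, letting control flow off the end of a function with a non-void return type is undefined behaviour, and an optimizing compiler may treat that path as unreachable, which would be consistent with a crash that only appears in optimized builds.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the output-file backend (not a BOUT++ class).
struct FakeFile {
  std::vector<std::string> written;  // "name=value" entries written so far

  // Records the value under the given name; fails on a null pointer.
  bool write(const int *value, const std::string &name) {
    if (value == nullptr) {
      return false;
    }
    written.push_back(name + "=" + std::to_string(*value));
    return true;
  }
};

// Assumed broken shape, kept as a comment on purpose: the function is
// declared to return bool but falls off the end without returning, which
// is undefined behaviour in C++.
//
//   bool write_int_broken(FakeFile &file, const std::string &name, int *f) {
//     file.write(f, name);
//   }

// Fixed shape: every control path returns a value.
bool write_int_fixed(FakeFile &file, const std::string &name, int *f) {
  return file.write(f, name);
}
```

GCC's -Wreturn-type warning reports functions of this kind, and -Werror=return-type turns the warning into a build error, which may help locate similar cases in other old code.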
