Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offshore boundary plane failing time tolerance #1342

Open
rybchuk opened this issue Nov 11, 2024 · 16 comments · May be fixed by #1425
Open

Offshore boundary plane failing time tolerance #1342

rybchuk opened this issue Nov 11, 2024 · 16 comments · May be fixed by #1425
Assignees
Labels
bug:amr-wind Something isn't working

Comments

@rybchuk
Copy link
Contributor

rybchuk commented Nov 11, 2024

Bug description

Likely related to this PR, and I am using the most recent commit on main, 740679c8. I'm running an offshore ABL simulation with linear waves. I have achieved a spun-up state at iteration 80000, and now I am trying to run an inflow-outflow simulation. I believe that I have successfully written out boundary planes using the following:

ABL.bndry_file                           = bndry_file.native
ABL.bndry_io_mode                        = 0
ABL.bndry_planes                         = ylo xlo
ABL.bndry_output_start_time              = 1.0
ABL.bndry_var_names                      = velocity_mueff velocity density p velocity_src_term gp vof_mueff levelset interface_curvature vof vof_src_term temperature_mueff temperature temperature_src_term ow_levelset ow_vof ow_velocity

(I know that I'm probably saving out too many boundary variables, but I saw the same error with = velocity temperature vof)

I then kick off the inflow-outflow simulation using the following variables:

ABL.bndry_file                           = ../write_lev0/bndry_file.native
ABL.bndry_io_mode                        = 1
ABL.bndry_planes                         = ylo xlo
ABL.bndry_output_start_time              = 1.0
ABL.bndry_var_names                      = velocity_mueff velocity density p velocity_src_term gp vof_mueff levelset interface_curvature vof vof_src_term temperature_mueff temperature temperature_src_term ow_levelset ow_vof ow_velocity # velocity temperature vof

The simulation errors out during the first timestep.

Steps to reproduce

I have shared all the files needed to run the bc-write simulation and bc-read simulation in /scratch/orybchuk/share/debug_inflow_outflow on Kestrel. I've uploaded many of the files (aside from the large ones) here too.

debug_inflow_outflow.zip

Steps to reproduce the behavior:

  1. Compiler used
    • Oneapi
  2. Operating system
    • Linux
  3. Hardware:
    • CPU
  4. Machine details ():
    Kestrel
Currently Loaded Modules:
  1) intel/2023.2.0	  7) cray-libsci/22.10.1.2
  2) craype/2.7.30        8) PrgEnv-intel/8.5.0
  3) cray-dsmml/0.2.2     9) cray-python/3.11.5
  4) libfabric/1.15.2.0  10) binutils/2.41
  5) craype-network-ofi  11) intel-oneapi-compilers/2023.2.0
  6) cray-mpich/8.1.28
  1. Input file attachments: See the .zip file.
  2. Error (paste or attach):
==============================================================================
Step: 80011 dt: 0.05 Time: 6542.942863 to 6542.992863
CFL: 0.402303 (conv: 0.402303 diff: 0 src: 0 )

Godunov:
  System                     Iters      Initial residual        Final residual
  ----------------------------------------------------------------------------
terminate called after throwing an instance of 'std::runtime_error'
...
  what():  Assertion `std::abs(time - m_in_data.tinterp()) < 1e-12' failed, file "/kfs2/projects/ai4wind/orybchuk/exawind-manager/environments/ai4wind-nov24/amr-wind/amr-wind/wind_energy/ABLBoundaryPlane.cpp", line 896
  1. If this is a segfault, a stack trace from a debug build (paste or attach):
    N/A

Expected behavior

The bc-read simulation should run past the first timestamp without crashing.

AMR-Wind information

==============================================================================
                AMR-Wind (https://github.com/exawind/amr-wind)

  AMR-Wind version :: v3.2.0-8-g740679c8-DIRTY
  AMR-Wind Git SHA :: 740679c87925db41521f477e8724daa0c69cc670-DIRTY
  AMReX version    :: 24.09-45-g6d9c25b989f1

  Exec. time	   :: Mon Nov 11 10:48:20 2024
  Build time	   :: Nov 11 2024 10:17:52
  C++ compiler     :: IntelLLVM 2023.2.0

  MPI              :: ON    (Num. ranks = 384)
  GPU              :: OFF
  OpenMP           :: OFF

  Enabled third-party libraries:
    NetCDF    4.9.2

           This software is released under the BSD 3-clause license.
 See https://github.com/Exawind/amr-wind/blob/development/LICENSE for details.
------------------------------------------------------------------------------

Additional context

I tried initializing the bc-read simulation from chk80000 and chk80010, and it fails for both checkpoints.

For the record, the timing of chk80010 looks like Step: 80011 dt: 0.05 Time: 6542.942863 to 6542.992863, and the first few timestamps of the BC are:

80000 6542.4428627258721
80001 6542.5721400248704
80002 6542.6563112443018

so the BC definitely has data before the IC. The precursor simulation was run with an adaptive timestamp, but the bc-write and bc-read simulations are currently being run with a small fixed timestamp. I was also hitting the same error with a version of AMR-Wind that I compiled from main in October.

@rybchuk rybchuk added the bug:amr-wind Something isn't working label Nov 11, 2024
@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 12, 2024

I did some more investigating, but have not been able to figure out a quick fix. I changed the time tolerance from 1e-12 to the aggressive 1e-4, and that didn't work.

  what():  Assertion `std::abs(time - m_in_data.tinterp()) < 1e-4' failed, file "/kfs2/projects/ai4wind/orybchuk/exawind-manager/environments/ai4wind-nov24/amr-wind/amr-wind/wind_energy/ABLBoundaryPlane.cpp", line 900

I also added some print statements into ABLBoundaryPlane.cpp to see the values of everything:

    amrex::Print() << "time: " << time << std::endl;
    amrex::Print() << "m_in_data.tn(): " << m_in_data.tn() << std::endl;
    amrex::Print() << "m_in_data.tnp1(): " << m_in_data.tnp1() << std::endl;
    amrex::Print() << "m_in_data.tinterp(): " << m_in_data.tinterp() << std::endl;
    amrex::Print() << "std::abs(time - m_in_data.tinterp()): " << std::abs(time - m_in_data.tinterp()) << std::endl;
    AMREX_ALWAYS_ASSERT(
        ((m_in_data.tn() <= time) || (time <= m_in_data.tnp1())));
    AMREX_ALWAYS_ASSERT(std::abs(time - m_in_data.tinterp()) < 1e-6);

Surprisingly, the values of time and m_in_data.tinterp() are printed out as identical and the difference truly appears to be 0:

time: 6542.942863
m_in_data.tn(): -1
m_in_data.tnp1(): -1
m_in_data.tinterp(): 6542.942863
std::abs(time - m_in_data.tinterp()): 0

I assume that m_in_data.tn() and m_in_data.tnp1() are supposed to be -1?


One more update: if I comment out AMREX_ALWAYS_ASSERT(std::abs(time - m_in_data.tinterp()) < 1e-6);, then my simulation does run through to timestepping. It's unclear yet if I'm seeing any unphysical artifacts.

@mbkuhn
Copy link
Contributor

mbkuhn commented Nov 12, 2024

I'm confident I found the problem. The time passed to a vof fillpatch (within the advection step) is not valid, leading to a problem with this assertion. I'm not sure what the best solution is, but to confirm this is the problem, you could try running without vof as one of the bndry variables. This would also explain why this problem is showing up for offshore ABLs in particular.

@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 12, 2024

Yep, I can confirm! If I run with my code that was compiled back in October and I set ABL.bndry_var_names = velocity temperature, the code makes it through the first few timesteps without any errors.

@mbkuhn
Copy link
Contributor

mbkuhn commented Nov 12, 2024

Putting these here for myself and other developers:
https://github.com/Exawind/amr-wind/blob/main/amr-wind/equation_systems/vof/SplitAdvection.cpp#L102
https://github.com/Exawind/amr-wind/blob/main/amr-wind/equation_systems/vof/vof_ops.H#L73

both of these would lead to breaking the assertion. The second one is easy to fix. The first one is a problem for two reasons:

  1. easier issue - need to pass in the time to SplitAdvection so it can have a valid value.
  2. harder issue - vof boundaries need to be at n, not n+1/2, defying the new way I set up the boundary fill times. Probably will need some type of specific exception to leave the boundaries at n while still doing fillpatch stuff elsewhere, kind of like what I did for the sibling fields fill.

@mbkuhn
Copy link
Contributor

mbkuhn commented Nov 12, 2024

When you get a chance, could you also check with the more recent code, and just taking vof out of the input bndry variable names? No hurry

@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 12, 2024

It's a quick thing for me to check :) I un-uncommented the AMREX_ALWAYS_ASSERT with yesterday's code, and I can confirm that that also runs through the first few timestamps without issue. And, the code fails if I add vof back into the BCs.

@hgopalan
Copy link
Contributor

Adding to our previous discussion, I randomly get a code crash at :

AMREX_ALWAYS_ASSERT(
    ((m_in_data.tn() <= time) || (time <= m_in_data.tnp1())));

It happens on both CPUs and GPUs. I would say 10% of my cases see this issue.

@mbkuhn
Copy link
Contributor

mbkuhn commented Nov 14, 2024

@hgopalan could you make a separate issue with a testable example? That's a different problem than what Alex is seeing.

@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 20, 2024

So... when I submitted the issue, I unfortunately set up my inflow-outflow simulation incorrectly. While I set ABL.bndry_io_mode = 1, I was also running geometry.is_periodic = 1 1 0. As a result, it's ambiguous if I was accidentally running my inflow-outflow simulation as a periodic simulation.

I'm now trying to run the inflow-outflow simulation with:

ABL.bndry_io_mode                        = 1
ABL.bndry_planes                         = xlo
...
geometry.is_periodic                     = 0   1   0

xlo.type                                 = mass_inflow
xlo.density                              = 1.027
xlo.temperature                          = 265.0
xlo.tke                                  = 0.0
xhi.type                                 = pressure_outflow

The simulation now errors out with

AMReX (24.09-45-g6d9c25b989f1) initialized
Initializing AMR-Wind ...
ParmParse::sgetarr ParmParse::sgetarr(): xlo.vof not found in table

This error is being reported by amrex/Src/Base/AMReX_ParmParse.cpp. Do you have any suggestions on what to do about xlo.vof? I don't see any examples of this line in the repo. Also, what should I do about xlo.density when MultiPhase expects two densities? And should xhi.type still be pressure_outflow?

@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 20, 2024

Harish suggested setting xlo.vof = 0 (where 0 is a dummy value), and my simulation is now running again :) I hope to have results soon with a longer inflow-outflow simulation

@mbkuhn
Copy link
Contributor

mbkuhn commented Nov 20, 2024

Thanks, @hgopalan, that's a great suggestion for this scenario

@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 20, 2024

I ran the simulation out for 200 iterations (10 seconds). Based off the vof field by the end of the simulation, the numerical beach is indeed damping the waves. Here's pressure at that last step too.

Unfortunately, u, v, w, and T look funny at the outflow. This feels like I botched something with the inflow-outflow .i file though. I haven't run inflow-outflow simulations in a bit, so let me confirm that I remember how to properly run them on land.

@mbkuhn
Copy link
Contributor

mbkuhn commented Nov 20, 2024

are you using ABL Mean Boussinesq in there too?

@rybchuk
Copy link
Contributor Author

rybchuk commented Nov 25, 2024

Nope, good catch.

I've added ABL Mean Boussinesq in there. I've also run the calc_inflow_stats.py script but commented out all the ABL lines because these values feel like they're specific to land-based ABLs with Monin-Obukhov.

ICNS.source_terms                        = BoussinesqBuoyancy CoriolisForcing GeostrophicForcing GravityForcing ABLMeanBoussinesq
##--------- Additions by calc_inflow_stats.py ---------#
# ABL.wall_shear_stress_type = "local"
# ABL.inflow_outflow_mode = true
# ABL.wf_velocity = -0.04061319424974897 0.01560499433783477
# ABL.wf_vmag = 0.05284818547325939
# ABL.wf_theta = 265.0000040814566
# BodyForce.magnitude = 0.0 0.0 0.0
BoussinesqBuoyancy.read_temperature_profile = true
BoussinesqBuoyancy.tprofile_filename = avg_theta.dat
##-----------------------------------------------------#

After making that change, my inflow-outflow simulation ran for the full requested two minutes without any obvious issues (u, v, w, T)!

Whereas the above simulations didn't have any refinement zones, I also ran ~20 seconds of an inflow-outflow simulation with two levels of refinement. The u, v, w, and T fields also look good there!

I've attached the read and write .i files below for reference for posterity.
offshore_io.zip


I'm going to do some more futzing around now, where I take BCs from a spun-up simulation and I try to drive a simulation with quiescent ICs, but that's not relevant to this issue. The VoF fillpatch workaround that I've been using looks like it works great for the time being, so there's no pressing need from me to thinking about the VoF fillpatch code more rigorously. Thanks for the help!

@marchdf
Copy link
Contributor

marchdf commented Nov 25, 2024

That's awesome news @rybchuk . Good to close the issue then?

@mbkuhn mbkuhn linked a pull request Dec 23, 2024 that will close this issue
13 tasks
@mbkuhn
Copy link
Contributor

mbkuhn commented Dec 23, 2024

@rybchuk, do you think you could try out the above pull request to see if it passes the asserts when you include vof as a boundary variable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug:amr-wind Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants