Redesigning `Simulation` and `run!` #1138

glwagner · 2020-11-03T15:25:50Z

@ali-ramadhan and I have discussed a potential redesign of Simulation and run! that has several key features:

1. Coalesce all of the "callbacks" (arbitrary functions that are executed during a time-stepping loop) other than OutputWriters into a single list. Existing objects that we can classify or redesign as callbacks are stop criteria, TimeStepWizard, and diagnostics. All callbacks are required to possess callback.schedule, and we can provide convenience objects for coupling simple callback functions to an AbstractSchedule.
2. Within the time-stepping loop, execute simulation.callbacks prior to writing output. This ensures that data calculated in a callback, such as WindowedTimeAverage and other non-local-in-time output, can be output during the same time-step.
3. Wrap the time-stepping loop inside a try / catch block and throw exceptions to stop a simulation. This generalizes the concept of stopping a simulation and also means that a simulation can be stopped inside any callback. Further, when an AbstractStopException is thrown we will loop over the OutputWriter callbacks a final time, passing the exception into the OutputWriter callback functions. This allows output behavior specialized on the type of exception. For example:

   - If NaNsDetected is thrown, no output will be written.
   - If WallTimeExceeded is thrown, the checkpointer may write output.

4. simulation.Δt always becomes a number corresponding to the next time-step (rather than sometimes being a TimeStepWizard). The TimeStepWizard callback changes this number on its schedule. Otherwise, the time-step is held constant. This changes the API, since the initial time-step must now be provided to Simulation.
5. (Somewhat unrelated, but enabled by the new pattern) Use a new function align!(simulation.Δt, writer.schedule, simulation.model) to adjust a subsequent time-step if output writing is scheduled. This ensures output writing on TimeIntervals will always be on schedule rather than chronically late as it is now. Since output writers are called after all the other callbacks, they get the final say as to the next time-step.
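A minimal sketch of how items 1 to 3 might fit together, assuming callbacks are plain functions of the simulation and output writers are functions of the simulation and an exception (or nothing during regular stepping). ToySimulation, toy_run!, StopIterationExceeded, stop_at_iteration, and checkpointer are hypothetical names for illustration, not Oceananigans API:

```julia
# Hypothetical stand-ins for illustration; none of this is Oceananigans API.
abstract type AbstractStopException <: Exception end
struct NaNsDetected <: AbstractStopException end
struct StopIterationExceeded <: AbstractStopException end

mutable struct ToySimulation
    iteration :: Int
    Δt :: Float64
    time :: Float64
    callbacks :: Vector{Any}       # functions of the simulation
    output_writers :: Vector{Any}  # functions of (simulation, exception or nothing)
end

function toy_run!(sim)
    try
        while true
            sim.time += sim.Δt       # stand-in for taking one model time step
            sim.iteration += 1
            for callback in sim.callbacks
                callback(sim)        # callbacks run before output (item 2)
            end
            for writer in sim.output_writers
                writer(sim, nothing)
            end
        end
    catch err
        err isa AbstractStopException || rethrow()
        for writer in sim.output_writers
            writer(sim, err)         # final pass, specialized on the exception type
        end
    end
    return sim
end

# A stop criterion is just a callback that throws.
stop_at_iteration(n) = sim -> sim.iteration >= n && throw(StopIterationExceeded())

# A toy "checkpointer" that records the iterations at which it would write.
checkpoints = Int[]
checkpointer(sim, ::Nothing) = nothing                 # nothing to do mid-run
checkpointer(sim, ::NaNsDetected) = nothing            # do not save a NaN state
checkpointer(sim, ::AbstractStopException) = push!(checkpoints, sim.iteration)

sim = ToySimulation(0, 0.1, 0.0, Any[stop_at_iteration(3)], Any[checkpointer])
toy_run!(sim)
```

Because the stop criterion is an ordinary callback, any other callback could stop the run the same way, and dispatch on the exception type decides what each writer does in the final pass.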

These changes will break the existing API. Notably, we'll use

push!(simulation.callbacks, TimeStepWizard(cfl=1))

rather than setting simulation.Δt to be a TimeStepWizard. This is probably an improvement. We can still provide the keywords stop_time and stop_iteration in the Simulation constructor as a convenience; however, rather than being properties of Simulation, they will create callback objects that are scheduled every iteration and added to simulation.callbacks.

I think we should also provide a few other features, like the ability to pass Δt to run!, perhaps along with a few other run!-specific objects.


glwagner · 2020-12-04T13:56:40Z

NaNChecker will also have to be moved from diagnostics to callbacks, see discussion on #1198.

We also may want to make the list of callbacks an OrderedDict, so that we can write something like

simulation.callbacks[:NaNChecker].schedule = IterationInterval(1)

If we can overload getproperty for this custom OrderedDict, we can also support the syntax

simulation.callbacks.NaNChecker.schedule = IterationInterval(1)
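A rough sketch of the getproperty idea, assuming a small insertion-ordered wrapper type rather than a real OrderedDict (which lives in an external package). NamedCallbacks and ToyCallback are hypothetical names, and IterationInterval is redefined here only to keep the example self-contained:

```julia
# Hypothetical sketch; `NamedCallbacks` and `ToyCallback` are illustration-only names.
struct IterationInterval
    interval :: Int
end

mutable struct ToyCallback      # mutable so that `schedule` can be reassigned
    func :: Function
    schedule :: IterationInterval
end

# Insertion-ordered container: a vector of names plus a dictionary for lookup.
struct NamedCallbacks
    names :: Vector{Symbol}
    entries :: Dict{Symbol, ToyCallback}
end

NamedCallbacks() = NamedCallbacks(Symbol[], Dict{Symbol, ToyCallback}())

# Internals are reached with `getfield`, because `getproperty` is redirected below.
function Base.setindex!(c::NamedCallbacks, callback::ToyCallback, name::Symbol)
    entries = getfield(c, :entries)
    haskey(entries, name) || push!(getfield(c, :names), name)
    entries[name] = callback
    return c
end

Base.getindex(c::NamedCallbacks, name::Symbol) = getfield(c, :entries)[name]
Base.getproperty(c::NamedCallbacks, name::Symbol) = c[name]
Base.propertynames(c::NamedCallbacks) = Tuple(getfield(c, :names))

callbacks = NamedCallbacks()
callbacks[:NaNChecker] = ToyCallback(sim -> nothing, IterationInterval(100))

# Both syntaxes reach the same object.
callbacks[:NaNChecker].schedule = IterationInterval(10)
callbacks.NaNChecker.schedule = IterationInterval(1)
```

One consequence of this design is that a callback cannot be reached by dot syntax if its name is needed for something else on the container, so the bracket syntax remains the general form.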

ali-ramadhan · 2020-12-06T18:06:31Z

I'll add a sixth feature (from #1251):

Time step should be updated before simulation callback/progress function so that we print the actual next time step in the simulation callback/progress function. Even better would be to define a new function next_time_step(simulation) since the next time step might be shortened due to alignment.
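As a sketch of what such a function might compute, assuming output is scheduled at multiples of a fixed time interval. The two-argument signature and the helper time_until_next_output are hypothetical; the issue only proposes the name next_time_step(simulation):

```julia
# Time remaining until the next multiple of `interval` strictly after `time`.
function time_until_next_output(time, interval)
    next_output_time = (floor(time / interval) + 1) * interval
    return next_output_time - time
end

# Hypothetical signature: the nominal Δt, shortened if output is due sooner.
next_time_step(sim, output_interval) =
    min(sim.Δt, time_until_next_output(sim.time, output_interval))

sim = (Δt = 0.5, time = 0.75)   # a NamedTuple standing in for a simulation
next_time_step(sim, 1.0)        # 0.25 rather than the nominal 0.5
```

A progress function that prints next_time_step(sim, ...) instead of sim.Δt would then report the step that is actually about to be taken. A real implementation would also need to guard against floating-point round-off when the time is very close to an output time.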

ali-ramadhan · 2020-12-06T18:17:40Z

I'll also add a seventh feature (from #1250):

Run simulation callback/progress function at iteration 0. Right now the callback/progress function is first called when iteration = iteration_interval, but I think we actually want to run the progress function at iteration 0, as it helps provide more feedback to the user at a time of heavy compilation.

@glwagner also suggested that all the callbacks should be initialized at the beginning of run.

glwagner · 2020-12-06T18:37:03Z

To clarify, there are two possibilities:

1. When run! begins, execute [initialize!(callback) for callback in simulation.callbacks].
2. When run! begins, execute the callbacks themselves.

Or both.

Which do we want?

There won't be a progress function if this issue is resolved; we'll have a generic list of callbacks that includes default stop criteria as well as user-defined callbacks and stop criteria.

ali-ramadhan · 2020-12-06T18:41:34Z

I was envisioning option 2 (or both if we feel that initialize!(callback) is an important option).

An important practical reason to have option 2 is that some callbacks are only called very infrequently. If a callback's schedule is IterationInterval(1_000_000), it's better to find out about a typo or mistake in it right away than after the simulation has run for 1,000,000 iterations.
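A small sketch of "both", assuming callbacks carry a func field and that initialize! has a no-op fallback. StartupCallback and start_run! are hypothetical names used only for illustration:

```julia
struct IterationInterval
    interval :: Int
end

struct StartupCallback{F}
    func :: F
    schedule :: IterationInterval
end

# Option 1: a no-op fallback that specific callback types could extend.
initialize!(callback, sim) = nothing

# Option 2 (and "both"): run every callback once before the first time step,
# regardless of its schedule, so that mistakes surface immediately.
function start_run!(sim, callbacks)
    for callback in callbacks
        initialize!(callback, sim)
        callback.func(sim)
    end
    return nothing
end

messages = String[]
report = StartupCallback(sim -> push!(messages, "iteration $(sim.iteration)"),
                         IterationInterval(1_000_000))

# A NamedTuple stands in for the simulation; the callback runs once at start.
start_run!((iteration = 0,), [report])
```

With this pattern, a callback containing a misspelled field name would throw during start_run! rather than a million iterations into the run.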

glwagner · 2020-12-06T18:45:19Z

That makes sense. We can also let users assume that "iteration = 0" means "initialization". This doesn't cover cases where simulations are run from iterations other than 0 though.

glwagner · 2021-02-16T13:44:40Z

A vaguely related feature that could be addressed in this PR is "update_state! callbacks", which are called at the end of update_state! (rather than at the end of a time-step on some schedule, as simulation.callbacks would be). These could be used to implement hacks like overwriting model.diffusivities, which is a use-case discussed at

#1361

cc @tomchor @ali-ramadhan

francispoulin · 2021-02-16T13:58:04Z

This sounds like good progress here and thank you for doing it.

On a perhaps tangential note, I noticed that when I tried the wizard and Δt was computed to be 0, I received a warning but the program kept on running. This created an infinite loop that wouldn't stop. If the time step is 0, or perhaps sufficiently small, I presume we want the simulation to stop. Should this be happening already?

tomchor · 2021-02-16T14:14:11Z

@francispoulin I think you can give a minimum Δt to TimeStepWizard that avoids that. It's zero by default, but you can set something like min_Δt=0.05 (or something else unreasonably small) to achieve what you want.

ali-ramadhan · 2021-02-16T14:16:02Z

@francispoulin Ah the warning about Δt = 0 is probably specific to the IncompressibleModel but is printed in the time stepper so it would show up for other models as well.

I think we agreed in #1255 to avoid calling time_step!(model, 0) since time_step! is not a user-facing function.

Taking this thought further: I guess the responsibility for not taking zero time steps lies with the Simulation and with the user?
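One way the Simulation side of that responsibility could look, as a sketch only. InvalidTimeStep, clamp_time_step, and validate_time_step are hypothetical names; the only thing taken from the thread is the idea of a minimum Δt:

```julia
# Hypothetical exception and helpers, for illustration only.
struct InvalidTimeStep <: Exception
    Δt :: Float64
end

# What a wizard-like callback might do: keep Δt within user-provided bounds.
clamp_time_step(Δt; min_Δt = 0.0, max_Δt = Inf) = clamp(Δt, min_Δt, max_Δt)

# What a run loop might do before each step: refuse zero, negative, or non-finite Δt.
function validate_time_step(Δt)
    (isfinite(Δt) && Δt > 0) || throw(InvalidTimeStep(Δt))
    return Δt
end

Δt = clamp_time_step(0.0, min_Δt = 0.05)   # the wizard's 0 is lifted to the minimum
validate_time_step(Δt)
```

With the default min_Δt of zero the clamp alone does not help, so the validation step is what would turn an endless loop of zero-length steps into a stopped simulation.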

francispoulin · 2021-02-16T14:32:05Z

Thanks @ali-ramadhan. There is a ShallowWaterModel-specific version of the wizard, but it uses a lot that was developed for IncompressibleModel. I guess I should look at that in more detail and see if anything needs to be generalized.

I agree that Simulation should hopefully protect the user from this and I hope that the user would want to avoid this as well.

glwagner · 2021-07-17T14:06:38Z

I'd like to resurrect this issue. We've implemented 5, but we don't have callbacks.

I think we should just add a callback layer to Simulation to replace simulation.progress and address whether diagnostics should become callbacks later.

The key change is that iteration_interval would no longer be an argument to Simulation. Instead we would refactor all the examples and validation tests to implement logging and adaptive time stepping via callbacks. Because of that, this ends up being a big API change. A barebones callback feature might be

struct Callback{F, S}
    func :: F
    schedule :: S
end

Usage would be something like

progress(sim) = println("Iteration $(sim.model.clock.iteration)")
progress_printer = Callback(progress, schedule = IterationInterval(100))

wizard = TimeStepWizard(cfl=0.1, initial_dt = 2minutes, schedule = IterationInterval(10))

simulation = Simulation(model, stop_time=2hours, callbacks = [progress_printer, wizard])

In other words, the TimeStepWizard becomes a callback with a schedule, and we can print progress and adapt the time step on different schedules. What do people think about this API?
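To make the proposal concrete without depending on Oceananigans, here is a self-contained sketch. It assumes schedules are callable objects that return true when the callback is due, and adds the keyword constructor that the usage above implies. SketchCallback and run_callbacks! are hypothetical names, and a NamedTuple stands in for the simulation:

```julia
struct IterationInterval
    interval :: Int
end

# Assumption of this sketch: a schedule is callable and says whether it is due.
(schedule::IterationInterval)(sim) = sim.iteration % schedule.interval == 0

struct SketchCallback{F, S}
    func :: F
    schedule :: S
end

# Keyword constructor matching the usage `Callback(progress, schedule = ...)`.
SketchCallback(func; schedule = IterationInterval(1)) = SketchCallback(func, schedule)

# Run each callback whose schedule is due at the current iteration.
function run_callbacks!(sim, callbacks)
    for callback in callbacks
        callback.schedule(sim) && callback.func(sim)
    end
    return nothing
end

printed = Int[]
progress(sim) = push!(printed, sim.iteration)   # records instead of printing
progress_printer = SketchCallback(progress, schedule = IterationInterval(100))

for iteration in 1:300
    run_callbacks!((iteration = iteration,), [progress_printer])
end
# `printed` now holds the iterations 100, 200, and 300.
```

Since each callback owns its schedule, a second callback on IterationInterval(10) could sit in the same list and run independently of the progress printer.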

glwagner · 2022-01-20T15:53:50Z

This was resolved by #1971

This was referenced Nov 4, 2020

Incorrect warning about simulations running forever in Simulation constructor #1124

Closed

Cleaning up Simulation and run! #1095

Closed

glwagner added the abstractions 🎨 Whatever that means label Nov 4, 2020

tomchor mentioned this issue Nov 24, 2020

Kill simulation after a crash #1196

Closed

This was referenced Nov 24, 2020

Add NaN checker to simulations by default #1198

Merged

Align time step with output writing and simulation stop #1213

Merged

glwagner mentioned this issue Dec 2, 2020

Roadmap to version 1.0 #1234

Closed

8 tasks

ali-ramadhan mentioned this issue Dec 3, 2020

Lagrangian particle tracking #1091

Merged

7 tasks

This was referenced Dec 6, 2020

Time step should be updated before simulation callback/progress function #1251

Closed

Run simulation callback/progress function at iteration 0? #1250

Closed

Taking a time step of 0 causes the model to NaN out #1254

Closed

ali-ramadhan mentioned this issue Dec 8, 2020

Tests fail because shallow water model with h=0 blows up when time stepped #1262

Closed

ali-ramadhan mentioned this issue Feb 8, 2021

Convective adjustment and thinking about vertically implicit diffusion #1342

Closed

ali-ramadhan mentioned this issue Feb 16, 2021

Possibly out-of-date docs page #1358

Closed

ali-ramadhan pinned this issue Feb 19, 2021

ali-ramadhan unpinned this issue Feb 19, 2021

glwagner mentioned this issue Sep 9, 2021

New Simulation API: more callbacks, less "progress" #1971

Merged

glwagner closed this as completed Jan 20, 2022
