Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesigning Simulation and run! #1138

Closed
glwagner opened this issue Nov 3, 2020 · 13 comments
Closed

Redesigning Simulation and run! #1138

glwagner opened this issue Nov 3, 2020 · 13 comments
Labels
abstractions 🎨 Whatever that means

Comments

@glwagner
Copy link
Member

glwagner commented Nov 3, 2020

@ali-ramadhan and I have discussed a potential redesign of Simulation and run! that has several key features:

  1. Coalesce all of the "callbacks" (arbitrary functions that are executed during a time-stepping loop) other than OutputWriters into a single list. Current objects that we can classify / redesign as callbacks are: stop criteria, TimeStepWizard, and diagnostics. All callbacks are required to possess callback.schedule, and we can provide convenience objects for coupling simple callback functions to AbstractSchedule.

  2. Within the time-stepping loop, execute simulation.callbacks prior to writing output. This ensures that data calculated in a callback can be output during the same time-step, such as WindowedTimeAverage and other non-local-in-time output.

  3. Wrap the time-stepping loop inside a try / catch block and throw exceptions to stop a simulation. This generalizes the concept of stopping a simulation and also means that a simulation can be stopped inside any callback. Further, when an AbstractStopException is called we will loop over the OutputWriter callbacks a final time, passing the exception into the OutputWriter callback functions. This allows output behavior specialized on the type of exception. For example:

  • If NaNsDetected is thrown, no output will be written.
  • If WallTimeExceeded is thrown, the checkpointer may write output.
  1. simulation.Δt becomes a number corresponding to the next time-step, always (rather than sometimes being a TimeStepWizard). The TimeStepWizard callback changes this number on its schedule. Otherwise, the time-step is held constant. This changes the API, since the initial time-step must now be provided to Simulation.

  2. (Somewhat unrelated, but enabled by the new pattern) Use a new function align!(simulation.Δt, writer.schedule, simulation.model) to adjust a subsequent time-step if output writing is scheduled. This ensures output writing on TimeIntervals will always be on schedule rather than chronically late as it is now. Since output is called after all the other callbacks, the output writers get the final say as to the next time-step.

These changes will break the existing API. Notably, we'll use

push!(simulation.callbacks, TimeStepWizard(cfl=1))

rather than setting simulation.Δt to be a TimeStepWizard. This is probably an improvement. We can still provide the keywords stop_time and stop_iteration in the Simulation constructor as a convenience; however rather than being properties of Simulation we will create callback objects that get scheduled every iteration and add them to simulation.callbacks.

I think we should also provide a few other features, like to ability to pass Δt to run!, perhaps along with a few other run!-specific objects.

@glwagner
Copy link
Member Author

glwagner commented Dec 4, 2020

NaNChecker will also have to be moved from diagnostics to callbacks, see discussion on #1198.

We also may want to make the list of callbacks an OrderedDict, so that we can write something like

simulation.callbacks[:NaNChecker].schedule = IterationInterval(1)

If can overload getproperty for these custom OrderedDict we can also support the syntax

simulation.callbacks.NaNChecker.schedule = IterationInterval(1)

@ali-ramadhan
Copy link
Member

I'll add a sixth feature (from #1251):

  1. Time step should be updated before simulation callback/progress function so that we print the actual next time step in the simulation callback/progress function. Even better would be to define a new function next_time_step(simulation) since the next time step might be shortened due to alignment.

@ali-ramadhan
Copy link
Member

I'll also add a seventh feature (from #1250):

  1. Run simulation callback/progress function at iteration 0. Right now the callback/progress function is only called for the first time when iteration = iteration_interval but I think we actually want to run the progress function at iteration 0 as it helps provide more feedback to the user at a time of heavy compilation.

@glwagner also suggested that all the callbacks should be initialized at the beginning of run.

@glwagner
Copy link
Member Author

glwagner commented Dec 6, 2020

To clarify, there's two possibilities:

  1. When run! begins, execute [initialize!(callback) for callback in simulation.callbacks]
  2. When run! begins, execute the callbacks themselves.

or both.

Which do we want?

There won't be a progress function if this issue is resolved; we'll have a generic list of callbacks that includes default stop criteria as well as user-defined callbacks and stop criteria.

@ali-ramadhan
Copy link
Member

I was envisioning option 2 (or both if we feel that initialize!(callback) is an important option).

An important practical reason to have option 2 is that some callbacks will only be called very infrequently so it's better to quickly find out that there's a typo or mistake in your callbacks before the simulation runs for 1,000,000 iterations if the callback's schedule is IterationInterval(1_000_000).

@glwagner
Copy link
Member Author

glwagner commented Dec 6, 2020

That makes sense. We can also let users assume that "iteration = 0" means "initialization". This doesn't cover cases where simulations are run from iterations other than 0 though.

@glwagner
Copy link
Member Author

glwagner commented Feb 16, 2021

A vaguely related feature that could be addressed in this PR are "update_state! callbacks" that are called at the end of update_state! (rather than called at the end of a time-step on some schedule as simulation.callbacks would be). This could be used to implement hacks like overwriting model.diffusivities, which is a use-case discussed at

#1361

cc @tomchor @ali-ramadhan

@francispoulin
Copy link
Collaborator

This sounds like good progress here and thank you for doing it.

On a perhaps tangential note, I noticed when I tried the wizard and \Delta t was computed to be 0 I received a warning, but the program kept on running. This created an infinite loop that wouldn't stop. If the time step is 0, or perhaps sufficiently small, I presume we want the simuation to stop. Should this be happening already?

@tomchor
Copy link
Collaborator

tomchor commented Feb 16, 2021

@francispoulin I think you can give a minimum Δt to TimeStepWizard that avoids that. It's zero by default, but you can set something like min_Δt=0.05 (or something else unreasonably small) to achieve what you want.

@ali-ramadhan
Copy link
Member

ali-ramadhan commented Feb 16, 2021

@francispoulin Ah the warning about Δt = 0 is probably specific to the IncompressibleModel but is printed in the time stepper so it would show up for other models as well.

I think we agreed in #1255 to avoid calling time_step!(model, 0) since time_step! is not a user-facing function.

Taking this thought further: I guess the responsibility for not taking zero time steps lies with the Simulation and with the user?

@francispoulin
Copy link
Collaborator

Thanks @ali-ramadhan . There is a ShallowWaterModel specific version of the wizard but it uses a lot that was developed for IncompressibleModel. I guess I should look at that in more detail and see if anything needs to be generalized.

I agree that Simulation should hopefully protect the user from this and I hope that the user would want to avoid this as well.

@ali-ramadhan ali-ramadhan pinned this issue Feb 19, 2021
@ali-ramadhan ali-ramadhan unpinned this issue Feb 19, 2021
@glwagner
Copy link
Member Author

I'd like to resurrect this issue. We've implemented 5, but we don't have callbacks.

I think we should just add a callback layer to Simulation to replace simulation.progress and address whether diagnostics should become callbacks later.

The key change is that iteration_interval would no longer be an argument to Simulation. Instead we would refactor all the examples and validation tests to implement logging and adaptive time stepping via callbacks. Because of that this ends up being a big API change. A barebones callback feature might be

struct Callback{F, S}
    func :: F
    schedule :: S
end

Usage would be something like

progress(sim) = println("Iteration $(sim.model.clock.iteration)")
progress_printer = Callback(progress, schedule = IterationInterval(100))

wizard = TimeStepWizard(cfl=0.1, initial_dt = 2minutes, schedule = IterationInterval(10))

simulation = Simulation(model, stop_time=2hours, callbacks = [progress_printer, wizard])

In other words, the TimeStepWizard becomes a callback with a schedule, and we can print progress and adapt the time step on different schedules. What do people think about this API?

@glwagner
Copy link
Member Author

This was resolved by #1971

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
abstractions 🎨 Whatever that means
Projects
None yet
Development

No branches or pull requests

4 participants