Optimization #92

Draft · stijnvanhoey wants to merge 7 commits into master

Conversation
stijnvanhoey (Contributor):

  • I have checked to ensure there aren't other open Pull Requests for the same update/change
  • I made sure all new and existing data sets have a description in the data/README.md file
  • I have updated the documentation accordingly.

This PR is a blueprint and first attempt to provide an optimization functionality taking into account the extra_time or lag_time concept requested by @twallema:

  • it relies on xarray to define the data to fit against. Currently this is a single array, but it makes the approach extendable to fitting multiple measured variables at the same time.
  • I renamed the current mcmc.py file to optimize, as it does not yet contain any MCMC functionality.
  • I'm not completely satisfied with the split between model parameters and the other parameter (lag_time), but it is a first attempt to make the same sse function usable with both a fixed and an optimized lag_time (see the sketch at the end of this comment).

See the notebook 0.2-stijnvanhoey-refactor-optimization.ipynb for a usage example, which I tried to keep close to the current fit_pso function.

@twallema we should NOT keep both sse and SSE, nor both fit and fit_pso, but as this is a test and work in progress, I kept both for the time being.

Note that I only worked on the sse optimization as such, not on the MCMC part.
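
For reference, a minimal sketch of what the resulting SSE objective with a lag_time shift could look like, stitched together from the snippets in this PR (the exact signature here is illustrative, not the final API):

import numpy as np

def sse_with_lag(theta, model, data, model_parameters, lag_time):
    # set the proposed parameter values on the model
    for parameter, value in zip(model_parameters, theta):
        model.parameters.update({parameter: value})

    # simulate long enough to cover the lag plus the data period
    output = model.sim(len(data.time) + lag_time)

    # shift the model output by lag_time and compare against the data
    output_to_compare = output.sel(time=slice(lag_time, lag_time - 1 + len(data)))
    return np.sum((data.values - output_to_compare.values) ** 2)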

stijnvanhoey marked this pull request as draft on June 8, 2020 at 21:29.
stijnvanhoey (Contributor, Author):

@twallema feel free to test it out (maybe check out my branch to try).

@jorisvandenbossche, there are still a lot of shortcuts and hardcoded elements, which require further abstraction, so any input is welcome. We could actually make the fit a model method as well...

@@ -54,27 +54,30 @@ def get_COVID19_SEIRD_parameters():
-----------
parameters = get_COVID19_SEIRD_parameters()
"""
abs_dir = os.path.dirname(__file__)
rel_dir = os.path.join(abs_dir, "../../../data/raw/model_parameters")
Contributor: Can you do a merge of master? I think this has already changed.

Contributor (Author): Yes, I updated this morning and merged in the adjustments Jenna made earlier.


# run model
time = len(data.time) + lag_time # at least this length
output = model.sim(time)
Contributor: Could this also run with [-lag_time, time]? (Or does the model not work with negative time?)

Collaborator (@twallema, Jun 15, 2020): solve_ivp from SciPy requires the user to define the times at which the model output should be returned, and these can be negative without any problem. So assigning the time as [-lag_time, ..., time] in the model output xarray, and the time as [0, 1, 2, ..., 7] in the data xarray, should allow pairing without using .values as suggested below. However, this does not really abstract away the lag_time. I'm also wondering whether this approach will work when the times do not match, e.g. when using a discrete-time model with step length 0.235; some form of interpolation will be necessary to match the outputs. I've seen that xarray can be linked to SciPy's interpolation: see http://xarray.pydata.org/en/stable/generated/xarray.DataArray.interp.html
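
For illustration, a minimal sketch of that interpolation idea, assuming the model output is an xarray.DataArray with a "time" coordinate (the arrays here are made up):

import numpy as np
import xarray as xr

# model output at an awkward step length (e.g. 0.235)
model_time = np.arange(0, 8, 0.235)
output = xr.DataArray(np.exp(0.3 * model_time),
                      coords={"time": model_time}, dims="time")

# interpolate the model output onto the (integer) data times
data_time = np.arange(0, 8)
output_on_data_times = output.interp(time=data_time)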

Contributor: The default will be to only align values where the time coordinates match exactly, but you can indeed make this more flexible (e.g. http://xarray.pydata.org/en/stable/generated/xarray.DataArray.reindex.html also allows a simple "nearest" method, or a certain tolerance that should be allowed).
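
And the reindex variant from the link above, continuing the illustrative arrays from the previous sketch:

# snap the model output onto the data times, taking the nearest
# available model time point within a given tolerance
output_on_data_times = output.reindex(time=data_time, method="nearest",
                                      tolerance=0.12)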


# define new parameters in model
for i, parameter in enumerate(model_parameters):
    model.parameters.update({parameter: theta[i]})
Contributor: I am wondering if we should introduce a concept of "cloning" a model, to avoid changing the original one when setting parameters like this for an optimization run.

Contributor (Author): I propose to define this in a separate issue/PR to provide a more generic solution.
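
For illustration only, a minimal sketch of what such a cloning step could look like, assuming the model object can safely be deep-copied (clone_model is a hypothetical helper, not part of this repo):

import copy

def clone_model(model):
    # hypothetical helper: return an independent copy so the optimizer
    # never mutates the parameters of the original model instance
    return copy.deepcopy(model)

# inside the objective function one would then do, e.g.:
# candidate = clone_model(model)
# candidate.parameters.update({parameter: value})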

output_to_compare = subset_output.sel(time=slice(lag_time, lag_time - 1 + len(data)))

# calculate sse -> we could maybe enable more function options on this level?
sse = np.sum((data.values - output_to_compare.values)**2)
Contributor: Do we need to check that the "time" values actually match the provided data? If you adjust the time coordinate of the model output with lag_time, this calculation could be done on the xarray objects directly (so without .values), and then the alignment on the time dimension would be done automatically.
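
A minimal sketch of that automatic alignment, with made-up numbers:

import numpy as np
import xarray as xr

data = xr.DataArray([1.0, 4.0, 9.0], coords={"time": [0, 1, 2]}, dims="time")
output = xr.DataArray([0.5, 1.2, 4.1, 8.7], coords={"time": [-1, 0, 1, 2]}, dims="time")

# xarray aligns on the shared "time" labels (0, 1, 2) before subtracting,
# so the extra lagged point at time=-1 is dropped automatically
sse = ((data - output) ** 2).sum().item()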
