Execute all notebooks / clear notebook output #1246

Merged
merged 43 commits into develop from remove_example_notebook_output
Jan 15, 2024

Conversation

Doresic
Contributor

@Doresic Doresic commented Dec 13, 2023

All of the example notebooks are stored on GitHub with their output included, which amounts to a lot of figures and text and takes up storage unnecessarily. I've removed the output, but I still need to check that the notebook output is generated for the online docs.
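
A minimal sketch of the output-clearing step, assuming the notebooks live under doc/example/ (the path and the exact tooling used here may differ); it is equivalent to running `jupyter nbconvert --clear-output --inplace` on each notebook:

from pathlib import Path

import nbformat

# strip outputs and execution counts from every example notebook
for path in Path("doc/example").glob("*.ipynb"):
    nb = nbformat.read(str(path), as_version=4)
    for cell in nb.cells:
        if cell.cell_type == "code":
            cell.outputs = []
            cell.execution_count = None
    nbformat.write(nb, str(path))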

@codecov-commenter

codecov-commenter commented Dec 13, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison: base (ebdf99c) 84.31% vs. head (038d1b7) 84.22%.


Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1246      +/-   ##
===========================================
- Coverage    84.31%   84.22%   -0.09%     
===========================================
  Files          151      151              
  Lines        12331    12331              
===========================================
- Hits         10397    10386      -11     
- Misses        1934     1945      +11     


@dweindl dweindl changed the title Clear output of all notebooks (Drafting) Execute all notebooks / clear notebook output (Drafting) Dec 15, 2023
dweindl and others added 4 commits December 15, 2023 14:50
I reduced the notebook output and run time by tweaking small things like the number of starts, rerunning, removing PEtab model compilation output, and adding seeds for reproducibility. Changes per notebook:
- `amici.ipynb` -- TODO, not done yet
- `censored.ipynb` -- reduce n_starts 10 -> 3
- `getting_started.ipynb` -- n_starts 100 -> 10 (do we need more for a nicer waterfall plot?), added seed for reproducibility, Fides verbose=logging.ERROR, create_problem verbose=False (TODO amici.swig_wrappers logging)
- `hierarchical.ipynb` -- change to only create_problem with verbose=False, set a seed for reproducibility
- TODO rest...
I made the following changes in the example notebooks:
- `amici.py` -- verbose=False on model compilation, seed for reproducibility, clear outputs
- `julia.py` -- set number of starts 100 -> 20, set seed for reproducibility, samples 10000 -> 2000, clear outputs
- `nonlinear_monotone` -- verbose=False on model compilation
- `ordinal` -- verbose=False on model compilation
- `petab_import` -- verbose=False on model compilation
- `sampler_study` -- verbose=False on model compilation
- `store` -- verbose=False on model compilation, fides verbose ERROR
- `synthetic_data` -- n_starts 100 -> 20, verbose=False on model compilation
- `workflow_comparison` -- verbose=False on model compilation, n_starts 25 -> 10
@Doresic
Contributor Author

Doresic commented Jan 3, 2024

I reduced the notebook output and run time by tweaking small things like the number of starts, rerunning, removing PEtab model compilation output, and adding seeds for reproducibility. Changes per notebook (a sketch of the common pattern follows the list):

  • amici.py -- verbose=False on model compilation, seed for reproducibility, clear outputs
  • censored.ipynb -- reduce n_starts 10 -> 3
  • getting_started.ipynb -- n_starts 100 -> 10 (do we need more for a nicer waterfall plot?), added seed for reproducibility, Fides verbose=logging.ERROR, create_problem verbose=False (TODO amici.swig_wrappers logging)
  • hierarchical.ipynb -- change to only create_problem with verbose=False, set a seed for reproducibility
  • julia.py -- set number of starts 100 -> 20, set seed for reproducibility, samples 10000 -> 2000, clear outputs
  • nonlinear_monotone -- verbose=False on model compilation
  • ordinal -- verbose=False on model compilation
  • petab_import -- verbose=False on model compilation
  • sampler_study -- verbose=False on model compilation
  • store -- verbose=False on model compilation, fides verbose ERROR
  • synthetic_data -- n_starts 100 -> 20, verbose=False on model compilation
  • workflow_comparison -- verbose=False on model compilation, n_starts 25 -> 10
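
A minimal sketch of the recurring pattern behind these tweaks; the YAML path is hypothetical, the option names are taken from the list above:

import logging
import numpy as np
import pypesto.optimize as optimize
import pypesto.petab

np.random.seed(1)  # fixed seed so start points and samples are reproducible

# suppress the PEtab/AMICI model compilation output
importer = pypesto.petab.PetabImporter.from_yaml("problem.yaml")  # hypothetical path
problem = importer.create_problem(verbose=False)

# fewer starts and a quiet optimizer keep run time and log output small
optimizer = optimize.FidesOptimizer(verbose=logging.ERROR)
result = optimize.minimize(problem=problem, optimizer=optimizer, n_starts=10)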

@Doresic Doresic self-assigned this Jan 3, 2024
@Doresic
Contributor Author

Doresic commented Jan 5, 2024

The docs test is very slow and ultimately times out at this point:

reading sources... [ 56%] example
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
Warning:  Could not parse YAML metadata at line 5 column 1: not an object
Warning:  Could not parse YAML metadata at line 15 column 1: not an object
reading sources... [ 58%] example/amici
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
reading sources... [ 60%] example/censored
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.

I'm not sure what the issue with these frozen modules is. As far as I could find out, it has something to do with Jupyter and Python 3.11. I'm searching for a solution to speed up the example build...

@dweindl
Member

dweindl commented Jan 5, 2024

I'm not sure what the issue with these frozen modules is. As far as I could find out, it has something to do with Jupyter and Python 3.11. I'm searching for a solution to speed up the example build...

Thanks for the cleanup so far.

Don't worry about this message; it's not relevant here. I've silenced it.
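
For reference, one way to silence the pydevd "frozen modules" warning is the environment variable that the log itself suggests; whether the actual fix looks exactly like this is an assumption:

import os

# set before the notebooks are executed, e.g. in doc/conf.py (location assumed)
os.environ["PYDEVD_DISABLE_FILE_VALIDATION"] = "1"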

Further opportunity for optimization:

getting_started.ipynb:

result = optimize.minimize(problem=problem, 
                           optimizer=optimizer_scipy_lbfgsb,
                           engine=engine,
                           n_starts=100)

... we probably can reduce n_starts there.

same notebook: computing profiles for 3 parameters should be sufficient, we don't need all.
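
A sketch of what restricting the profiles could look like; the profile_index argument and its exact form are assumptions about the pypesto profiling API:

import pypesto.profile as profile

# compute profiles for the first three parameters only, instead of all of them
result = profile.parameter_profile(
    problem=problem,
    result=result,
    optimizer=optimizer_scipy_lbfgsb,
    profile_index=[0, 1, 2],
)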

@Doresic
Contributor Author

Doresic commented Jan 5, 2024

Even when running locally (tox -e doc), reading the sources takes too long; it already stalls at the conversion_reaction.ipynb notebook.
I think it might have something to do with this: when I run the notebook alone with Python 3.10, everything runs fine. If I run it with Python 3.11, the first optimization starts printing a lot of these errors:
(screenshot: repeated error messages during the first optimization)
So many, in fact, that the whole notebook freezes.

@Doresic
Contributor Author

Doresic commented Jan 5, 2024

getting_started.ipynb: ... we probably can reduce n_starts there.

same notebook: computing profiles for 3 parameters should be sufficient, we don't need all.

If this is the point I think it is, I left it at 100 start points so that the global optimum found is good enough to yield nice profiles. It doesn't take that long to run (relative to the profiling). But computing only 3 profiles is definitely better.

@dweindl
Member

dweindl commented Jan 5, 2024

a lot of these errors:

see also #1159

@dweindl
Member

dweindl commented Jan 5, 2024

To speed things up:

  • I think we can use a faster model in getting_started; I'm not sure we need anything ODE/PEtab-based in there. We can just reference the other example.
  • I don't think we need to show profiles and sampling using 5 different types of objectives. Quite a few examples can be reduced. E.g., https://pypesto.readthedocs.io/en/latest/example/conversion_reaction.html seems rather redundant and could be removed after moving, say, the ensemble part somewhere else. Maybe the construction of the objective can be moved to the amici notebook, if it isn't already there. Also, for the julia notebook, I think it is sufficient to show optimization; the rest should be straightforward.

To avoid getting held up for too long, I'd suggest pulling a couple of changes out of this PR (started: #1268) and merging what's already good/fast enough.

dweindl added a commit that referenced this pull request Jan 10, 2024
dweindl added a commit that referenced this pull request Jan 10, 2024
#1278)

* Doc: Clear notebook output (fixed parameters, prior definition, store)

A subset of #1246.

* Really remove output / fix typos

* cleanup
@dweindl dweindl changed the title Execute all notebooks / clear notebook output (Drafting) Execute all notebooks / clear notebook output Jan 10, 2024
@Doresic
Contributor Author

Doresic commented Jan 10, 2024

There still isn't enough time for the readthedocs build. It stops at

...
reading sources... [ 79%] example/prior_definition
reading sources... [ 81%] example/sampler_study
reading sources... [ 82%] example/sampling_diagnostics
reading sources... [ 84%] example/store

Command killed due to timeout or excessive memory consumption

I'll take a look at the rest of the notebooks, specifically the sampling ones; I have a feeling they're taking too long to build.

@Doresic
Contributor Author

Doresic commented Jan 11, 2024

New changes:

  • sampling_diagnostics -- n_samples 10000 -> 1000 for both sampling runs
  • sampler_study -- n_samples 1e4 -> 1e3 where an extremely good sampling result is not necessary; added a seed for reproducibility (to get a consistently low burn-in, so that 1e3 samples suffice).
  • custom_objective_function -- using SingleCoreEngine is better for simple problems like these; the multiprocessing overhead is not worth it (7 seconds vs. 0.7 seconds). Left a note that for more expensive problems the MultiProcessEngine is generally recommended (see the sketch after this list).
  • getting_started -- It was still taking a bit too long. Reduced the starts before profiling 100 -> 25 and the samples 10000 -> 1000; the results are basically the same. Additionally, added a note on the number of starts one would generally use.
  • ordinal -- further reduce n_starts from 10 to 3 and use MultiProcessEngine with 3 processes.
  • nonlinear_monotone -- further reduce n_starts from 10 to 3 and use MultiProcessEngine with 3 processes.
  • censored -- use MultiProcessEngine with 3 processes.
  • hierarchical -- small change: set n_procs of the MultiProcessEngine from 6 to 3, as we run only 3 starts.
  • ... Will edit if more changes are made
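
A sketch of the engine choice mentioned for custom_objective_function; the class names come from pypesto.engine, while the surrounding objects (problem, start counts) are illustrative:

import pypesto.engine as engine
import pypesto.optimize as optimize

# cheap toy objective: the multiprocessing overhead dominates (~7 s vs. ~0.7 s),
# so a single core is faster
result = optimize.minimize(problem=problem, n_starts=10,
                           engine=engine.SingleCoreEngine())

# for more expensive objectives, parallelize the starts across processes instead
result = optimize.minimize(problem=problem, n_starts=3,
                           engine=engine.MultiProcessEngine(n_procs=3))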

@Doresic Doresic marked this pull request as ready for review January 11, 2024 11:55
Doresic added a commit that referenced this pull request Jan 11, 2024
Member

@dilpath dilpath left a comment


👍

Contributor

@stephanmg stephanmg left a comment


Thanks for cleaning this up.

Doresic added a commit that referenced this pull request Jan 12, 2024
…tor (#1245)

* Hier. par. plotting for fides

* hierarchical 'hess' bug

* Fix storing of inner parameters in result

Inner parameters were not being stored when the result was stored. This was due to a problem with putting a dictionary into HDF5 format. I've now transformed it into two lists: an INNER_PARAMETERS_VALUES list and an INNER_PARAMETER_NAMES list.

* adjusted documentation of hierarchical opt.

* Fix some type issues

* Small fix

* Save only values, not names

* Extend to all optimizers

* Remove redundant

* Small update

* Update parameters.py

* Small typo

* Fix some bugs, use lists for saving instead

* Fix tests

* Update hierarchical parameter plot test

* Large-scale renaming and restructuring

- Restructuring of the hierarchical scaling+offset method. The base classes used by all non-quantitative and semi-quantitative data types are put in the `base_...` files. All classes related to the scaling+offset method (for relative data) are moved to the `relative` folder, analogously to the other data types.
- Renaming: the naming scheme now follows the data type rather than the method. It's more direct and removes the ambiguity of `OptimalScaling` for ordinal data and `scaling_offset` for relative data.
- Optimal scaling -> ordinal
- Spline approximation -> semiquantitative
- Scaling + offset -> relative

* Update base_problem.py

* Fix Docstrings

* Daniel&Dilan review changes

* Small change

* More Dilan Review changes

* Fix testing

* Fix base test

* Rename notebooks (again)

* Move inner pars to decorator

* Collect decorators for minimize

* Maren review changes

* Rename & fix notebooks

* Include notebook changes

* Fix tests

* Add output

* Include relative into Collector

Included the relative calculator in the inner calculator collector. Now relative data can be used together with any other non-quantitative data type. There are still a lot of TODOs to clean up, but it works nicely on all examples except one, which I think is a bad one (Boehm_mixed_test with nonlinearities: 1 spline, 1 relative, 1 known: very bad convergence... fits look okay-ish...)

* Small documentation update

* Documentation update

* Update the notebooks

* Update spline_approximation.py

* Change data_types from list to set

* Use add_sim_grad_to_opt_grad

* Rename to semiquantitative in example

* mode RES with only RELATIVE data

* Add petab data type validation

* DO TODOs

* Small TODO

* Fix api.rst module name changes

* Update CODEOWNERS module name changes

* Fix underline too short & api.rst

* Fix plotting routines

* Inline literal error

* Testing error

* Testing error 2

* Test literal error

* Fix missing indentation

* Fix inline

* Daniel, Fabian & Stephan review changes

* Notebook changes

* Dilan & Daniel review changes

* Change title of relative_data.ipynb

* Remove observableParameter from solvers

* Notebook changes from #1246

* Fabian review change

---------

Co-authored-by: Lea@Mac <leaseep@gmail.com>
Member

@dweindl dweindl left a comment


👍

@Doresic Doresic merged commit 1c5580d into develop Jan 15, 2024
18 checks passed
@Doresic Doresic deleted the remove_example_notebook_output branch January 15, 2024 10:18