Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate get_incubation_period and get_generation_time due to reimplementation elsewhere #390

Closed
Tracked by #423
sbfnk opened this issue May 15, 2023 · 8 comments · Fixed by #481
Closed
Tracked by #423

Comments

@sbfnk
Copy link
Contributor

sbfnk commented May 15, 2023

The idea of having a parameter database didn't end up implemented here (and is now implemented in epiparameter and taken on further by the WHO). The corresponding functions can be removed and the default values replaced with examples.

@seabbs
Copy link
Contributor

seabbs commented May 15, 2023

The idea of having a parameter database didn't end up implemented here

Rather it was implemented but not taken forward/supported here due to lack of resources.

For more detail epiparameter currently implements a near like for like swap in with functionality implemented here in terms of database lookup but can't actually be made use of as it stands as it doesn't (?) support uncertainty. Even if it did there doesn't seem to be much benefit introducing a dependency on a database when we can make use of a simple built-in example dataset. Users can then freely switch up their data sources without being funnelled into a set ecosystem.

If wanting to depreciate these parts of the package to streamline than a sensible option is to instead provide the current built-ins as example datasets.

This could be sensible as due to lack of resourcing and above similar initiatives starting it is unlikely the database in package idea will be taken forward here and so the current estimates will only get more out of date.

At the same time as making the incubation period and generation time estimates built-in examples it would be sensible to make a similar example for the reporting delay.

Note that this change will break a large number of package implementations and so needs to be carefully broadcast. We should more than likely use lifecycle to broadcast this change for at least a few minor version or one major version.

@seabbs seabbs changed the title deprecate get_incubation_period and get_generation_time deprecate get_incubation_period and get_generation_time due to reimplementation elsewhere May 15, 2023
@seabbs seabbs changed the title deprecate get_incubation_period and get_generation_time due to reimplementation elsewhere Deprecate get_incubation_period and get_generation_time due to reimplementation elsewhere May 15, 2023
@sbfnk
Copy link
Contributor Author

sbfnk commented May 16, 2023

can't actually be made use of as it stands as it doesn't (?) support uncertainty

I think it does via credible intervals (which could be converted to an standard deviation assuming a normal as used here)

@sbfnk sbfnk mentioned this issue Jul 18, 2023
18 tasks
@sbfnk
Copy link
Contributor Author

sbfnk commented Jul 22, 2023

If wanting to depreciate these parts of the package to streamline than a sensible option is to instead provide the current built-ins as example datasets.

Another option would be to demonstrate the workflow a bit better from
literature (/database) parameter estimates -> constructing delays using dist_spec -> arguments to estimate_infections. This would potentially be more instructive for future users and scenarios and could be done in the vignette detailing options for estimate_infections proposed in #430 showing some of the options available without introducing additional dependencies.

@seabbs
Copy link
Contributor

seabbs commented Jul 24, 2023

Another option would be to demonstrate the workflow a bit better from
literature (/database) parameter estimates -> constructing delays

This would introduce another suggest/import throughout the package as we need something use in examples and lock the users into a workflow I am not sure we have enough information about to make a judgement call on.

I would rather keep the workflow simple and clearly be example based. Once the landscape has matured we could have a vignette bringing together resources we like in order to signpost to users - its not clear to me what those are at the moment.

@sbfnk
Copy link
Contributor Author

sbfnk commented Jul 24, 2023

I don't think there's a need to introduce another suggest/import (and agree they should be considered very carefully). What I meant was that instead of providing examples (i.e. here's an example distribution) it might be more useful (or perhaps complementary) to explain the workflow a bit better, where one really important part is the question of where I get my parameter estimates from (and what I need). Options being: literature, with caveats on how they're estimated / discretisation issues etc.; estimate ourselves if raw data is available (using e.g. internal methodology or dynamicaltruncation), etc.

@seabbs
Copy link
Contributor

seabbs commented Jul 24, 2023

I don't think your suggestion will work when it comes to function examples, unit tests etc. Unless we want to manually specify "examples" in each which seems error-prone on the developer side.

I think we do need to explain the workflow better but I am not that happy we have the tools to do so at the moment which is where the recent work on delay distributions comes in or the resources to do it well enough to be useful. I think this should be moved to another issue IMO as a vignette with this one being focussed on dropping the current get_ functions and the pros and cons of doing so. Moving your nowcasting example into the package as a vignette would go a long way to being helpful here or alternatively the simple version I wrote recently for the CFA.

@sbfnk
Copy link
Contributor Author

sbfnk commented Jul 24, 2023

I don't think your suggestion will work when it comes to function examples, unit tests etc. Unless we want to manually specify "examples" in each which seems error-prone on the developer side.

This is where I'm unsure. In deprecating the get_ functions is it better to give users examples in the form of

generation_time <- example_covid19_generation_time

or

generation_time <- dist_spec(mean = 5, mean_sd = 1, sd = 2, sd_sd = 1, max = 15)

The first is simpler but possibly the second is more like what a user would do (in a workflow where the first step is to investigate appropriate parameters). I can see a case for both but don't have a fully formed opinion which one should be preferred.

@seabbs
Copy link
Contributor

seabbs commented Jul 25, 2023

I like the latter aside from the fact we need to manually specify it everywhere and that is pretty error-prone and so should be avoided.

The former is very common practice. We can put in some really strong warnings to people that it is just an example in the docs and point at the Epi CRAN task review for resources.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants