Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#8 [2/4]: Model fit helpers #18

Merged
merged 41 commits into from
Aug 29, 2024
Merged

Conversation

zsusswein
Copy link
Collaborator

@zsusswein zsusswein commented Jun 12, 2024

Important

This PR a trial for using a stacked PR workflow. Do not merge.

This PR builds on #14 to add two helper functions. Both are necessary prerequisites for the actual model fitting. They suggest a basis dimension for use in the model and construct the actual model formula.

The model formula constructor is meant to be easily extensible in the future to add hierarchy, day of week effects, and more. That isn't in scope for this PR, but if there are extensibility concerns it would be good to flag those now.

It would also be good to get feedback on some of the vibes-based decisions in here, The two big ones are the choice of basis dimension suggestion and the three-week-per-penalty-dimension adaptive smooth. The smoothing basis dimension heuristic a guess and I try to be honest in the documentation about that. I do think it's a decent guess, but if there are real concerns that it's not great I can change things up.

Likewise with the choice of three weeks per additional penalty basis. I think it's an approximately reasonable choice, but there will be tradeoffs. For each, I think it's important that we don't make the perfect the enemy of the good. Better to make some reasonable choices here to get to a v0.1 with a full package so we can do some more robust testing.

@zsusswein
Copy link
Collaborator Author

Opening against main to get codecov on CI, but will target #14 as the base for review.

@zsusswein zsusswein marked this pull request as ready for review June 12, 2024 12:25
Copy link

codecov bot commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (5ee9a2c) to head (fda5165).

Additional details and impacted files
@@            Coverage Diff             @@
##              main       #18    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files            3         5     +2     
  Lines           87       208   +121     
==========================================
+ Hits            87       208   +121     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zsusswein zsusswein changed the base branch from main to 08-class-constructor June 12, 2024 12:27
@zsusswein zsusswein requested a review from athowes June 12, 2024 12:43
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks reasonable and I like the majority of the docs a lot.

I'd suggest another pas on the overall docs. I was also a little confused as to why the smoothness penalty was hard coded vs also being treated as user specified?

I think ideally uses should be able to choose between adaptive or non-adaptive and I would just throw a warning if a adaptive spline is used with some small n.

Before merging this PR I would look to see an issue investigating the smoothness defaults for the adaptive spline.

R/RtGam.R Outdated Show resolved Hide resolved
R/RtGam.R Outdated Show resolved Hide resolved
R/RtGam.R Outdated Show resolved Hide resolved
R/RtGam.R Show resolved Hide resolved
R/RtGam.R Outdated Show resolved Hide resolved
R/RtGam.R Show resolved Hide resolved
R/formula.R Outdated Show resolved Hide resolved
@zsusswein
Copy link
Collaborator Author

zsusswein commented Jun 12, 2024

I was also a little confused as to why the smoothness penalty was hard coded vs also being treated as user specified?

This is good feedback. I think for the current functionality, the process is a little convoluted. But, this package is being developed with an eye to implementing day of week effects, partial pooling, etc. When implemented, we'll need to partition the degrees of freedom between these different components -- this process is how I propose to do that. We allow a user-specified global degrees of freedom and handle the partitioning between components internally.

Perhaps we should allow the user to specify individual components' degrees of freedom, but I think it's important we provide a decent first pass there by default.

I think ideally uses should be able to choose between adaptive or non-adaptive and I would just throw a warning if a adaptive spline is used with some small n.

This is good feedback. I'll implement here.

Copy link
Collaborator

@athowes athowes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just getting up to speed with what's going on here so likely not very helpful review comments but in case.

R/RtGam.R Outdated Show resolved Hide resolved
R/RtGam.R Outdated Show resolved Hide resolved
R/RtGam.R Show resolved Hide resolved
@athowes
Copy link
Collaborator

athowes commented Jun 12, 2024

I think ideally uses should be able to choose between adaptive or non-adaptive and I would just throw a warning if a adaptive spline is used with some small n.

I agree that the user should be able to choose between adaptive and non-adaptive. I guess that's not something we have implemented yet (as in so far there is just the one k, and the model isn't fit.)

Just looking, for the group option -- do we want to explicitly link that to "geographic". Is it not also possible to be other possible variables?

When implemented, we'll need to partition the degrees of freedom between these different components -- this process is how I propose to do that. We allow a user-specified global degrees of freedom and handle the partitioning between components internally.

Feels like we might as well have the option for the user to specify the partition too, just the same default argument set-up as we have for the global (i.e. we have a heuristic function, and you can override it if you really want.)

And instead point to `dimensionality_heuristic()` as the sole source
of guidance per @seabbs
To prepare also a user-specific penality basis dimension with associated
heuristic default
This change involves a refactor to the smooth basis dimension selector
name, updates to function signatures, and changes to unit tests.

More usefully, I consolidated the penalty dimension documentation into the
penalty basis dim function, tried to match to doc style of the smooth basis dim
selector, and reduced a fair amount of duplication.
@zsusswein
Copy link
Collaborator Author

zsusswein commented Jun 24, 2024

I'd suggest another pas on the overall docs.

Did some rewriting and condensing

I agree that the user should be able to choose between adaptive and non-adaptive. I guess that's not something we have implemented yet (as in so far there is just the one k, and the model isn't fit.)

I think ideally uses should be able to choose between adaptive or non-adaptive

Done

I would just throw a warning if a adaptive spline is used with some small n.

Will address in the next PR on the model fitting

Before merging this PR I would look to see an issue investigating the smoothness defaults for the adaptive spline.

#19

for the group option -- do we want to explicitly link that to "geographic". Is it not also possible to be other possible variables?

Good catch. Modified it to be more general.

Feels like we might as well have the option for the user to specify the partition too, just the same default argument set-up as we have for the global (i.e. we have a heuristic function, and you can override it if you really want.)

I want to keep this out of scope here, but will revisit once there's some actual partitioning happening (i.e., day of week effect and/or hierarchical model)

@zsusswein zsusswein changed the base branch from 08-class-constructor to main June 24, 2024 13:54
R/formula.R Outdated Show resolved Hide resolved
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me. A very few very minor additional comments.

zsusswein and others added 24 commits August 29, 2024 10:24
Allow optional switching between gam() and bam(), both for functionality now
and to illustrate how one might extend to different backends in the future.
It makes documentation slightly more straightforward
As suggested by @seabbs. This allows ... to be evaluated and handles
dispatch.
Has basic info but still needs model fit diagnostics
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
By pointing the public function to the slot with the stored diagnostic
list. Re-order `RtGam()` to use the public function for diagnostic
reporting.
@zsusswein zsusswein merged commit 2179858 into 08-class-constructor Aug 29, 2024
1 of 2 checks passed
@zsusswein zsusswein deleted the 08-model-fit-helpers branch August 29, 2024 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants