Implement a coalescent time distribution #2474
@eb8680 FYI I am adding an interface that plays better with
I've added a I think it would be cleanest to first merge this PR, then update #2468 to use
LGTM per our review over zoom, is there anything else you want me to look at?
R = R0 * prev["S"] / self.population
coal_rate = R * (1. + 1. / k) / (tau_i * prev["I"] + 1e-8)
pyro.factor("coalescent_{}".format(t),
            self.coal_likelihood(coal_rate, t))
This seems like a good interface
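For readers without the surrounding model code, here is a minimal standalone sketch of the rate formula in the snippet above, in plain Python rather than Pyro. The function name `coalescent_rate` and the example trajectories are hypothetical. I am also assuming that `k` is a dispersion parameter and `tau_i` a mean infectious period, which the snippet itself does not state.

```python
def coalescent_rate(S, I, R0, k, tau_i, population):
    """Base coalescent rate at one time step, mirroring the reviewed snippet.

    Assumption (not stated in the snippet): k is a dispersion parameter
    and tau_i is a mean infectious period.
    """
    R = R0 * S / population  # effective reproduction number
    # The 1e-8 term keeps the rate finite when there are no infections.
    return R * (1.0 + 1.0 / k) / (tau_i * I + 1e-8)


# Hypothetical S and I trajectories, one value per time step.
S_series = [990.0, 900.0, 500.0]
I_series = [10.0, 80.0, 50.0]

# One rate per time step, i.e. a piecewise constant rate on a uniform grid.
rate_grid = [
    coalescent_rate(S, I, R0=2.0, k=1.0, tau_i=4.0, population=1000.0)
    for S, I in zip(S_series, I_series)
]
```

In the model under review, each such per-step rate is what gets passed to `self.coal_likelihood(coal_rate, t)` inside `pyro.factor`.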
return log_prob


class CoalescentRateLikelihood:
Should this be an nn.Module?
At the moment there is no reason to make this an nn.Module. It is a little more similar to a Distribution, but doesn't quite follow the distribution interface (indeed I've given it a validate_args kwarg like a distribution). Moreover in typical usage the parameters are all fixed rather than learned.
BTW this is a very natural object from the perspective of Funsor. Consider a funsor
leaf_times = Tensor(...)
coal_times = Tensor(...)
rate_grid = Variable("rate_grid", reals(duration))
f = CoalescentTimesWithRate(leaf_times, rate_grid).log_prob(coal_times)
Then this likelihood represents roughly the funsor
f(rate_grid=rate_grid["t"])
I look forward to the bright future when we won't need special classes like this 🚀
Thanks for reviewing!
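To make the role of `rate_grid` in this thread concrete, here is a rough sketch of a coalescent log-likelihood with a piecewise constant rate, written in pure Python. It is not the vectorized PyTorch code in this PR. The function names `integrate_rate` and `coalescent_log_prob` are hypothetical, and the sketch assumes unit-width grid cells, all times inside `[0, T]`, and `N - 1` coalescent times for `N` leaves. The point it illustrates is that the cumulative sum of the rate grid is computed once and then shared by every sampled phylogeny, so each phylogeny only costs a sort plus constant-time lookups.

```python
import math
from itertools import accumulate


def integrate_rate(cumsum, t):
    """Integral of the piecewise constant rate from 0 to t.

    cumsum[i] is the integral over the first i unit-width cells,
    so cumsum[0] == 0 and len(cumsum) == T + 1.
    """
    i = min(int(math.floor(t)), len(cumsum) - 2)
    return cumsum[i] + (t - i) * (cumsum[i + 1] - cumsum[i])


def coalescent_log_prob(leaf_times, coal_times, rate_grid, cumsum):
    """Log density of sorted coalescent times given fixed leaf times."""
    # Walk backward in time: each leaf adds a lineage, each coalescence
    # removes one. At ties, leaves sort before coalescent events.
    events = sorted(
        [(t, +1) for t in leaf_times] + [(t, -1) for t in coal_times],
        reverse=True,
    )
    log_prob = 0.0
    lineages = 0
    prev_time = events[0][0]
    for time, delta in events:
        pairs = lineages * (lineages - 1) / 2  # binomial(lineages, 2)
        # Survival term: no pair coalesced during (time, prev_time).
        elapsed = integrate_rate(cumsum, prev_time) - integrate_rate(cumsum, time)
        log_prob -= pairs * elapsed
        if delta < 0:
            if lineages < 2:
                return -math.inf  # not a valid phylogeny
            # Density term: one of the pairs coalesces exactly at `time`.
            cell = min(int(math.floor(time)), len(rate_grid) - 1)
            log_prob += math.log(pairs * rate_grid[cell])
        lineages += delta
        prev_time = time
    return log_prob


# Hypothetical data: T = 4 time steps, N = 3 leaves, S = 2 sampled phylogenies.
rate_grid = [0.5, 0.5, 1.0, 2.0]
cumsum = [0.0] + list(accumulate(rate_grid))  # computed once, O(T)
leaf_times = [4.0, 3.5, 2.0]  # fixed across phylogeny samples
coal_samples = [[3.0, 1.0], [1.5, 0.5]]  # one list of coalescent times per sample
log_probs = [
    coalescent_log_prob(leaf_times, coal_times, rate_grid, cumsum)
    for coal_times in coal_samples
]
```

With a constant rate and two leaves this reduces to an exponential waiting time, which is a quick sanity check on the bookkeeping.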
Addresses #2426
Blocking #2468
This adds two new distributions CoalescentTimes and CoalescentTimesWithRate and a class CoalescentRateLikelihood implementing a phylogenetic likelihood by splitting a phylogeny into fixed sample times (i.e. EPI nodes, leaves) and unknown coalescent times (i.e. internal NODE nodes). For ...WithRate, the base rate is specified as a vector representing a piecewise constant base coalescent rate on a uniform grid (e.g. computed from S and I time series in SIR models).

The design aims to make it cheap to enumerate over multiple phylogenetic samples (e.g. from BEAST). Two crucial observations are: (1) among phylogeny samples, the leaves are fixed, hence the total number of nodes is fixed; and (2) we can avoid summing over intervals by precomputing rate.cumsum(-1) and using this for all samples. Exploiting these facts leads to an O(T + S N log(N)) algorithm where T is the number of simulation time steps, N is the number of leaves in the phylogeny, and S is the number of sampled phylogenies. This PR does not implement batching over different phylogenetic problems (e.g. one phylogeny per region in regional models); that would require more careful masking and padding and will be left to a possible future PR.

These are low-level distributions and do not implement preprocessing to transform phylogenetic trees into (coal_times, leaf_times) pairs. This PR is intended as a low-level helper for use in #2468 and an example using pyro.contrib.epidemiology.

Tested