Implement a coalescent time distribution #2474

fritzo · 2020-05-10T23:36:34Z

Addresses #2426
Blocking #2468

This adds two new distributions, CoalescentTimes and CoalescentTimesWithRate, and a class CoalescentRateLikelihood, which together implement a phylogenetic likelihood

p(coalescent times | sample times, base rate)

by splitting a phylogeny into fixed sample times (i.e. EPI nodes, leaves) and unknown coalescent times (i.e. internal NODE nodes). For CoalescentTimesWithRate, the base rate is specified as a vector representing a piecewise constant base coalescent rate on a uniform grid (e.g. computed from S and I time series in SIR models).
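For readers unfamiliar with this likelihood, here is a minimal pure-Python sketch of the textbook Kingman coalescent density for the constant-rate case. It is not the code in this PR: the function name `kingman_log_prob` and the scalar `rate` argument are illustrative assumptions, and it assumes valid input (every coalescent time has at least two live lineages).

```python
import math


def kingman_log_prob(leaf_times, coal_times, rate=1.0):
    """Log density of coalescent times under a constant base rate (sketch)."""
    # Reading time backwards from the latest leaf toward the root,
    # a leaf adds one lineage and a coalescence removes one.
    events = [(t, +1) for t in leaf_times] + [(t, -1) for t in coal_times]
    events.sort(key=lambda e: -e[0])  # latest event first, O(N log N)

    log_prob = 0.0
    lineages = 0
    prev_time = events[0][0]
    for time, delta in events:
        pairs = lineages * (lineages - 1) / 2  # binomial(lineages, 2)
        # Survival term: no coalescence happened during (time, prev_time).
        log_prob -= pairs * rate * (prev_time - time)
        if delta < 0:
            # Density term for the coalescence that ends this interval.
            log_prob += math.log(pairs * rate)
        lineages += delta
        prev_time = time
    return log_prob


# Two leaves sampled at time 0 that coalesce 1.5 time units earlier.
print(kingman_log_prob([0.0, 0.0], [-1.5], rate=2.0))
```

With two leaves the result reduces to the log density of a single exponential waiting time, log(rate) - rate * 1.5, which is a handy sanity check.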

The design aims to make it cheap to enumerate over multiple phylogenetic samples (e.g. from BEAST). Two crucial observations are: (1) among phylogeny samples, the leaves are fixed, hence the total number of nodes is fixed; and (2) we can avoid summing over intervals by precomputing rate.cumsum(-1) and using this for all samples. Exploiting these facts leads to an O(T + S N log(N)) algorithm where T is the number of simulation time steps, N is the number of leaves in the phylogeny, and S is the number of sampled phylogenies. This PR does not implement batching over different phylogenetic problems (e.g. one phylogeny per region in regional models); that would require more careful masking and padding and will be left to a possible future PR.
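The cumulative-sum trick in observation (2) can be sketched without any tensor library. The following is an illustration only, assuming unit-width grid cells and times in [0, T]; the helper names are hypothetical and `itertools.accumulate` stands in for `rate.cumsum(-1)`.

```python
from itertools import accumulate


def make_cumulative(rate_grid):
    """Running integral of the rate at grid boundaries, computed once in O(T)."""
    # Leading 0.0 so that cumulative[i] is the integral over [0, i].
    return list(accumulate(rate_grid, initial=0.0))


def integrated_rate(rate_grid, cumulative, t):
    """Integral of the piecewise constant rate over [0, t], in O(1)."""
    i = min(int(t), len(rate_grid) - 1)  # grid cell containing t
    return cumulative[i] + rate_grid[i] * (t - i)


def interval_integral(rate_grid, cumulative, start, end):
    """Integral over [start, end] as a difference of two lookups."""
    return (integrated_rate(rate_grid, cumulative, end)
            - integrated_rate(rate_grid, cumulative, start))


rate_grid = [1.0, 2.0, 4.0]
cumulative = make_cumulative(rate_grid)  # shared by every phylogeny sample
print(interval_integral(rate_grid, cumulative, 0.5, 2.5))
```

Because `cumulative` depends only on the rate grid, each additional sampled phylogeny costs only a sort of its N event times plus constant-time lookups, which is where the O(T + S N log(N)) bound comes from.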

These are low-level distributions and do not implement preprocessing to transform phylogenetic trees into (coal_times, leaf_times) pairs. This PR is intended as a low-level helper for use in #2468 and in an example using pyro.contrib.epidemiology.

Tested

shape tests
compare against uniform rate
added smoke tests for use in an SEIR model
visually sanity check samples in a notebook

pyro/distributions/coalescent.py

fritzo · 2020-05-13T15:37:22Z

@eb8680 FYI I am adding an interface that plays better with .transition_bwd() as we discussed.

fritzo · 2020-05-14T04:18:46Z

I've added a CoalescentRateLikelihood class compatible with both vectorized and sequential operation, and copied more of @eb8680's #2468 into this PR to exercise the likelihood in an SEIR model. However, this PR still deals entirely with (leaf_times, coal_times) pairs.

I think it would be cleanest to first merge this PR, then update #2468 to use CoalescentRateLikelihood and update the example script there to parse a .nwk phylogeny and extract (leaf_times, coal_times) from it, using either @eb8680's r2py code or the Bio.Phylo Python library.
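To make the proposed extraction step concrete, here is a dependency-free sketch of turning a Newick string into (leaf_times, coal_times). It is a stand-in for illustration, not the r2py or Bio.Phylo route mentioned above: the function names are hypothetical, and it assumes no whitespace or quoted labels in the string and a branch length on every non-root node. Times are measured forward from the root.

```python
def parse_newick(text):
    """Parse a Newick string into nested (children, branch_length) tuples."""
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
        # Skip the optional node label.
        while pos < len(text) and text[pos] not in ":,();":
            pos += 1
        length = 0.0  # the root usually has no branch length
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return children, length

    return node()


def newick_to_times(text):
    """Return sorted (leaf_times, coal_times), measured from the root."""
    leaf_times, coal_times = [], []

    def visit(tree, parent_time):
        children, length = tree
        time = parent_time + length
        # Internal nodes are coalescent events; childless nodes are leaves.
        (coal_times if children else leaf_times).append(time)
        for child in children:
            visit(child, time)

    visit(parse_newick(text.strip()), 0.0)
    return sorted(leaf_times), sorted(coal_times)


leaf_times, coal_times = newick_to_times("((A:1,B:1):2,C:3);")
print(leaf_times, coal_times)
```

A real script would more likely lean on an established parser; the point here is only that a binary tree with N leaves yields N leaf times and N - 1 coalescent times.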

eb8680

LGTM per our review over Zoom. Is there anything else you want me to look at?

eb8680 · 2020-05-14T21:26:20Z

pyro/contrib/epidemiology/seir.py

+            R = R0 * prev["S"] / self.population
+            coal_rate = R * (1. + 1. / k) / (tau_i * prev["I"] + 1e-8)
+            pyro.factor("coalescent_{}".format(t),
+                        self.coal_likelihood(coal_rate, t))


This seems like a good interface
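For reference, the rate computed in the diff above can be isolated as a plain function. This is only a restatement of those two lines with scalar inputs; the function name and the example values are made up, and the meanings of `k` and `tau_i` (presumably a dispersion parameter and an infectious period) are not spelled out in this thread.

```python
import math


def coalescent_rate(R0, S, I, population, k, tau_i, eps=1e-8):
    """Per-step base coalescent rate, mirroring the SEIR diff (sketch)."""
    R = R0 * S / population  # effective reproduction number
    # eps keeps the rate finite when the infectious count I is zero.
    return R * (1.0 + 1.0 / k) / (tau_i * I + eps)


# Hypothetical values, chosen only to exercise the formula.
example_rate = coalescent_rate(R0=2.0, S=900.0, I=10.0,
                               population=1000.0, k=1.0, tau_i=5.0)
print(example_rate)
```

In the model itself the result is passed, together with the time index `t`, to `self.coal_likelihood` inside `pyro.factor`, as the diff shows.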

eb8680 · 2020-05-14T21:27:57Z

pyro/distributions/coalescent.py

+        return log_prob
+
+
+class CoalescentRateLikelihood:


Should this be an nn.Module?

At the moment there is no reason to make this an nn.Module. It is a little more similar to a Distribution, but doesn't quite follow the distribution interface (indeed I've given it a validate_args kwarg like a distribution). Moreover, in typical usage the parameters are all fixed rather than learned.

BTW this is a very natural object from the perspective of Funsor. Consider a funsor

leaf_times = Tensor(...)
coal_times = Tensor(...)
rate_grid = Variable("rate_grid", reals(duration))
f = CoalescentTimesWithRate(leaf_times, rate_grid).log_prob(coal_times)

Then this likelihood represents roughly the funsor

f(rate_grid=rate_grid["t"])

I look forward to the bright future when we won't need special classes like this 🚀

fritzo · 2020-05-14T22:19:57Z

Thanks for reviewing!

fritzo added 6 commits May 9, 2020 17:55

Sketch Kingman coalescent distributions

b915950

Sketch CoalescentTimes distribution with .log_prob()

2a1ec22

Work more on CoalescentTimesWithRate

d4f427a

Merge branch 'dev' into coalescent

980df7a

Get shape tests to pass

75089d9

Add docs

00ac74d

fritzo added the WIP label May 10, 2020

fritzo added 5 commits May 11, 2020 18:04

Merge branch 'dev' into coalescent

51979fa

Add a simple CoalescentTimes distribution

f9a6ac4

Fix shape bugs

6feb576

Add more tests

9f82eee

Simplify

89bb100

fritzo added awaiting review and removed WIP labels May 12, 2020

fritzo requested a review from eb8680 May 12, 2020 17:20

fritzo commented May 12, 2020

pyro/distributions/coalescent.py

fritzo added WIP and removed awaiting review labels May 12, 2020

fritzo force-pushed the coalescent branch from 4986f82 to 89bb100 Compare May 12, 2020 19:45

Speed up sampling; fix PyTorch 1.4 bug

26906ea

fritzo added awaiting review and removed WIP labels May 12, 2020

Break symmetry by enforcing order

1f34803

fritzo mentioned this pull request May 13, 2020

FR Discrete compartmental models for epidemiology #2426

Closed

38 tasks

flake8

fb6041d

fritzo added WIP awaiting review and removed awaiting review WIP labels May 13, 2020

fritzo added the WIP label May 13, 2020

fritzo added 3 commits May 13, 2020 19:57

Add coalescent likelihood function

479a863

Refactor to a class CoalescentRateLikelihood

22aaae4

Add CoalescentRateLikelihood to an SEIR model

6c1a24a

fritzo added awaiting review and removed WIP labels May 14, 2020

fritzo mentioned this pull request May 14, 2020

Add coalescent likelihoods to contrib.epidemiology #2468

Closed

13 tasks

fritzo requested a review from martinjankowiak May 14, 2020 19:07

eb8680 approved these changes May 14, 2020

fritzo removed the request for review from martinjankowiak May 14, 2020 22:20

eb8680 merged commit d293cba into dev May 14, 2020

fritzo deleted the coalescent branch June 5, 2020 15:31
