In this project, we combine methods from statistical physics and Bayesian data analysis to elucidate the principles behind cellular growth and division. We study various classes of individual-based growth-division models and infer individual-level processes (model structures and likely ranges of associated parameters) from single-cell observations. In the Bayesian framework, we formalize our process understanding in the form of different rate functions, which express the dependence of growth and division rates on variables characterizing a cell's state (such as size and protein content), and calculate the Bayesian posteriors for the parameters of these functions.
The notebook Microbial_Scaling_Laws.ipynb is the root file of the repository and includes:
- the theoretical results, both general and specific to the individual models;
- the explanation of the methods used and the description of the workflow followed;
- the import and the 3-dimensional plot of the data used in the analysis;
- the hyperlinks to the notebooks of the individual models;
- the general results of all models.
The repository is then divided into 5 folders:
- analysis_real_data
- analysis_sim_data
- data
- images
- real_data_alternative_way, where we use the means and standard deviations as parameters for the Gamma and Beta distributions instead of a, b, c, d
The folder analysis_real_data contains a Python package, Fernando_package, with the functions used in the analysis of real data.
In our models we consider the evolution of a single non-interacting cell, which undergoes two processes:
- growth: the cell size $x(t)$ evolves according to the following equation
In some cases this relation can be expressed in vectorial form, where
- division: it is governed by the hazard rate function $h(x(t))$, which represents the instantaneous probability per unit time that the cell divides. This function is related to the so-called survival function $s(t)$ by the relation
where
While growth is a deterministic process, division is a stochastic event. Since division does not always split the cell into two equal parts, we introduce a parameter
Finally, we assume that the division ratios
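The interplay between deterministic growth and stochastic division can be illustrated with a minimal sketch. It relies on the standard relation $s(t) = \exp\left(-\int_0^t h(x(t'))\,dt'\right)$ and draws a division time by inverse-transform sampling. The exponential growth law, the linear hazard and the names `size`, `hazard`, `sample_division_time`, `omega` and `k` are assumptions made for illustration only; they are not the rate functions of any of the models in this repository.

```python
import math
import random


def size(t, x0, omega):
    """Deterministic growth. Exponential growth is an assumption of this sketch."""
    return x0 * math.exp(omega * t)


def hazard(x, k):
    """Hypothetical hazard rate, linear in size (placeholder for a model's h)."""
    return k * x


def sample_division_time(x0, omega, k, rng, dt=1e-3, t_max=100.0):
    """Draw one division time by inverting the survival function.

    s(t) = exp(-H(t)), where H(t) is the integral of h(x(t')) from 0 to t.
    Drawing u ~ U(0, 1) and solving s(t) = u is inverse-transform sampling.
    """
    # Cumulative hazard that must be reached; 1 - u lies in (0, 1].
    target = -math.log(1.0 - rng.random())
    cumulative, t = 0.0, 0.0
    while t < t_max:
        # Trapezoidal estimate of the hazard accumulated over [t, t + dt].
        h_now = hazard(size(t, x0, omega), k)
        h_next = hazard(size(t + dt, x0, omega), k)
        step = 0.5 * dt * (h_now + h_next)
        if step > 0.0 and cumulative + step >= target:
            # Linear interpolation inside the step that crosses the target.
            return t + dt * (target - cumulative) / step
        cumulative += step
        t += dt
    return t_max  # no division before t_max


if __name__ == "__main__":
    rng = random.Random(0)
    times = [sample_division_time(1.0, 1.0, 1.0, rng) for _ in range(5)]
    print(times)
```

Passing the random generator explicitly keeps the sampler reproducible; any other hazard can be plugged in by replacing `hazard`, since the integration is numerical.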
Notebook: Model 0
We start with a very simple stochastic model, which is not biologically very realistic but is useful for becoming familiar with the problem. In this first model we define
where
Notebook: Model 1
As in the previous model, cell growth is governed by a single trait, the size. However, this model is biologically more realistic, mainly because the cell must reach a minimum size before it can divide.
Also in this case the processes considered are growth and division, governed by
where
Notebook: Model 2
The main difference between this model and the previous ones is that here we consider 2 traits: the cell size
As before, the traits evolution and the cell division are governed by
From
Moreover, in this model the condition under which the cell can divide is that it contains a minimum amount of a specific type of protein, which we call
The initial conditions for
and the division process occurs in the following way
where
For all models, the set of parameters to be inferred is
Applying Bayes' theorem, we can write
where
Regarding the likelihood,
where the last 2 are respectively the
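As a rough illustration of how prior and likelihood combine, the sketch below evaluates an unnormalised log-posterior as log-prior plus log-likelihood and samples it. Everything specific in it is an assumption: the toy likelihood (independent, exponentially distributed interdivision times with a single rate `theta`), the Gamma prior with shape `a` and rate `b`, and all function names are placeholders rather than the models above. The random-walk Metropolis sampler is a standard-library stand-in used here instead of emcee.

```python
import math
import random


def log_prior(theta, a=2.0, b=1.0):
    """Gamma(shape=a, rate=b) log-density up to a constant (assumed prior)."""
    if theta <= 0.0:
        return -math.inf
    return (a - 1.0) * math.log(theta) - b * theta


def log_likelihood(theta, times):
    """Toy likelihood: i.i.d. exponential interdivision times with rate theta."""
    return len(times) * math.log(theta) - theta * sum(times)


def log_posterior(theta, times):
    """Unnormalised log-posterior: log-prior + log-likelihood."""
    lp = log_prior(theta)
    if lp == -math.inf:
        return lp  # outside the prior support
    return lp + log_likelihood(theta, times)


def metropolis(times, n_steps=20000, step=0.5, theta0=1.0, seed=0):
    """Random-walk Metropolis sampler (a stand-in for emcee in this sketch)."""
    rng = random.Random(seed)
    theta, logp = theta0, log_posterior(theta0, times)
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, times)
        # Accept with probability min(1, exp(logp_new - logp)).
        if math.log(1.0 - rng.random()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain.append(theta)
    return chain


if __name__ == "__main__":
    chain = metropolis([0.5] * 20)[2000:]  # discard burn-in
    print(sum(chain) / len(chain))
```

With twenty interdivision times of 0.5, the exact posterior under these assumptions is Gamma(22, 11) with mean 2, so the chain mean offers a quick sanity check of the sampler.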
Workflow
- Calibration: Performing Markov Chain Monte Carlo (MCMC) via the Python implementation emcee, we find the posterior distribution of $\theta$ and the marginalized posterior of each parameter, for which we calculate the maximum, the median and the 95% credible interval. We then use these results to generate a simulated time series that can be compared with the real data to find which model is statistically better.
- Validation: Model validation and comparison are achieved by
  - making a boxplot of the simulated and real interdivision times
  - computing the overlap of the histograms of the interdivision times
  - calculating the predictive density
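
Two of the quantities in this workflow can be sketched in a few lines of standard-library Python: the median and equal-tailed 95% credible interval of a set of posterior samples, and the overlap of two interdivision-time histograms. The overlap is taken here as the sum over shared bins of the smaller normalised count, which is an assumed definition; the function names and the number of bins are likewise hypothetical and may differ from what the notebooks use.

```python
import math


def quantile(sorted_xs, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])


def summarize(samples):
    """Median and equal-tailed 95% credible interval of posterior samples."""
    xs = sorted(samples)
    return {
        "median": quantile(xs, 0.5),
        "ci95": (quantile(xs, 0.025), quantile(xs, 0.975)),
    }


def histogram(xs, lo, hi, n_bins):
    """Normalised histogram (fractions summing to 1) on a fixed grid."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in xs:
        # The right edge of the grid is assigned to the last bin.
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    return [c / len(xs) for c in counts]


def histogram_overlap(real, simulated, n_bins=20):
    """Overlap in [0, 1]: sum over bins of the smaller normalised count."""
    lo = min(min(real), min(simulated))
    hi = max(max(real), max(simulated))
    if hi == lo:
        return 1.0  # all values identical
    p = histogram(real, lo, hi, n_bins)
    q = histogram(simulated, lo, hi, n_bins)
    return sum(min(a, b) for a, b in zip(p, q))


if __name__ == "__main__":
    print(summarize(range(101)))
    print(histogram_overlap([1.0, 1.2, 1.4], [1.1, 1.3, 1.5]))
```

Using the same bin grid for both samples is what makes the overlap comparable across models; an overlap of 1 means identical histograms and 0 means no shared bins.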
[1] Held J, Lorimer T, Pomati F, Stoop R, Albert C. Second-order phase transition in phytoplankton trait dynamics. Chaos. 2020; 30(5):053109. https://doi.org/10.1063/1.5141755
[2] Zheng H, Bai Y, Jiang M, et al. General quantitative relations linking cell growth and the cell cycle in Escherichia coli. Nature Microbiology. 2020; 5(8):995–1001. https://doi.org/10.1038/s41564-020-0717-x
[3] emcee documentation: https://emcee.readthedocs.io/en/stable/