PhylogicNDT

Installation

First: Clone this repository

git clone https://github.com/broadinstitute/PhylogicNDT.git
cd PhylogicNDT

Then either :

Manual Install

Install python 2.7, R (optional) and required packages For debian:

apt-get install python-pip build-essential python-dev r-base r-base-dev git graphviz libgraphviz-dev

Install setuptools and wheel

pip install setuptools wheel

Install required packages

pip install -r req

Install scipy, matplotlib, and pandas (these versions are recommended)

pip install pandas==0.19.2 scipy==1.0.0 matplotlib==2.0.0
pip install -e git+https://github.com/rmcgibbo/logsumexp.git#egg=sselogsumexp (for faster compute)

Docker Install

Install docker from https://www.docker.com/community-edition#/download

docker build --tag phylogicndt .

Using the Package

./PhylogicNDT.py --help

If running from the docker, first run:

docker run -i -t phylogicndt
cd phylogicndt

Clustering

To run clustering on the provided sample input data:

To specify inputs:

./PhylogicNDT.py Cluster -i Patient_ID  -s Sample1_id:Sample1_maf:Sample1_CN_seg:Sample1_Purity:Sample1_Timepoint -s Sample2_id:Sample2_maf:Sample2_CN_seg:Sample2_Purity:Sample2_Timepoint ... SampleN_info

alternatively - provide a tsv sample_information_file (.sif)

with headers: sample_id maf_fn seg_fn purity timepoint

./PhylogicNDT.py Cluster -i Patient_ID  -sif Patient.sif

the .maf should contain pre-computed raw ccf histograms based on mutations alt/ref count (Absolute annotated mafs or .Rdata files are also supported) if the ccf histograms are absent - the --maf_input_type flag must be set to calc_ccf and sample purity must be provided. Also local copy number must be attached to each mutation in the maf with columns named local_cn_a1 and local_cn_a2

CN_seg is optional to annotate copy-number information on the trees

To specify number of iterations:

./PhylogicNDT.py Cluster -ni 1000

_{Acknowledgment: Clustering Module is partially inspired (primary 1D clustering) by earlier work of Carter & Getz (Landau D, Carter S , Stojanov P et al. Cell 152, 714–726, 2013)}

BuildTree (and GrowthKinetics)

The GrowthKinetics module fully incorporates the BuildTree libraries, so when rates are desired, there is no need to run both.

The -w flag should provide a measure of tumor burden, with one value per input sample maf in clustering. When ommited, stable tumor burden is assumed.
The -t flag should provide relative time for spacing the samples. When omitted, equal spacing is assumed.

Just BuildTree

./PhylogicNDT.py BuildTree -i Indiv_ID -sif Patient.sif  -m mutation_ccf_file -c cluster_ccf_file

GrowthKinetics

./PhylogicNDT.py GrowthKinetics -i Indiv_ID -sif Patient.sif -ab cell_population_abundance_mcmc_trace -w 10 10 10 10 10 -t 1 2 3 4 5

Run Cluster together with BuildTree

./PhylogicNDT.py Cluster -i Patient_ID  -sif Patient.sif -rb

SinglePatientTiming

SinglePatientTiming requires a maf input and a seg file input for each sample. The maf file should be the output of PhylogicNDT Clustering module. The seg file should have the following columns:

Chromosome  Start   End A1.Seg.CN   A2.Seg.CN

To run SinglePatientTiming:

./PhylogicNDT.py Timing -i Indiv_ID -sif Patient.sif

LeagueModel

LeagueModel requires an input of comparison tables. The comparison tables should be the output of SinglePatientTiming ending in ".comp.tsv"

To run LeagueModel:

./PhylogicNDT.py LeagueModel -cohort Cohort -comps comp1 comp2 ... compN

Alternatively, one can use a single aggregated table. The table should have the following columns:

sample  event1  event2  p_event1_win    p_event2_win    unknown

To run with the aggregated table:

./PhylogicNDT.py LeagueModel -cohort Cohort -comparison_cn comps

PhylogicSim

A simulation module is provided for convenience.

./PhylogicNDT.py PhylogicSim --help

Command to visualize all the options and help.

./PhylogicNDT.py PhylogicSim

Run the simulation with the default paramters.

./PhylogicNDT.py PhylogicSim -i MySimulation

Specify a prefix for all the output files

./PhylogicNDT.py PhylogicSim -i MySimulation -ns 7

Specify the number of samples you want to simulate.

./PhylogicNDT.py PhylogicSim -i MySimulation -nodes 5

Specify the number of distinct clones present in your samples. Minimum 2 (The first clone is always the clonal clone)

./PhylogicNDT.py PhylogicSim -i MySimulation -nodes 5 -seg /Example_SegFile.txt

Specify a segment file with copy number values to sample from. See the "Example_SegFile.txt" for a format example. If no file is specified, a build-in CN profile is used, based on the hg19 contigs.

./PhylogicNDT.py PhylogicSim -i MySimulation -nodes 5 -clust_file /Example_Clust_File.txt

Force the ccf values of each cluster on each sample, instead of generating a new random phylogeny from scratch. If -clust_file is specified, the -ns and -nodes flags are ignored an instead replaced with the values from the Clust_File. Each line of the tsv file represents a sample, with each tab separated value the ccf of a cluster. The last value of each line must always be -1 to account for the artifact cluster.

./PhylogicNDT.py PhylogicSim -i MySimulation -nodes 5 -clust_file /Example_Clust_File.txt -a 0.3

Specify the proportion of mutations that are artifactual (Random af unrelated to mutation/CN). Can be combined with a clust_file.

./PhylogicNDT.py PhylogicSim -i MySimulation -nodes 5 -clust_file /Example_Clust_File.txt -pfile /Example_PurityFile.txt

TSV file to specify the purity of each sample individualy (Otherwise, the purity is specified for all the samples using the -p flag.). Each line represents a sample. The file can optionally contain an extra three columns with the alpha, beta and N values for the coverage betabinomial for each sample (Otherwise, those values are set for all samples using the -ap, -b and -nb flags respectively).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PhylogicNDT

Installation

Manual Install

Docker Install

Using the Package

Clustering

BuildTree (and GrowthKinetics)

SinglePatientTiming

LeagueModel

PhylogicSim

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 79 Commits
BuildTree		BuildTree
Cluster		Cluster
ExampleData		ExampleData
ExampleRuns		ExampleRuns
GrowthKinetics		GrowthKinetics
LeagueModel		LeagueModel
PhylogicSim		PhylogicSim
SinglePatientTiming		SinglePatientTiming
data		data
output		output
utils		utils
Dockerfile		Dockerfile
LICENSE		LICENSE
PhylogicNDT.py		PhylogicNDT.py
README.md		README.md
req		req

License

OpenGenomics/PhylogicNDT

Folders and files

Latest commit

History

Repository files navigation

PhylogicNDT

Installation

Manual Install

Docker Install

Using the Package

Clustering

BuildTree (and GrowthKinetics)

SinglePatientTiming

LeagueModel

PhylogicSim

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages