This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

CNV consensus calls methods add #93

Merged
merged 5 commits into from
Apr 10, 2020
18 changes: 15 additions & 3 deletions content/03.methods.md
@@ -108,7 +108,7 @@ The 0.05 VAF increased the true positive rate for INDELs and decreased the false
The final VCF was filtered for PASS variants with TYPE=StronglySomatic.
Lancet v1.0.7 was run using default parameters, except for those noted below.
For input intervals to Lancet WGS, a reference BED was created by using only the UTR, exome, and start/stop codon features of the GENCODE 31 reference, augmented as recommended with PASS variant calls from Strelka2 and Mutect2 [@doi:10.1101/623702].
These intervals were then padded by 300 bp on each side during Lancet variant calling.
Per recommendations by the New York Genome Center [@doi:10.1101/623702], for WGS samples, the Lancet input intervals described above were augmented with PASS variant calls from Strelka2 and Mutect2 as validation.
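
As an illustration of the 300 bp padding described above, the following is a minimal sketch and not the pipeline's actual code. The function name `pad_intervals` and the tuple layout are hypothetical, and the sketch assumes BED-style 0-based coordinates and clamps only at the chromosome start.

```python
from typing import List, Tuple

# Hypothetical layout: (chromosome, start, end), BED-style 0-based coordinates.
BedInterval = Tuple[str, int, int]


def pad_intervals(intervals: List[BedInterval], padding: int = 300) -> List[BedInterval]:
    """Extend each interval by `padding` bp on both sides.

    Starts are clamped at 0. Ends are not clamped because chromosome
    lengths are not known in this sketch.
    """
    return [(chrom, max(0, start - padding), end + padding)
            for chrom, start, end in intervals]


padded = pad_intervals([("chr1", 100, 500), ("chr1", 1000, 1200)])
```

Padded intervals may overlap one another, as the two example intervals do here; whether they are merged afterwards is not stated in the text.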

#### VCF annotation and MAF creation
@@ -140,6 +140,18 @@ Theta2 purity was added as an optional parameter to CNVkit to adjust copy number
CNVkit was run on human genome reference hg38 using the optional parameters of Theta2 purity and BAF adjustment for tumor-normal pairs.
We used GISTIC [@doi:10.1186/gb-2011-12-4-r41] v.2.0.23 on the CNVkit and the consensus CNV segmentation files to generate gene-level copy number abundance (Log R Ratio) as well as chromosomal arm copy number alterations using the parameters specified in the [OpenPBTA Analysis repository](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/analyses/run-gistic/scripts/run-gistic-openpbta.sh).

#### Consensus CNV Calling

For each sample, consensus CNVs were derived from the calls made by ControlFreeC [@doi:10/ckt4vz; @doi:10/c6bcps], CNVkit [@doi:10.1371/journal.pcbi.1004873], and Manta [@doi:10/gf3ggb].
CNVs that ControlFreeC called significant (p-value threshold of 0.01) were included in the consensus calling.

The calls from a given sample were compared pairwise between the three callers.
CNVs of the same sample and call method were merged if they overlapped 10,000 bp.
If a CNV from one caller overlapped by 50% or more with at least one CNV from another caller, the region common to the overlapping CNVs was taken as the new consensus CNV.
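
The pairwise 50% overlap rule can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the function names and the interval layout are hypothetical, all intervals are taken to lie on one chromosome of one sample, and the 50% threshold is assumed to be measured against the length of either call in a pair.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical layout: half-open [start, end) on a single chromosome.
Interval = Tuple[int, int]


def overlap_len(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def pairwise_consensus(calls: Dict[str, List[Interval]],
                       min_frac: float = 0.5) -> List[Interval]:
    """Compare every pair of callers and keep the shared region of qualifying pairs."""
    consensus = set()
    for caller_a, caller_b in combinations(sorted(calls), 2):
        for a in calls[caller_a]:
            for b in calls[caller_b]:
                shared = overlap_len(a, b)
                if shared == 0:
                    continue
                # Assumption: the rule is met when the shared region covers
                # at least `min_frac` of either call in the pair.
                if (shared >= min_frac * (a[1] - a[0])
                        or shared >= min_frac * (b[1] - b[0])):
                    consensus.add((max(a[0], b[0]), min(a[1], b[1])))
    return sorted(consensus)


example = pairwise_consensus({
    "controlfreec": [(100, 200)],
    "cnvkit": [(150, 400)],
    "manta": [(1000, 2000)],
})
```

In this example the ControlFreeC call shares 50 of its 100 bases with the CNVkit call, so the shared region `(150, 200)` is kept, while the Manta call overlaps nothing and contributes no consensus region.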

We filtered out any CNVs that overlapped by 50% or more with immunoglobulin, telomeric, centromeric, or segmental duplication regions, or that were longer than 3000 bp.
Sample and caller combination files with more than 2500 CNV calls were removed from the set because we consider them noisy, poor-quality samples; this cutoff is based on those used in GISTIC methods [@doi:10.1186/gb-2011-12-4-r41].
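
The filters in this paragraph can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the excluded regions are assumed to be non-overlapping intervals on the same chromosome as the CNVs, and the length and count cutoffs are applied exactly as worded in the text.

```python
from typing import List, Tuple

# Hypothetical layout: half-open [start, end) on a single chromosome.
Interval = Tuple[int, int]


def covered_fraction(cnv: Interval, regions: List[Interval]) -> float:
    """Fraction of `cnv` covered by `regions` (assumed not to overlap each other)."""
    start, end = cnv
    length = end - start
    if length <= 0:
        return 0.0
    covered = sum(max(0, min(end, r_end) - max(start, r_start))
                  for r_start, r_end in regions)
    return covered / length


def filter_cnvs(cnvs: List[Interval], excluded: List[Interval],
                max_overlap: float = 0.5, max_length: int = 3000) -> List[Interval]:
    """Drop CNVs covered 50% or more by excluded regions or longer than `max_length`."""
    return [c for c in cnvs
            if covered_fraction(c, excluded) < max_overlap
            and (c[1] - c[0]) <= max_length]


def keep_call_set(cnvs: List[Interval], max_calls: int = 2500) -> bool:
    """A sample and caller combination is kept only if it has at most `max_calls` CNVs."""
    return len(cnvs) <= max_calls


kept = filter_cnvs(
    cnvs=[(0, 1000), (5000, 6000), (10000, 20000)],
    excluded=[(0, 500), (5900, 7000)],
)
```

Here the first CNV is removed because exactly half of it lies in an excluded region, the third is removed by the length cutoff, and only `(5000, 6000)` remains.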

### Somatic Structural Variant Calling (WGS samples only)

We used Manta SV [@doi:10/gf3ggb] v1.4.0 for structural variant (SV) calls.
@@ -310,7 +322,7 @@ High-grade glioma (HGG) subtypes were derived using the criteria below (addition
5. If a sample was initially classified as HGAT, had no defining histone mutations, and had a BRAF V600E mutation, it was subtyped as `BRAF V600E`.
6. All other high-grade glioma samples that did not meet any of these criteria were subtyped as `HGG, H3 wildtype`.

Non-MB and non-ATRT embryonal (`Embryonal tumor` in the `broad_histology` column of the metadata pbta-histologies.tsv) subtypes were derived using the criteria below [@pmid:30249036; @doi:10.1007/s00381-017-3551-6; @url:https://www.cancer.gov/types/brain/hp/child-cns-embryonal-treatment-pdq; @doi:10.3390/ijms21051818].
Additional details can be found in the analysis [notebook](https://alexslemonade.github.io/OpenPBTA-analysis/analyses/molecular-subtyping-embryonal/04-table-prep.nb.html).

1. Any RNA-seq biospecimen with <i>LIN28A</i> overexpression, plus a <i>TTYH1</i> fusion (5' partner) with a gene adjacent to or within the C19MC miRNA cluster and/or copy number amplification of the C19MC region, was subtyped as `ETMR, C19MC-altered` (Embryonal tumor with multilayer rosettes, chromosome 19 miRNA cluster altered).
@@ -320,7 +332,7 @@ Additional details can be found in the analysis [notebook](https://alexslemonade
5. Non-MB and non-ATRT embryonal tumors with over-expression and/or gene fusions in <i>FOXR2</i> were subtyped as `CNS NB-FOXR2` (CNS neuroblastoma with <i>FOXR2</i> activation).
6. Non-MB and non-ATRT embryonal tumors with <i>CIC-NUTM1</i> or other <i>CIC</i> fusions were subtyped as `CNS EFT-CIC` (CNS Ewing sarcoma family tumor with <i>CIC</i> alteration).
7. Non-MB and non-ATRT embryonal tumors that did not fit any of the above categories were subtyped as `CNS Embryonal, NOS` (CNS Embryonal tumor, not otherwise specified).


#### Survival
