This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Add publication quality Cooccurrence plot to figure generation scripts #639

Merged
11 changes: 10 additions & 1 deletion analyses/interaction-plots/01-create-interaction-plots.sh
@@ -1,6 +1,6 @@
#!/bin/bash

# JA Shapiro for CCDL 2019
# JA Shapiro for CCDL 2019-2020
#
# Runs scripts/01-process_mutations.R with some default settings.
# Takes one environment variable, `OPENPBTA_ALL`, which if 0 runs only
@@ -49,6 +49,14 @@ if [ "$ALL" -gt "0" ]; then
fi


# Get FLAG file and add header
Review comment (Member):
The README for this module should get updated to include a citation for FLAGS, etc.

# include top 50 frequently mutated
exclude_file=FLAGS.tsv
echo gene$'\t'count > $exclude_file
head -n 50 <(curl -s https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706417/bin/12920_2017_309_MOESM3_ESM.txt)\
>> $exclude_file


# make output directories if they don't exist
mkdir -p $results_dir
mkdir -p $plot_dir
@@ -67,6 +75,7 @@ Rscript ${script_dir}/02-process_mutations.R \
--maf ${maf} \
--metadata ${metadata} \
--specimen_list ${temp_dir}/ALL.tsv \
--exclude_genes $exclude_file \
--vaf 0.05 \
--min_mutated 5 \
--max_genes 50 \
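The FLAGS step in the script above pipes `curl` straight into `head`, so a failed download would still leave a header-only `FLAGS.tsv` behind. A minimal sketch of a more defensive variant follows. The helper name `make_exclude_file` and its arguments are hypothetical and not part of this PR; the only assumption about the source list is that it has no header row, which matches the 51-line file committed here.

```shell
#!/bin/bash

# Hypothetical helper (not part of the PR): build an exclusion table from a
# local copy of the source list.
#   $1 = source file, one "gene<TAB>count" row per line, no header
#   $2 = output file
#   $3 = number of rows to keep (defaults to 50)
make_exclude_file() {
  local src="$1" out="$2" n="${3:-50}"
  # Refuse to continue if the source is missing or empty, so a failed
  # download cannot silently produce a header-only file.
  if [ ! -s "$src" ]; then
    echo "source list '$src' is missing or empty" >&2
    return 1
  fi
  printf 'gene\tcount\n' > "$out"
  head -n "$n" "$src" >> "$out"
}

# Possible usage (requires network access; URL taken from the script above):
#   tmp_list=$(mktemp)
#   curl -sSf -o "$tmp_list" \
#     https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706417/bin/12920_2017_309_MOESM3_ESM.txt \
#     && make_exclude_file "$tmp_list" FLAGS.tsv 50
```

Downloading to a temporary file first keeps the network step and the formatting step separate, and `curl -f` returns a non-zero status on HTTP errors instead of saving the error page.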
51 changes: 51 additions & 0 deletions analyses/interaction-plots/FLAGS.tsv
@@ -0,0 +1,51 @@
gene count
TTN 2659
MUC16 1222
OBSCN 825
AHNAK2 809
SYNE1 686
FLG 622
MUC5B 552
DNAH17 550
PLEC 543
DST 541
SYNE2 537
NEB 531
HSPG2 515
LAMA5 505
AHNAK 493
HMCN1 484
USH2A 468
DNAH11 445
MACF1 440
MUC17 435
DNAH5 430
GPR98 420
FAT1 412
PKD1 402
MDN1 397
RNF213 396
RYR1 393
DNAH2 389
DNAH3 386
DNAH8 383
DNAH1 381
DNAH9 379
ABCA13 375
APOB 372
SRRM2 371
CUBN 363
SPTBN5 357
PKHD1 353
LRP2 352
FBN3 350
CDH23 349
DNAH10 349
FAT4 348
RYR3 347
PKHD1L1 345
FAT2 344
CSMD1 341
PCNT 341
COL6A3 336
FRAS1 332
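Because `FLAGS.tsv` is regenerated from a remote file on each run, a quick shape check may be worth having. The sketch below is an illustration only: the function name `check_exclude_file` is hypothetical, and it assumes the file is tab-separated with a `gene`/`count` header, as written by the script in this PR.

```shell
#!/bin/bash

# Hypothetical check (not part of the PR): succeed only if the file has a
# "gene<TAB>count" header followed by exactly the expected number of rows,
# each with two tab-separated fields and an integer count.
#   $1 = exclusion file
#   $2 = expected number of data rows (defaults to 50)
check_exclude_file() {
  local file="$1" expected_rows="${2:-50}"
  awk -F'\t' -v want="$expected_rows" '
    NR == 1 { if ($1 != "gene" || $2 != "count") bad = 1; next }
    NF != 2 || $2 !~ /^[0-9]+$/ { bad = 1 }
    END { if (bad || NR - 1 != want) exit 1 }
  ' "$file"
}

# Possible usage:
#   check_exclude_file FLAGS.tsv 50 || echo "FLAGS.tsv looks malformed" >&2
```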
10 changes: 7 additions & 3 deletions analyses/interaction-plots/README.md
@@ -3,15 +3,19 @@
The scripts in this directory create a plot to display co-occurrence and mutual exclusivity of mutations across tumors.

Currently this is done across all tumor types, with all available individuals for which whole genome sequences are available for at least one tumor.

Importantly, only a single sequencing sample from each individual is used.
The analyses and plots created include information from the top 50 most mutated genes.
The analyses and plots created include information from the top 50 most mutated genes, with genes that are commonly mutated removed.
The commonly mutated genes are derived from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267152/, specifically the top 50 genes from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706417/bin/12920_2017_309_MOESM3_ESM.txt
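To illustrate what "top 50 most mutated genes, with genes that are commonly mutated removed" could mean in practice, here is a small sketch in shell. It is not the logic of `02-process_mutations.R`, which this diff does not show. It assumes the exclusion is applied before the top-N cut, that both inputs are tab-separated with a header line and the gene symbol in the first column, and that the ranked table is already sorted by mutation count. The function name `filter_top_genes` is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper (not part of the PR): drop excluded genes from a ranked
# gene table, then keep the header plus the first N remaining rows.
#   $1 = exclusion file (header line, gene symbol in column 1), must be non-empty
#   $2 = ranked gene table (header line, gene symbol in column 1)
#   $3 = number of genes to keep (defaults to 50)
filter_top_genes() {
  local exclude="$1" ranked="$2" n="${3:-50}"
  awk -F'\t' -v n="$n" '
    # First file: remember every gene to skip (ignore its header line).
    NR == FNR { if (FNR > 1) skip[$1] = 1; next }
    # Second file: always pass the header through.
    FNR == 1 { print; next }
    # Keep rows whose gene is not excluded, until N rows have been kept.
    !($1 in skip) && kept < n { print; kept++ }
  ' "$exclude" "$ranked"
}

# Possible usage (ranked_genes.tsv is a placeholder file name):
#   filter_top_genes FLAGS.tsv ranked_genes.tsv 50 > top50_filtered.tsv
```

Applying the exclusion first means the output still contains N genes when enough remain; cutting to the top N first and then excluding would return fewer.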

The main script creates plots for the full data set, as well as for groups of specific tumor types.
In the case of the full data set, a bar plot is also produced that summarizes which tumor types are mutated for each of the most common genes, as well as a publication-ready figure that combines the co-occurrence plot and the bar plot.

Future analyses will include creating plots by tumor type, and for specific lists of genes of interest.


### Example plot

![Co-occurrence Plot](plots/lancet_top50.png)
![Co-occurrence Plot](plots/consensus_top50.png)

## Usage

Binary file modified analyses/interaction-plots/plots/combined_top50.png
Binary file modified analyses/interaction-plots/plots/cooccur_top50.ALL.png
Binary file modified analyses/interaction-plots/plots/gene_disease_top50.png
2,450 changes: 1,225 additions & 1,225 deletions analyses/interaction-plots/results/cooccur_top50.ALL.tsv

Large diffs are not rendered by default.

142 changes: 59 additions & 83 deletions analyses/interaction-plots/results/gene_disease_top50.tsv
@@ -11,120 +11,99 @@ ATRX High-grade glioma 9 16 1.7777777777777777
EGFR High-grade glioma 9 14 1.5555555555555556
NF1 High-grade glioma 9 14 1.5555555555555556
DDX3X Medulloblastoma 9 10 1.1111111111111112
OBSCN High-grade glioma 8 28 3.5
FGFR1 Low-grade astrocytic tumor 8 12 1.5
KMT2D Medulloblastoma 8 9 1.125
NF1 Diffuse midline glioma 7 10 1.4285714285714286
NF2 Meningioma 7 7 1
PPM1D Diffuse midline glioma 7 7 1
TTN High-grade glioma 6 72 12
DNAH8 High-grade glioma 6 12 2
KIAA1109 High-grade glioma 6 14 2.3333333333333335
MAGEL2 High-grade glioma 6 10 1.6666666666666667
TTN Medulloblastoma 6 8 1.3333333333333333
KDM6A Medulloblastoma 6 6 1
KMT2C Medulloblastoma 6 6 1
PTEN High-grade glioma 6 6 1
TP53 Medulloblastoma 6 6 1
NEB High-grade glioma 5 24 4.8
RYR2 High-grade glioma 5 21 4.2
SYNE1 High-grade glioma 5 20 4
FSIP2 High-grade glioma 5 18 3.6
ABCA13 High-grade glioma 5 13 2.6
DNAH5 High-grade glioma 5 13 2.6
LRP1 High-grade glioma 5 13 2.6
KIAA1549 High-grade glioma 5 12 2.4
PCLO High-grade glioma 5 12 2.4
SETD2 High-grade glioma 5 12 2.4
SMARCA4 High-grade glioma 5 12 2.4
ATM High-grade glioma 5 11 2.2
DNHD1 High-grade glioma 5 11 2.2
FBN2 High-grade glioma 5 9 1.8
APOB High-grade glioma 5 8 1.6
BCOR High-grade glioma 5 7 1.4
BRAF High-grade glioma 5 6 1.2
PIK3CA High-grade glioma 5 6 1.2
SMARCA4 Medulloblastoma 5 6 1.2
MUC16 Diffuse midline glioma 5 5 1
PDGFRA High-grade glioma 5 5 1
PIK3CA Low-grade astrocytic tumor 5 5 1
PLEC High-grade glioma 4 31 7.75
MUC16 High-grade glioma 4 16 4
PIK3R1 High-grade glioma 5 5 1
KMT2D High-grade glioma 4 15 3.75
DNAH17 High-grade glioma 4 14 3.5
FAT1 High-grade glioma 4 13 3.25
ADGRV1 High-grade glioma 4 13 3.25
PRKDC High-grade glioma 4 13 3.25
CCDC168 High-grade glioma 4 12 3
MYH14 High-grade glioma 4 12 3
KMT2C High-grade glioma 4 11 2.75
MYO10 High-grade glioma 4 11 2.75
CENPF High-grade glioma 4 10 2.5
DNAH14 High-grade glioma 4 10 2.5
LRP1B High-grade glioma 4 9 2.25
PTPRZ1 High-grade glioma 4 7 1.75
BCOR Low-grade astrocytic tumor 4 4 1
COL4A3 High-grade glioma 4 4 1
H3F3A High-grade glioma 4 4 1
IGSF10 High-grade glioma 4 4 1
NF1 Low-grade astrocytic tumor 4 4 1
PTPN11 Diffuse midline glioma 4 4 1
TTN Diffuse midline glioma 4 4 1
RYR1 High-grade glioma 3 15 5
DNAH3 High-grade glioma 3 10 3.3333333333333335
MUC5AC High-grade glioma 3 11 3.6666666666666665
RP1L1 High-grade glioma 3 10 3.3333333333333335
CELSR1 High-grade glioma 3 9 3
PHLPP1 High-grade glioma 3 7 2.3333333333333335
DNAH11 High-grade glioma 3 6 2
PTPN11 High-grade glioma 3 6 2
FGFR1 High-grade glioma 3 4 1.3333333333333333
NF2 Ependymoma 3 4 1.3333333333333333
NF2 Schwannoma 3 4 1.3333333333333333
PTEN Diffuse midline glioma 3 4 1.3333333333333333
DNAH5 Diffuse midline glioma 3 3 1
IDH1 High-grade glioma 3 3 1
IDH1 Low-grade astrocytic tumor 3 3 1
KRAS Low-grade astrocytic tumor 3 3 1
MUC16 Low-grade astrocytic tumor 3 3 1
PIK3CA Diffuse midline glioma 3 3 1
PPM1D High-grade glioma 3 3 1
RYR1 Diffuse midline glioma 3 3 1
STAG2 Medulloblastoma 3 3 1
PDGFRA Dysembryoplastic neuroepithelial tumor 2 4 2
PDGFRA Low-grade astrocytic tumor 2 4 2
FGFR1 Dysembryoplastic neuroepithelial tumor 2 3 1.5
RYR1 Medulloblastoma 2 3 1.5
STAG2 Ewings Sarcoma 2 3 1.5
APOB Diffuse midline glioma 2 2 1
ATM Low-grade astrocytic tumor 2 2 1
BRAF Diffuse midline glioma 2 2 1
CENPF CNS Embryonal Tumor 2 2 1
DDX3X High-grade glioma 2 2 1
DNAH11 Diffuse midline glioma 2 2 1
DNAH14 Diffuse midline glioma 2 2 1
DNAH14 Medulloblastoma 2 2 1
DNAH3 Medulloblastoma 2 2 1
FBN2 Medulloblastoma 2 2 1
IDH1 Ganglioglioma 2 2 1
IGSF10 Medulloblastoma 2 2 1
KDM6A High-grade glioma 2 2 1
KRAS Ganglioglioma 2 2 1
KRAS Germinoma 2 2 1
KRAS Neurofibroma 2 2 1
LRP1B Ganglioglioma 2 2 1
PDGFRA Diffuse midline glioma 2 2 1
RYR1 Low-grade astrocytic tumor 2 2 1
PIK3R1 Low-grade astrocytic tumor 2 2 1
RYR2 Medulloblastoma 2 2 1
SETD2 Medulloblastoma 2 2 1
STAG2 High-grade glioma 2 2 1
STAG2 Low-grade astrocytic tumor 2 2 1
TP53 Low-grade astrocytic tumor 2 2 1
TTN Craniopharyngioma 2 2 1
DNAH17 Diffuse midline glioma 1 2 2
DNAH3 Neurofibroma 1 2 2
FBN2 Neuroblastoma 1 2 2
FGFR1 Dysplasia 1 2 2
FGFR1 Glial-neuronal tumor NOS 1 2 2
MUC16 Neuroblastoma 1 2 2
NF2 High-grade glioma 1 2 2
PLEC Ependymoma 1 2 2
PTEN Medulloblastoma 1 2 2
RYR1 CNS Neuroblastoma 1 2 2
SYNE1 Neuroblastoma 1 2 2
TTN Neuroblastoma 1 2 2
ABCA13 Choroid plexus papilloma 1 1 1
ABCA13 CNS Embryonal Tumor 1 1 1
ABCA13 Medulloblastoma 1 1 1
ABCA13 Meningioma 1 1 1
ABCA13 Metastatic secondary tumors 1 1 1
ABCA13 Neuroblastoma 1 1 1
APOB Craniopharyngioma 1 1 1
APOB Neuroblastoma 1 1 1
ADGRV1 Low-grade astrocytic tumor 1 1 1
ADGRV1 Medulloblastoma 1 1 1
ADGRV1 Neuroblastoma 1 1 1
ATM CNS Embryonal Tumor 1 1 1
ATM Diffuse midline glioma 1 1 1
ATM Medulloblastoma 1 1 1
@@ -141,69 +120,65 @@ CCDC168 Atypical Teratoid Rhabdoid Tumor 1 1 1
CCDC168 Craniopharyngioma 1 1 1
CCDC168 Diffuse midline glioma 1 1 1
CCDC168 Germinoma 1 1 1
CELSR1 Diffuse midline glioma 1 1 1
CELSR1 Medulloblastoma 1 1 1
CELSR1 Metastatic secondary tumors 1 1 1
CELSR1 Neuroblastoma 1 1 1
CENPF Low-grade astrocytic tumor 1 1 1
COL4A3 Atypical Teratoid Rhabdoid Tumor 1 1 1
COL4A3 CNS Neuroblastoma 1 1 1
COL4A3 Diffuse midline glioma 1 1 1
COL4A3 Neurofibroma 1 1 1
CTNNB1 CNS Embryonal Tumor 1 1 1
CTNNB1 High-grade glioma 1 1 1
DNAH11 Hemangioblastoma 1 1 1
DNAH11 Medulloblastoma 1 1 1
DNAH11 Neuroblastoma 1 1 1
DNAH14 Low-grade astrocytic tumor 1 1 1
DNAH14 Neuroblastoma 1 1 1
DNAH17 Low-grade astrocytic tumor 1 1 1
DNAH17 Medulloblastoma 1 1 1
DNAH17 Neuroblastoma 1 1 1
DNAH3 Diffuse midline glioma 1 1 1
DNAH3 Neuroblastoma 1 1 1
DNAH5 Atypical Teratoid Rhabdoid Tumor 1 1 1
DNAH5 Low-grade astrocytic tumor 1 1 1
DNAH5 Meningioma 1 1 1
DNAH5 Metastatic secondary tumors 1 1 1
DNAH5 Neuroblastoma 1 1 1
DNAH5 Neurofibroma 1 1 1
DNAH8 Low-grade astrocytic tumor 1 1 1
DNAH8 Neuroblastoma 1 1 1
DNAH8 Oligodendroglioma 1 1 1
DNHD1 Medulloblastoma 1 1 1
DNHD1 Neuroblastoma 1 1 1
EGFR Diffuse midline glioma 1 1 1
FAT1 Diffuse midline glioma 1 1 1
FAT1 Ependymoma 1 1 1
FAT1 Low-grade astrocytic tumor 1 1 1
FAT1 Medulloblastoma 1 1 1
FBN2 Low-grade astrocytic tumor 1 1 1
FGFR1 Oligodendroglioma 1 1 1
FSIP2 CNS Neuroblastoma 1 1 1
FSIP2 Embryonal tumor with multilayer rosettes 1 1 1
FSIP2 Metastatic secondary tumors 1 1 1
FSIP2 Neuroblastoma 1 1 1
IDH1 Oligodendroglioma 1 1 1
IGSF10 Diffuse midline glioma 1 1 1
IGSF10 Pineoblastoma 1 1 1
KDM6A Low-grade astrocytic tumor 1 1 1
KIAA1109 Neuroblastoma 1 1 1
KIAA1549 CNS Embryonal Tumor 1 1 1
KIAA1549 Ependymoma 1 1 1
KIAA1549 Medulloblastoma 1 1 1
KMT2C Atypical Teratoid Rhabdoid Tumor 1 1 1
KMT2C Neuroblastoma 1 1 1
KMT2D Craniopharyngioma 1 1 1
KMT2D Primary CNS lymphoma 1 1 1
LRP1 Craniopharyngioma 1 1 1
LRP1 Diffuse midline glioma 1 1 1
LRP1B Atypical Teratoid Rhabdoid Tumor 1 1 1
LRP1B Metastatic secondary tumors 1 1 1
LRP1B Neuroblastoma 1 1 1
LRP1B Sarcoma 1 1 1
MAGEL2 Medulloblastoma 1 1 1
MAGEL2 Rosai-Dorfman Disease 1 1 1
MUC16 Dysembryoplastic neuroepithelial tumor 1 1 1
MUC16 Medulloblastoma 1 1 1
NEB CNS Neuroblastoma 1 1 1
NEB Ependymoma 1 1 1
NEB Medulloblastoma 1 1 1
NEB Neuroblastoma 1 1 1
MUC5AC Diffuse midline glioma 1 1 1
MUC5AC Ependymoma 1 1 1
MUC5AC Ewings Sarcoma 1 1 1
MUC5AC Neuroblastoma 1 1 1
MYH14 CNS Embryonal Tumor 1 1 1
MYH14 CNS Neuroblastoma 1 1 1
MYH14 Medulloblastoma 1 1 1
MYO10 Ependymoma 1 1 1
MYO10 Medulloblastoma 1 1 1
MYO10 Neuroblastoma 1 1 1
NF1 Dysembryoplastic neuroepithelial tumor 1 1 1
NF1 Dysplasia 1 1 1
NF1 Ganglioglioma 1 1 1
NF1 Germinoma 1 1 1
NF1 Neuroblastoma 1 1 1
NF2 Medulloblastoma 1 1 1
NF2 Sarcoma 1 1 1
OBSCN CNS Neuroblastoma 1 1 1
OBSCN Diffuse midline glioma 1 1 1
OBSCN Neuroblastoma 1 1 1
PCLO Craniopharyngioma 1 1 1
PCLO Medulloblastoma 1 1 1
PCLO Neuroblastoma 1 1 1
@@ -214,14 +189,22 @@ PHLPP1 Neuroblastoma 1 1 1
PHLPP1 Teratoma 1 1 1
PIK3CA Craniopharyngioma 1 1 1
PIK3CA Medulloblastoma 1 1 1
PLEC Atypical Teratoid Rhabdoid Tumor 1 1 1
PLEC CNS Neuroblastoma 1 1 1
PLEC Diffuse midline glioma 1 1 1
PIK3R1 Diffuse midline glioma 1 1 1
PRKDC Ependymoma 1 1 1
PRKDC Medulloblastoma 1 1 1
PRKDC Neuroblastoma 1 1 1
PTPN11 Dysembryoplastic neuroepithelial tumor 1 1 1
PTPN11 Low-grade astrocytic tumor 1 1 1
PTPN11 Metastatic secondary tumors 1 1 1
PTPN11 Oligodendroglioma 1 1 1
RYR1 Sarcoma 1 1 1
PTPRZ1 Ependymoma 1 1 1
PTPRZ1 Low-grade astrocytic tumor 1 1 1
PTPRZ1 Metastatic secondary tumors 1 1 1
PTPRZ1 Neuroblastoma 1 1 1
RP1L1 CNS Embryonal Tumor 1 1 1
RP1L1 Diffuse midline glioma 1 1 1
RP1L1 Medulloblastoma 1 1 1
RP1L1 Neuroblastoma 1 1 1
RYR2 Diffuse midline glioma 1 1 1
RYR2 Ependymoma 1 1 1
RYR2 Low-grade astrocytic tumor 1 1 1
@@ -232,12 +215,5 @@ SETD2 Low-grade astrocytic tumor 1 1 1
SMARCA4 Craniopharyngioma 1 1 1
SMARCA4 Low-grade astrocytic tumor 1 1 1
STAG2 Ependymoma 1 1 1
SYNE1 Medulloblastoma 1 1 1
SYNE1 Meningioma 1 1 1
SYNE1 Sarcoma 1 1 1
TP53 Myeloid Sarcoma 1 1 1
TP53 Oligodendroglioma 1 1 1
TTN CNS Embryonal Tumor 1 1 1
TTN CNS Neuroblastoma 1 1 1
TTN Dysembryoplastic neuroepithelial tumor 1 1 1
TTN Neurofibroma 1 1 1