This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Update focal-cn-file-preparation README #525

Merged: 3 commits, Feb 7, 2020
2 changes: 1 addition & 1 deletion analyses/README.md
@@ -18,7 +18,7 @@ Note that _nearly all_ modules use the harmonized clinical data file (`pbta-hist
| [`comparative-RNASeq-analysis`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/comparative-RNASeq-analysis) | `pbta-gene-expression-rsem-tpm.polya.rds` <br> `pbta-gene-expression-rsem-tpm.stranded.rds` | *In progress*; will produce expression outlier profiles per [#229](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/229) | N/A |
| [`copy_number_consensus_call`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/copy_number_consensus_call) | `pbta-cnv-cnvkit.seg.gz` <br> `pbta-cnv-controlfreec.tsv.gz` <br> `pbta-sv-manta.tsv.gz` | Produces consensus copy number calls per [#128](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/128) and a set of excluded regions where CNV calls are not made | `results/cnv_consensus.tsv` <br> `results/pbta-cnv-consensus.seg.gz` <br> `ref/cnv_excluded_regions.bed` <br> `ref/cnv_callable.bed`
| [`create-subset-files`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/create-subset-files) | All files | This module contains the code to create the subset files used in continuous integration | All subset files for continuous integration
-| [`focal-cn-file-preparation`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/focal-cn-file-preparation) | `pbta-cnv-cnvkit.seg.gz` <br> `pbta-cnv-controlfreec.tsv.gz` <br> `pbta-gene-expression-rsem-fpkm.polya.rds` <br> `pbta-gene-expression-rsem-fpkm.stranded.rds` <br> `analyses/copy_number_consensus_call/results/pbta-cnv-consensus.seg.gz` | Maps from copy number variant caller segments to gene identifiers; will be updated to take into account changes that affect entire cytobands, chromosome arms ([#186](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/186))| `results/cnvkit_annotated_cn_autosomes.tsv.gz` <br> `results/cnvkit_annotated_cn_x_and_y.tsv.gz` <br> `results/controlfreec_annotated_cn_autosomes.tsv.gz` <br> `results/controlfreec_annotated_cn_x_and_y.tsv.gz` <br> `results/consensus_seg_annotated_cn_autosomes.tsv.gz` <br> `results/consensus_seg_annotated_cn_x_and_y.tsv.gz`
+| [`focal-cn-file-preparation`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/focal-cn-file-preparation) | `pbta-cnv-cnvkit.seg.gz` <br> `pbta-cnv-controlfreec.tsv.gz` <br> `pbta-gene-expression-rsem-fpkm-collapsed.polya.rds` <br> `pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds` <br> `analyses/copy_number_consensus_call/results/pbta-cnv-consensus.seg.gz` | Maps from copy number variant caller segments to gene identifiers; will be updated to take into account changes that affect entire cytobands, chromosome arms ([#186](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/186))| `results/cnvkit_annotated_cn_autosomes.tsv.gz` <br> `results/cnvkit_annotated_cn_x_and_y.tsv.gz` <br> `results/controlfreec_annotated_cn_autosomes.tsv.gz` <br> `results/controlfreec_annotated_cn_x_and_y.tsv.gz` <br> `results/consensus_seg_annotated_cn_autosomes.tsv.gz` <br> `results/consensus_seg_annotated_cn_x_and_y.tsv.gz`
| [`fusion_filtering`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/fusion_filtering) | `pbta-fusion-arriba.tsv.gz` <br> `pbta-fusion-starfusion.tsv.gz` | Standardizes, filters, and prioritizes fusion calls | `results/pbta-fusion-putative-oncogenic.tsv` <br> `results/pbta-fusion-recurrent-fusion-byhistology.tsv` <br> `results/pbta-fusion-recurrent-fusion-bysample.tsv` (included in data download)
| [`fusion-summary`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/fusion-summary)| `pbta-histologies.tsv` <br> `pbta-fusion-putative-oncogenic.tsv` <br> `pbta-fusion-arriba.tsv.gz` <br> `pbta-fusion-starfusion.tsv.gz` | Generate summary tables from fusion files ([#398](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/398)) | `results/fusion_summary_embryonal_foi.tsv` <br> `results/fusion_summary_ependymoma_foi.tsv`
| [`gene-set-enrichment-analysis`](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/gene-set-enrichment-analysis) | `pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds` <br> `pbta-gene-expression-rsem-fpkm-collapsed.polya.rds` | *In progress*. Updated gene set enrichment analysis with appropriate RNA-seq expression data | `results/gsva_scores_stranded.tsv` <br> `results/gsva_scores_polya.tsv` <br> for stranded, polya expression data respectively
66 changes: 64 additions & 2 deletions analyses/focal-cn-file-preparation/README.md
@@ -29,8 +29,9 @@ See the notebook for more information.
| biospecimen_id | status | copy_number | ploidy | ensembl | gene_symbol | cytoband |
|----------------|--------|-------------|--------|---------|-------------|---------|

-* `rna-expression-validation.R` - This script examines RNA-seq expression levels (RSEM FPKM) of genes that are called as deletions.
-It is not currently run via the shell script.
+* `rna-expression-validation.R` - This script examines RNA-seq expression levels (RSEM FPKM) of genes that are called as deletions.
+It produces loss/neutral and zero/neutral correlation plots, as well as stacked barplots displaying the distribution of ranges in expression across each of the calls (loss, neutral, zero).
+_Note: The shell script's default behavior is to produce these plots using the annotated consensus SEG autosome and sex chromosome files found in this module's `results` directory and listed below._

### Output files for downstream consumption

@@ -43,3 +44,64 @@ results
├── controlfreec_annotated_cn_autosomes.tsv.gz
└── controlfreec_annotated_cn_x_and_y.tsv.gz
```
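
As a rough illustration of downstream consumption, the sketch below tallies the calls in one of these annotated tables. It assumes the files are tab-separated with the column names shown in the README table (`biospecimen_id`, `status`, `copy_number`, `ploidy`, `ensembl`, `gene_symbol`, `cytoband`) and that the `status` column holds the calls mentioned above (loss, neutral, zero). The function names and the example rows are hypothetical and are not taken from the repository.

```python
import csv
import gzip
import io
from collections import Counter

# Column names as listed in the README table; assumed to match the file header.
COLUMNS = ["biospecimen_id", "status", "copy_number", "ploidy",
           "ensembl", "gene_symbol", "cytoband"]


def count_status_calls(handle):
    """Tally the `status` column of a tab-separated annotated CN table."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    return Counter(row["status"] for row in reader)


def count_status_calls_gz(path):
    """Same tally for a gzip-compressed file, e.g. one of the results/*.tsv.gz files."""
    with gzip.open(path, "rt", newline="") as handle:
        return count_status_calls(handle)


# Placeholder rows invented purely for illustration (not real PBTA data).
example = io.StringIO(
    "\t".join(COLUMNS) + "\n"
    "SAMPLE_A\tloss\t1\t2\tENSG_PLACEHOLDER_1\tGENE1\t1p36.33\n"
    "SAMPLE_A\tneutral\t2\t2\tENSG_PLACEHOLDER_2\tGENE2\t1p36.32\n"
    "SAMPLE_B\tloss\t1\t2\tENSG_PLACEHOLDER_1\tGENE1\t1p36.33\n"
)
status_counts = count_status_calls(example)
print(status_counts)
```

With a real checkout, `count_status_calls_gz("results/consensus_seg_annotated_cn_autosomes.tsv.gz")` would be the analogous call, provided the header matches the assumed columns.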

### Folder Structure

```
focal-cn-file-preparation
├── 01-add-ploidy-cnvkit.Rmd
├── 01-add-ploidy-cnvkit.nb.html
├── 02-add-ploidy-consensus.Rmd
├── 02-add-ploidy-consensus.nb.html
├── 03-prepare-cn-file.R
├── README.md
├── display-plots.md
├── plots
│   ├── cnvkit_annotated_cn_autosomes_polya_loss_cor_plot.png
│   ├── cnvkit_annotated_cn_autosomes_polya_stacked_plot.png
│   ├── cnvkit_annotated_cn_autosomes_polya_zero_cor_plot.png
│   ├── cnvkit_annotated_cn_autosomes_stranded_loss_cor_plot.png
│   ├── cnvkit_annotated_cn_autosomes_stranded_stacked_plot.png
│   ├── cnvkit_annotated_cn_autosomes_stranded_zero_cor_plot.png
│   ├── cnvkit_annotated_cn_x_and_y_polya_loss_cor_plot.png
│   ├── cnvkit_annotated_cn_x_and_y_polya_stacked_plot.png
│   ├── cnvkit_annotated_cn_x_and_y_polya_zero_cor_plot.png
│   ├── cnvkit_annotated_cn_x_and_y_stranded_loss_cor_plot.png
│   ├── cnvkit_annotated_cn_x_and_y_stranded_stacked_plot.png
│   ├── cnvkit_annotated_cn_x_and_y_stranded_zero_cor_plot.png
│   ├── consensus_seg_annotated_cn_autosomes_polya_loss_cor_plot.png
│   ├── consensus_seg_annotated_cn_autosomes_polya_stacked_plot.png
│   ├── consensus_seg_annotated_cn_autosomes_polya_zero_cor_plot.png
│   ├── consensus_seg_annotated_cn_autosomes_stranded_loss_cor_plot.png
│   ├── consensus_seg_annotated_cn_autosomes_stranded_stacked_plot.png
│   ├── consensus_seg_annotated_cn_autosomes_stranded_zero_cor_plot.png
│   ├── consensus_seg_annotated_cn_x_and_y_polya_loss_cor_plot.png
│   ├── consensus_seg_annotated_cn_x_and_y_polya_stacked_plot.png
│   ├── consensus_seg_annotated_cn_x_and_y_polya_zero_cor_plot.png
│   ├── consensus_seg_annotated_cn_x_and_y_stranded_loss_cor_plot.png
│   ├── consensus_seg_annotated_cn_x_and_y_stranded_stacked_plot.png
│   ├── consensus_seg_annotated_cn_x_and_y_stranded_zero_cor_plot.png
│   ├── controlfreec_annotated_cn_autosomes_polya_loss_cor_plot.png
│   ├── controlfreec_annotated_cn_autosomes_polya_stacked_plot.png
│   ├── controlfreec_annotated_cn_autosomes_polya_zero_cor_plot.png
│   ├── controlfreec_annotated_cn_autosomes_stranded_loss_cor_plot.png
│   ├── controlfreec_annotated_cn_autosomes_stranded_stacked_plot.png
│   ├── controlfreec_annotated_cn_autosomes_stranded_zero_cor_plot.png
│   ├── controlfreec_annotated_cn_x_and_y_polya_loss_cor_plot.png
│   ├── controlfreec_annotated_cn_x_and_y_polya_stacked_plot.png
│   ├── controlfreec_annotated_cn_x_and_y_polya_zero_cor_plot.png
│   ├── controlfreec_annotated_cn_x_and_y_stranded_loss_cor_plot.png
│   ├── controlfreec_annotated_cn_x_and_y_stranded_stacked_plot.png
│   └── controlfreec_annotated_cn_x_and_y_stranded_zero_cor_plot.png
├── results
│   ├── cnvkit_annotated_cn_autosomes.tsv.gz
│   ├── cnvkit_annotated_cn_x_and_y.tsv.gz
│   ├── consensus_seg_annotated_cn_autosomes.tsv.gz
│   ├── consensus_seg_annotated_cn_x_and_y.tsv.gz
│   ├── controlfreec_annotated_cn_autosomes.tsv.gz
│   └── controlfreec_annotated_cn_x_and_y.tsv.gz
├── rna-expression-validation.R
├── run-prepare-cn.sh
└── util
    └── rna-expression-functions.R
```
```
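
The plot file names in the tree above appear to follow a single pattern: caller, chromosome set, RNA-seq library type, and plot type. The sketch below rebuilds the expected names from those four components, which can be handy for checking that a run produced every plot. The component lists are read off the listing; treating them as a full cross product is an assumption of this sketch, not something the scripts are confirmed to guarantee, and `expected_plot_names` is a hypothetical helper.

```python
from itertools import product

# Name components read off the folder listing; the full cross product
# is an assumption made for this sketch.
CALLERS = ["cnvkit", "consensus_seg", "controlfreec"]
CHROMOSOME_SETS = ["autosomes", "x_and_y"]
LIBRARY_TYPES = ["polya", "stranded"]
PLOT_TYPES = ["loss_cor_plot", "stacked_plot", "zero_cor_plot"]


def expected_plot_names():
    """Return the plot file names implied by the naming pattern."""
    return [
        f"{caller}_annotated_cn_{chrom}_{library}_{plot}.png"
        for caller, chrom, library, plot in product(
            CALLERS, CHROMOSOME_SETS, LIBRARY_TYPES, PLOT_TYPES
        )
    ]


plot_names = expected_plot_names()
print(len(plot_names))
```

Comparing `set(plot_names)` against `set(os.listdir("plots"))` in a checkout would flag any missing or unexpected plot files.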