Run de seq 1 0 #28

Closed
wants to merge 66 commits into from
66 commits
a2dc405
Adding script to run DESeq analysis
sangeetashukla Jun 22, 2021
7e078e8
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Jun 22, 2021
ea59bdf
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Jun 24, 2021
c57b611
Committing updated module
sangeetashukla Jul 15, 2021
e8a7531
Committing updated Uberon codes
sangeetashukla Jul 15, 2021
3d86fd5
Merge branch 'run-DESeq-1-0' of https://github.com/sangeetashukla/Ope…
sangeetashukla Jul 15, 2021
06407db
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Jul 15, 2021
83871fe
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
ccc4463
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
84a6fb2
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
1b0edb2
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
0233eb4
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
7a6c6a4
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
839de71
Update analyses/DESeq_analysis/run-DESeq-analysis.R
sangeetashukla Jul 16, 2021
bb072b1
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Jul 16, 2021
4d9da10
Fixed the data file paths to run from the .\analyses. Fixed the varia…
sangeetashukla Jul 20, 2021
2f89da7
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Jul 20, 2021
7799441
Merge branch 'run-DESeq-1-0' of https://github.com/sangeetashukla/Ope…
sangeetashukla Jul 20, 2021
2a4f68c
Fixed a few typos. Will edit the script later again to fix the path f…
sangeetashukla Jul 22, 2021
db15985
Edited R script to accept data files as command line arguments, added…
sangeetashukla Jul 28, 2021
56dd86e
Update run-DESeq-analysis.R
sangeetashukla Jul 28, 2021
16ec579
Added a line to install an R package
sangeetashukla Jul 28, 2021
d5b1995
Added a line to install an R package
sangeetashukla Jul 28, 2021
e05db28
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Jul 28, 2021
71ee207
Edit dockerfile to include call to shell script
sangeetashukla Aug 2, 2021
5a968c0
Edited file to work with a shell script for call and input files, fix…
sangeetashukla Aug 2, 2021
2758172
Adding shell script to call the R script and pass input files
sangeetashukla Aug 2, 2021
c30558a
Merge branch 'run-DESeq-1-0' of https://github.com/sangeetashukla/Ope…
sangeetashukla Aug 2, 2021
e82e251
Updated to fix a bug
sangeetashukla Aug 2, 2021
d07ac29
adding script to transform json files to a single jsonl file
sangeetashukla Aug 3, 2021
0caa7b0
Adding script to transform JSON to JSON L. Edited R script to include…
sangeetashukla Aug 4, 2021
6135f62
Update run-DESeq-analysis.R
sangeetashukla Aug 6, 2021
192cb4c
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Aug 10, 2021
77d99b3
included UBERON codes in output
sangeetashukla Aug 10, 2021
d0d571c
Adding sample result files from sample data for testing
sangeetashukla Aug 10, 2021
8d9b7cd
Remote typos/comments in run_deseq.sh
sangeetashukla Aug 10, 2021
acd943f
Adding scripts to create data for testing of module
sangeetashukla Aug 11, 2021
a4982d1
Adding compressed test data and sample results from that test data pr…
sangeetashukla Aug 11, 2021
b6f5b81
Changed a script name and removed typos
sangeetashukla Aug 11, 2021
67dcccb
Syncing files with intuitive names
sangeetashukla Aug 11, 2021
35a0bd0
Adding README
sangeetashukla Aug 11, 2021
5754b69
Moved the contents to a new script
sangeetashukla Aug 11, 2021
8128338
Moved contents to a new script
sangeetashukla Aug 11, 2021
f2fe521
Removing since the main script will create jsonl files
sangeetashukla Aug 11, 2021
90101bf
Updated README
sangeetashukla Aug 11, 2021
241aacd
Edited file name in the description
sangeetashukla Aug 11, 2021
911b68d
Edited a column name for the final data table.
sangeetashukla Aug 11, 2021
a9520b9
Not required anymore
sangeetashukla Aug 11, 2021
94f4cf9
Not needed anymore
sangeetashukla Aug 11, 2021
ae4bca1
Removing to replace with a new file
sangeetashukla Aug 11, 2021
b83ca38
Dockerfile for CAVATICA app
sangeetashukla Aug 11, 2021
cc121d2
Adding compressed jsonl file for all comparisons as run on HPC
sangeetashukla Aug 23, 2021
22dd32f
Script to calculate HIST_index, GTEx_Index, and subset histology and …
sangeetashukla Aug 23, 2021
f9e71bb
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Aug 23, 2021
8c9f91f
Edited - Takes input files from Input-Subsetting
sangeetashukla Aug 23, 2021
e84cbec
Updated Dockerfile for Cavatica
sangeetashukla Aug 24, 2021
c8fbaf2
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Aug 24, 2021
1960828
Edited to only include R scripts and required R packages
sangeetashukla Aug 24, 2021
d5d42ac
Updated for compatibility with Dockerfile for Cavatica
sangeetashukla Aug 24, 2021
ca76730
Updated for compatibility with Dockerfile for Cavatica
sangeetashukla Aug 24, 2021
414f7fc
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Sep 1, 2021
68a5a8a
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Sep 2, 2021
6cd0284
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Sep 8, 2021
6c6a5b5
Updated to change threshold of sample size to run DESeq per cancer_group
sangeetashukla Sep 8, 2021
eda8746
Updated threshold for sample size to run DESeq analysis per cancer_group
sangeetashukla Sep 8, 2021
374984b
Merge branch 'PediatricOpenTargets:dev' into run-DESeq-1-0
sangeetashukla Sep 13, 2021
43 changes: 43 additions & 0 deletions analyses/DESeq_analysis/Dockerfile
@@ -0,0 +1,43 @@
FROM rocker/tidyverse:3.6.0
# create an R user
ENV USER rstudio



ADD ./analysis/scripts/install_bioc.R install_bioc.R

#ADD ./data/histologies.tsv histologies.tsv
#ADD ./data/gene-counts-rsem-expected_count-collapsed.rds gene-counts-rsem-expected_count-collapsed.rds
#ADD ./data/gene-expression-rsem-tpm-collapsed.rds gene-expression-rsem-tpm-collapsed.rds
#ADD ./data/efo-mondo-map.tsv efo-mondo-map.tsv
#ADD ./data/uberon-map-gtex-subgroup.tsv uberon-map-gtex-subgroup.tsv
#ADD ./data/ensg-hugo-rmtl-mapping.tsv ensg-hugo-rmtl-mapping.tsv

ADD ./analysis/run-DESeq-Input-Subsetting.R run-DESeq-Input-Subsetting.R
ADD ./analysis/run-DESeq-analysis.R run-DESeq-analysis.R
#ADD run_deseq.sh run_deseq.sh
ADD Dockerfile Dockerfile


RUN chmod 755 ./install_bioc.R && ./install_bioc.R DESeq2
#RUN chmod 755 run_deseq.sh


RUN R -e "install.packages('optparse', dependencies = TRUE)"
RUN R -e "install.packages('jsonlite', dependencies = TRUE)"
RUN R -e "install.packages('ggplot2', dependencies = TRUE)"


#WORKDIR /
#RUN chmod +x run_deseq.sh
#RUN ./run_deseq.sh


#CMD Rscript --vanilla run-DESeq-Input-Subsetting.R
# --hist_file ./data/histologies.tsv \
# --counts_file ./data/gene-counts-rsem-expected_count-collapsed.rds \
# --tpm_file gene-expression-rsem-tpm-collapsed.rds \
# --efo_mondo_file data/efo-mondo-map.tsv \
# --gtex_subgroup_uberon data/uberon-map-gtex-subgroup.tsv \
# --ensg_hugo_file data/ensg-hugo-rmtl-mapping.tsv \
# --outdir results
37 changes: 37 additions & 0 deletions analyses/DESeq_analysis/README.md
@@ -0,0 +1,37 @@
## Differential Expression of RNA-seq matrices


This module takes the histologies data and the corresponding gene count and TPM data, and performs differential expression analysis for every combination of GTEx subgroup and histology type.



## Expected Input

Download the data files with the script below:
```
download-data.sh
```




## Scripts

`run-DESeq-analysis.R` - This is the main script. It reads the downloaded data files and writes JSON and TSV files.

`run_deseq.sh` - This script sets the path for the output tables and calls the above R script.

`process_test_input.R` - This script creates a subset of histologies.tsv. It searches for the cancer_group and cohort combinations that have more than 5 patients with clinical data, and requires the user to specify which cancer_group and cohort to use for the subset.

`run_process_test_input.sh` - This script runs process_test_input.R. It works well with v7 data, since the selected cancer_group and cohort have been reviewed to have more than 5 patients with clinical data. It then processes the newly created subset by running run-DESeq-analysis.R and creates a test result table.
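
The "more than 5 patients" check described above can be approximated from the command line before picking a cancer_group and cohort. The sketch below is not the module's R code. It assumes a tab-separated histologies file with `cancer_group`, `cohort`, and a participant ID column named `Kids_First_Participant_ID` (the ID column name is an assumption), and it counts distinct participants without checking what qualifies as "clinical data". The function name `eligible_groups` is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print cancer_group/cohort pairs that have more than
# `min` distinct participants in a histologies-style TSV.
# Assumed columns: cancer_group, cohort, Kids_First_Participant_ID.
eligible_groups() {
  local tsv="$1" min="${2:-5}"
  awk -F'\t' -v min="$min" '
    NR == 1 {
      # Map header names to column positions.
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    {
      g = $(col["cancer_group"])
      c = $(col["cohort"])
      p = $(col["Kids_First_Participant_ID"])
      if (g == "" || g == "NA") next          # skip rows without a cancer_group
      key = g "\t" c
      if (!((key, p) in seen)) {              # count each participant once per pair
        seen[key, p] = 1
        n[key]++
      }
    }
    END {
      for (k in n) if (n[k] > min) print k "\t" n[k]
    }
  ' "$tsv" | sort
}
```

For example, `eligible_groups data/histologies.tsv 5` would list each qualifying pair with its participant count, one per line; the path is illustrative.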




## Steps
1) Set the working directory to /deseq_analysis.
2) For testing with v7, run the script run_process_test_input.sh.
3) Review the path and file created as test data, and the result tables created from it.
4) To run the module on the entire data set, run run_deseq.sh.
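
The Dockerfile in this pull request lists, in a commented-out `CMD`, the flags passed to `run-DESeq-Input-Subsetting.R`. A dry-run sketch of that call is shown below. The flag names and file names are copied from that comment; the `data` and `results` directories and the function name `build_subsetting_cmd` are assumptions, and the flags accepted by `run-DESeq-analysis.R` are not shown in the diff, so they are not guessed here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the Rscript call as an array so paths containing spaces stay intact.
# data_dir and out_dir are hypothetical defaults, not values taken from the module.
build_subsetting_cmd() {
  local data_dir="${1:-data}" out_dir="${2:-results}"
  cmd=(
    Rscript --vanilla run-DESeq-Input-Subsetting.R
    --hist_file "${data_dir}/histologies.tsv"
    --counts_file "${data_dir}/gene-counts-rsem-expected_count-collapsed.rds"
    --tpm_file "${data_dir}/gene-expression-rsem-tpm-collapsed.rds"
    --efo_mondo_file "${data_dir}/efo-mondo-map.tsv"
    --gtex_subgroup_uberon "${data_dir}/uberon-map-gtex-subgroup.tsv"
    --ensg_hugo_file "${data_dir}/ensg-hugo-rmtl-mapping.tsv"
    --outdir "${out_dir}"
  )
}

build_subsetting_cmd data results
# Dry run: print the command instead of executing it.
printf '%q ' "${cmd[@]}"
echo
```

Replacing the final `printf` with `"${cmd[@]}"` would execute the call once R and the input files are in place.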


Binary file not shown.