This repository is currently a work in progress of a project aimed at uncovering information from missing cell types, after RNA-bulks have been deconvoluted. Our study aims to answer the question of what happens when we have real and accumulated cells missing from the single-cell dataset in bulk deconvolution methodologies, and whether we are able to retrieve the missing cell-type’s proportions through the calculated proportions.
/preprocessing/
: notebook to process data and create pseudobulks for experiments./exploratory_experiments/
: notebooks containing the deconvolution and factorization of different types of pseudobulks./functions/
: scripts with functions used throughout other folders./data/
: contains data information. See Data Information below.
- Information on all data used can be found in:
Supplemental_Table_2.xlsx
- CIBERSORTx was run through Docker with user-specific code using the exploratory_experiments/Run_cibersortx.py script.
- Codes can be given through the CIBERSORTx team by contacting them through the CIBERSORTx website.
The methodology can be roughly divided into experiment 1, 2, 3 and 4. The following outline describes each file and the rough outline in which the scripts can be run.
/preprocessing/EXP1_pseudos_snadp.ipynb
- Preprocess and creates pseudobulks from SN adipose tissue data.
/exploratory_experiments/EXP1.py
- Runs
EXP1_eval.ipynb
through Papermill. Parameters must be specific inEXP1.py
file.
- Runs
/exploratory_experiments/EXP1_eval.ipynb
- NNLS deconvolution and residual analysis.
/preprocessing/EXP2_pseudos_pbmc.ipynb
- Preprocess and creates pseudobulks from PBMC3k 10x Genomics dataset.
/preprocessing/EXP2_ciber_bayes_prep.ipynb
- Creates files needed to run BayesPrism and CIBERSORTx.
/preprocessing/EXP2_pbmc_BayesPrism.R
- R script for BayesPrism deconvolution of files created in
EXP2_ciber_bayes_prep.ipynb
.
- R script for BayesPrism deconvolution of files created in
/exploratory_experiments/EXP2.py
- Runs
EXP2_eval.ipynb
through Papermill. Parameters must be specific inEXP2.py
file.
- Runs
/exploratory_experiments/EXP2_cibersort_eval.ipynb
- Analyzes the results of CIBERSORTx deconvolution.
/exploratory_experiments/EXP2_bayesprism_eval.ipynb
- Analyzes the results of BayesPrism deconvolution.
/exploratory_experiments/EXP2.py
- Runs
EXP2_eval.ipynb
through Papermill. Parameters must be specific inEXP2.py
file. - This should be run after both BayesPrism and CIBERSORTx are done.
- Runs
/exploratory_experiments/EXP2_eval.ipynb
- Final notebook containing the NNLS deconvolution and analysis, and the comparison between deconvolution methods.
EXP3: NNLS, BayesPrism, and CIBERSORTx deconvolution of pseudobulks from single-nucleus adipose tissue, with single-cell RNA-seq missing real cell-types. Incorporated noise and realistic proportions.
/preprocessing/EXP3_pseudos_snadp.ipynb
- Preprocess and creates pseudobulks from white adipose tissue of single-nucleus and single-cell data.
/preprocessing/EXP3_ciber_bayes_prep.ipynb
- Creates files needed to run BayesPrism and CIBERSORTx.
/preprocessing/EXP3_adp_BayesPrism.R
- R script for BayesPrism deconvolution of files created in
EXP3_ciber_bayes_prep.ipynb
.
- R script for BayesPrism deconvolution of files created in
/exploratory_experiments/EXP3_cibersort_eval.ipynb
- Analyzes the results of CIBERSORTx deconvolution.
/exploratory_experiments/EXP2_bayesprism_eval.ipynb
- Analyzes the results of BayesPrism deconvolution.
/exploratory_experiments/EXP3.py
- Runs
EXP3_eval.ipynb
through Papermill. Parameters must be specific inEXP3.py
file. - This should be run after both BayesPrism and CIBERSORTx are done.
- Runs
/exploratory_experiments/EXP3_eval.ipynb
- Final notebook containing the NNLS deconvolution and analysis, and the comparison between deconvolution methods.
/preprocessing/EXP4_preprocessing.ipynb
- Preprocess and QCs single-cell and bulk data from HGSOC.
/exploratory_experiments/EXP4_eval.ipynb
- NNLS deconvolution and analysis of dissociated and classic bulks.
/preprocessing/EXP4_preprocessing.ipynb
- Preprocess and QCs single-cell and bulk data from HGSOC.
/exploratory_experiments/EXP5_eval.ipynb
- NNLS deconvolution and analysis of dissociated and classic bulks, with added cells to reference.
/preprocessing/EXP3_pseudos_snadp.ipynb
- Preprocess and creates pseudobulks from white adipose tissue of single-nucleus and single-cell data.
/exploratory_experiments/EXP6_eval.ipynb
- NNLS deconvolution and analysis of real bulks, with single cell and single nucleus RNA seq as reference.
To run the experiments in this repository, you'll need to set up a Conda environment named env_ml
. The required configuration file, env_ml.yml
, is located in the "environment" folder of this repository.
Make sure you have Conda installed. If not, you can download and install it from Conda's official website.
Navigate to the root of the repository and run the following command to create the env_ml
environment:
conda env create -f environment/env_ml.yml