The Breast Sensitivity Signature Collection (SSc breast) contains 1,372 gene signatures that reflect the transcriptional differences between sensitive and resistant breast cancer cell lines before drug treatment. The SSc breast collection has been used in the publication "Spatial Transcriptomics in Breast Cancer Reveals Tumour Microenvironment-Driven Drug Responses and Clonal Therapeutic Heterogeneity" and it is available to download on Zenodo (DOI: 10.5281/zenodo.10638906).
This is a fork of the cnio-bu/drug_susceptibility_collection repository, which is intended to compute a pan-cancer drug Sensitivity Signature Collection (SSc) for Beyondcell [1].
To obtain these signatures, we performed a differential expression analysis against the area under the curve (AUC) with limma v3.54.0 [2] for all compounds tested in at least 10 different breast cancer cell lines. We selected the top 250 up- and down-regulated genes in sensitive versus resistant cancer cell lines, ranked by the t-statistic, to create bidirectional gene signatures of 500 genes each. The AUC was used to measure drug response because, unlike the IC50, it can always be estimated without extrapolation from the dose-response curve and has been shown to predict drug response more accurately [3].
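The gene selection step described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code (the repository performs this analysis with limma in R): the function name `build_bidirectional_signature`, the toy gene names and the sign convention (a positive t-statistic meaning higher expression in sensitive cell lines) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def build_bidirectional_signature(
    t_stats: Dict[str, float], n_top: int = 250
) -> Tuple[List[str], List[str]]:
    """Split a gene -> t-statistic mapping into (up, down) gene lists.

    Assumption: a positive t-statistic means the gene is more highly
    expressed in sensitive than in resistant cell lines.
    """
    if len(t_stats) < 2 * n_top:
        raise ValueError("need at least 2 * n_top genes to avoid overlap")
    # Rank genes from the most positive to the most negative t-statistic.
    ranked = sorted(t_stats, key=t_stats.get, reverse=True)
    up = ranked[:n_top]            # strongest positive associations
    down = ranked[-n_top:][::-1]   # strongest negative associations first
    return up, down


# Toy example with hypothetical genes and n_top=2 instead of 250.
toy_t_stats = {"A": 3.0, "B": -2.5, "C": 0.1, "D": 1.2, "E": -4.0}
toy_up, toy_down = build_bidirectional_signature(toy_t_stats, n_top=2)
```

With `n_top=250`, the two lists together give the 500 genes of one bidirectional signature.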
Expression and drug response data were retrieved from three independent pharmacogenomics assays: the Cancer Therapeutics Response Portal (CTRP) v2 [4–6], the Genomics of Drug Sensitivity in Cancer (GDSC) v2 [7–9] and the PRISM [10,11] repurposing compendium through the DepMap portal v22Q4 [12]. As these sources are independent, several signatures refer to the same compound. Consequently, the 1,372 transcriptomic signatures that form the SSc breast reflect the predicted response to >1,200 drugs.
To verify that the cancer cell lines included in the signature generation analyses were, in fact, representative of breast cancer patients, we used the corrected lineage reported in the Celligner project [13], which provides a framework to align cancer cell lines to human tumours from large cohorts of human patients such as the Cancer Genome Atlas (TCGA).
References
1. Fustero-Torre C, Jiménez-Santos MJ, García-Martín S, Carretero-Puche C, García-Jimeno L, Ivanchuk V, et al. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med. 2021;13:187.
2. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47.
3. Jang IS, Neto EC, Guinney J, Friend SH, Margolin AA. Systematic assessment of analytical methods for drug sensitivity prediction from cancer cell line data. Pac Symp Biocomput. 2014;63–74.
4. Basu A, Bodycombe NE, Cheah JH, Price EV, Liu K, Schaefer GI, et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell. 2013;154:1151–61.
5. Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price EV, Coletti ME, et al. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer Discov. 2015;5:1210–23.
6. Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol. 2016;12:109–16.
7. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature. 2012;483:570–5.
8. Yang W, Soares J, Greninger P, Edelman EJ, Lightfoot H, Forbes S, et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2013;41:D955–61.
9. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, et al. A Landscape of Pharmacogenomic Interactions in Cancer. Cell. 2016;166:740–54.
10. Yu C, Mannan AM, Yvone GM, Ross KN, Zhang Y-L, Marton MA, et al. High-throughput identification of genotype-specific cancer vulnerabilities in mixtures of barcoded tumor cell lines. Nat Biotechnol. 2016;34:419–23.
11. Corsello SM, Nagari RT, Spangler RD, Rossen J, Kocak M, Bryan JG, et al. Discovering the anti-cancer potential of non-oncology drugs by systematic viability profiling. Nat Cancer. 2020;1:235–48.
12. Ghandi M, Huang FW, Jané-Valbuena J, Kryukov GV, Lo CC, McDonald ER 3rd, et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature. 2019;569:503–8.
13. Warren A, Chen Y, Jones A, Shibue T, Hahn WC, Boehm JS, et al. Global computational alignment of tumor and cell line transcriptional profiles. Nat Commun. 2021;12:22.
The code in this repository is distributed as a Snakemake pipeline. It relies extensively on Snakemake's integration with the conda package manager to take care of software requirements and dependencies automatically.
First, install conda by following the installation instructions.
Next, use the git clone command to create a local copy:
git clone -b breast https://github.com/cnio-bu/drug_susceptibility_collection
And create a snakemake environment:
conda env create -f envs/snakemake.yaml
You need to modify the configuration files:
- config.yaml: Contains all pipeline parameters. Please specify the location of the output and log files.
- datasets.csv: Please indicate the location of the datasets to be analysed.
All required datasets are public and can be downloaded from the indicated sites:
Dataset | File | Download from |
---|---|---|
raw_ccle_reads | OmicsExpressionGenesExpectedCountProfile.csv | DepMap (v22Q4) |
celligner_data | Celligner_info.csv | figshare |
ccle_sample_info | sample_info.csv | DepMap (v22Q2) |
prism_response_curves | secondary-screen-dose-response-curve-parameters.csv | DepMap (v19Q4) |
prism_treatment_info | secondary-screen-replicate-collapsed-treatment-info.csv | DepMap (v19Q4) |
gdsc_response_curves | GDSC2_fitted_dose_response_*.xlsx | GDSC (Accessed 25 February 2020) |
gdsc_compound_meta | gdsc_compound_meta.csv | GDSC (Accessed 25 February 2020) |
crispr_gene_dependency_chronos | CRISPRGeneDependency.csv | DepMap (v22Q4) |
ctrp_response_curves | CTRPv2.0_2015_ctd2_ExpandedDataset/v20.data.curves_post_qc.txt | DepMap (vCTRP CTD^2) |
ctrp_compound_meta | CTRPv2.0_2015_ctd2_ExpandedDataset/v20.meta.per_compound.txt | DepMap (vCTRP CTD^2) |
ctrp_cell_meta | CTRPv2.0_2015_ctd2_ExpandedDataset/v20.meta.per_cell_line.txt | DepMap (vCTRP CTD^2) |
ctrp_experiment_meta | CTRPv2.0_2015_ctd2_ExpandedDataset/v20.meta.per_experiment.txt | DepMap (vCTRP CTD^2) |
ccle_default_line | OmicsDefaultModelProfiles.csv | DepMap (v22Q4) |
hgnc_protein_coding | hgnc_gene_with_protein_product.tsv | HGNC (Accessed 22 March 2023) |
hallmarks | h.all.v2023.1.Hs.symbols.gmt | MSigDB |
Once the pipeline is configured, the user just needs to enter these commands:
conda activate snakemake
snakemake --use-conda -j 200
conda deactivate
The mandatory arguments are:
- --use-conda: To install and use the conda environments.
- -j: Number of threads/jobs provided to snakemake.
After all the jobs have been completed, the SSc breast collection can be found in resultspath/drug_signatures_classic.gmt.
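The resulting file can be inspected with a short script. The sketch below is not part of the pipeline; it only assumes the standard tab-delimited GMT layout (set name, description, then one gene per field). The helper name `read_gmt` and the signature names in the toy input are hypothetical and do not reflect the actual naming used in the collection.

```python
import io
from typing import Dict, List, TextIO


def read_gmt(handle: TextIO) -> Dict[str, List[str]]:
    """Parse a GMT file into a {set name: [genes]} dictionary."""
    signatures: Dict[str, List[str]] = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and sets without genes
        name, _description, *genes = fields
        signatures[name] = [gene for gene in genes if gene]
    return signatures


# Toy input with made-up signature names and genes.
toy_gmt = io.StringIO(
    "drugA_UP\tna\tGENE1\tGENE2\n"
    "drugA_DOWN\tna\tGENE3\n"
)
toy_signatures = read_gmt(toy_gmt)
```

For the real file, the same function can be used with `open("resultspath/drug_signatures_classic.gmt")`, replacing `resultspath` with the output location set in config.yaml.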
- Santiago García-Martín
- María José Jiménez-Santos
If you have any questions, feel free to submit an issue.