The goal of this package is to provide a framework for the benchmarking of tumor deconvolution algorithms specifically on proteomics data. To run the platform, please see our primary documentation site.
Here we describe how to contribute to the project. We employ a modular, containerized framework written in the Common Workflow Language to enable plug-and-play assessment of novel algorithms, as shown in the image below.
Because this is a benchmarking platform, we designed its architecture so that others can contribute and add their own customizations. While our documentation site explains how to run the platform, this page focuses on how to contribute.
Once you have written an algorithm, contributing it takes two steps: first, write a script that runs the algorithm, and then integrate that script into our larger script.
To add a tumor deconvolution algorithm, the platform requires that it accept two inputs:
- An expression matrix
- A cell type matrix
As such, we recommend building a Docker container that runs your algorithm, together with a Common Workflow Language script that runs the algorithm with the two inputs (labeled `expression` and `signature`). The expected output is a matrix called `deconvoluted`.
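The input and output contract above can be sketched as a small Python entry point that a container might run. This is only an illustration under stated assumptions: the tab-delimited file layout, the `--expression`, `--signature`, and `--output` flag names, the `deconvoluted.tsv` file name, and the scoring step are all hypothetical. The scoring is a naive placeholder and should be replaced by your actual algorithm.

```python
import argparse
import csv


def read_matrix(path):
    """Read a tab-delimited matrix (assumed layout): header row, then one row per gene."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        columns = header[1:]  # the first header cell labels the gene column
        rows = {line[0]: [float(v) for v in line[1:]] for line in reader if line}
    return columns, rows


def deconvolve(samples, expression, cell_types, signature):
    """Placeholder scoring, NOT a real deconvolution method.

    Each sample/cell-type score is the signature-weighted sum of expression
    over the shared genes, rescaled so that each sample's scores sum to 1.
    """
    shared = [gene for gene in expression if gene in signature]
    result = {cell: [] for cell in cell_types}
    for s in range(len(samples)):
        scores = [
            sum(expression[g][s] * signature[g][c] for g in shared)
            for c in range(len(cell_types))
        ]
        total = sum(scores)
        for c, cell in enumerate(cell_types):
            result[cell].append(scores[c] / total if total else 0.0)
    return result


def write_matrix(path, samples, result):
    """Write the output with cell types as rows and samples as columns."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["cell_type"] + samples)
        for cell, values in result.items():
            writer.writerow([cell] + values)


def main(argv=None):
    # Flag names are hypothetical; match them to your own CWL tool description.
    parser = argparse.ArgumentParser(description="Hypothetical deconvolution wrapper")
    parser.add_argument("--expression", required=True)
    parser.add_argument("--signature", required=True)
    parser.add_argument("--output", default="deconvoluted.tsv")
    args = parser.parse_args(argv)
    samples, expression = read_matrix(args.expression)
    cell_types, signature = read_matrix(args.signature)
    write_matrix(args.output, samples, deconvolve(samples, expression, cell_types, signature))
```

When saved as a script, `main()` would be called from the usual `if __name__ == "__main__":` guard, and the CWL script would bind its `expression` and `signature` inputs to the corresponding flags.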
Once you have a script that can run, you can modify the `run-deconv.cwl` script in the `./tumorDeconvAlgs` directory. This script takes the same parameters as the script described above, plus an additional parameter called `alg`.
Once this is complete, you should be able to run a test command such as:
cwltool https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/prot-deconv.cwl --cancer hnscc --protAlg [yourAlgNameHere] --sampleType tumor --signature LM7c
Once this test command runs successfully, you can create a pull request from your fork.
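If you want to repeat the test for several settings before opening the pull request, the command can be assembled programmatically. The sketch below is only a convenience wrapper around the `cwltool` invocation shown above. The function names are hypothetical, and the default values simply mirror the example command.

```python
import subprocess

WORKFLOW_URL = (
    "https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/"
    "main/metrics/prot-deconv.cwl"
)


def build_test_command(alg_name, cancer="hnscc", sample_type="tumor", signature="LM7c"):
    """Assemble the cwltool invocation as an argument list (no shell quoting needed)."""
    return [
        "cwltool", WORKFLOW_URL,
        "--cancer", cancer,
        "--protAlg", alg_name,
        "--sampleType", sample_type,
        "--signature", signature,
    ]


def run_test(alg_name):
    """Run the workflow and report whether cwltool exited with status 0."""
    completed = subprocess.run(build_test_command(alg_name), check=False)
    return completed.returncode == 0
```

Calling `run_test` requires `cwltool` and Docker to be installed, and `alg_name` should be the name you registered for your algorithm.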
It is important to test new signature matrices as they evolve, and therefore we created a separate module to enable the creation of custom cell-type matrices.
The easiest way to add a custom signature matrix is to copy a weighted matrix into the `./signature_matrices` directory. Each row represents a gene (the first column should contain HGNC gene names) and each column represents a cell type. Once the Docker image rebuilds with this file in the directory, the matrix can be called by the `cwl` script.
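Before copying a matrix into the directory, a quick structural check can catch common formatting mistakes. The sketch below assumes a tab-delimited file with a header row of cell types, which is an assumption about the file format rather than a documented requirement. It checks only structure (gene names present and unique, numeric weights) and does not verify that names are valid HGNC symbols. The function name is hypothetical.

```python
import csv
import io


def check_signature_matrix(handle):
    """Return a list of problems found in a tab-delimited signature matrix."""
    problems = []
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) < 2:
        return ["header must have a gene column plus at least one cell type"]
    seen = set()
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        gene, values = row[0].strip(), row[1:]
        if not gene:
            problems.append(f"line {line_no}: missing gene name")
        elif gene in seen:
            problems.append(f"line {line_no}: duplicate gene {gene}")
        seen.add(gene)
        if len(values) != len(header) - 1:
            problems.append(f"line {line_no}: expected {len(header) - 1} values")
            continue
        for value in values:
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: non-numeric weight {value!r}")
    return problems


# Tiny in-memory example with made-up weights for two cell types.
example = io.StringIO("gene\tTcell\tBcell\nCD3E\t0.9\t0.1\nCD19\t0.0\t1.2\n")
print(check_signature_matrix(example))  # prints []
```

For a file on disk, pass an open handle instead, for example `open(path, newline="")`.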
We also try to keep our documentation site up to date. If you have updates to it, please create a pull request that edits the `docs/index.md` page.
To implement your algorithm in this framework, you will need a CWL engine and Docker installed.
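A quick way to confirm both prerequisites are available is to look for their executables on your `PATH`. The sketch below assumes `cwltool` is your CWL engine, since it is the one used in the test command. Substitute the executable name of whichever engine you use. The function name is hypothetical.

```python
import shutil


def missing_tools(tools=("cwltool", "docker")):
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Please install: " + ", ".join(missing))
else:
    print("All required tools were found.")
```

Note that this only checks that the executables exist. It does not confirm that the Docker daemon is running.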