SeNMo is a deep learning model designed to enhance the analysis of multi-omics data in oncology. This repository contains the code and instructions for training, fine-tuning, testing, and ensemble testing of the SeNMo model.
- Requirements
- Running the SeNMo Training
- Running the SeNMo Finetuning
- Running the SeNMo Testing
- Running the SeNMo Ensemble Testing
- Checkpoints
- Embeddings
- Paper
- Future Work
Ensure you have all the necessary dependencies installed. You can install them using the requirements.txt file:
pip install -r requirements.txt
To train the SeNMo model with the specified parameters, use the following command:
python SeNMo_Training.py \
--regression True \
--finetune False \
--exp_name surv \
--act_type None \
--reg_type all \
--disease pancancer_combined \
--task surv \
--gpu_ids 0 \
--lr 0.0005811726189177087 \
--weight_decay 0.005978947728252338 \
--dropout_rate 0.10583716299176746 \
--batch_size 256 \
--dataroot <path_to_data> \
--checkpoints_dir <path_to_checkpoints> \
--input_size_omic 80697
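When this command is launched repeatedly, it can help to keep the flags in one place. The following is a minimal sketch that rebuilds the training command above from a dictionary and prints it for inspection. The helper `build_command` is hypothetical and not part of this repository, and the two paths stay as placeholders exactly as in the command above.

```python
import shlex


def build_command(script, flags):
    """Return an argv list for `python <script> --name value ...`.

    Hypothetical helper for illustration; it is not part of the SeNMo code.
    """
    argv = ["python", script]
    for name, value in flags.items():
        argv += [f"--{name}", str(value)]
    return argv


# Flag values copied verbatim from the training command above.
# They are kept as strings so no digits are lost to float formatting.
train_flags = {
    "regression": "True",
    "finetune": "False",
    "exp_name": "surv",
    "act_type": "None",
    "reg_type": "all",
    "disease": "pancancer_combined",
    "task": "surv",
    "gpu_ids": "0",
    "lr": "0.0005811726189177087",
    "weight_decay": "0.005978947728252338",
    "dropout_rate": "0.10583716299176746",
    "batch_size": "256",
    "dataroot": "<path_to_data>",                # placeholder, replace before running
    "checkpoints_dir": "<path_to_checkpoints>",  # placeholder, replace before running
    "input_size_omic": "80697",
}

train_cmd = build_command("SeNMo_Training.py", train_flags)

# Print a shell-quoted version for review instead of executing it.
print(shlex.join(train_cmd))
```

Once the placeholders are replaced with real paths, the list can be passed to `subprocess.run(train_cmd, check=True)`.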
To finetune the SeNMo model with the specified parameters, use the following command:
python SeNMo_Training.py \
--regression True \
--finetune True \
--exp_name surv \
--reg_type all \
--act_type None \
--disease pancancer_indl_cancers \
--task surv \
--gpu_ids 0 \
--lr 0.000040189177087 \
--weight_decay 0.35 \
--dropout_rate 0.35 \
--batch_size 16 \
--niter_decay 8 \
--dataroot <path_to_data> \
--checkpoints_dir <path_to_checkpoints> \
--pretrained_model_dir <path_to_pretrained_model> \
--cancer 'CPTAC-LUSC' \
--frozen_layers 0 \
--input_size_omic 80697
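If several cohorts need fine-tuning, the command above can be generated once per cohort by changing only `--cancer`. The sketch below assumes that is the only flag that needs to vary, which the source does not confirm; you may also want a separate `--exp_name` or `--checkpoints_dir` per run. The helper `finetune_commands` is hypothetical, and `CPTAC-LUSC` is the only cohort name taken from this README.

```python
import shlex

# Flag values copied verbatim from the fine-tuning command above (all except --cancer).
FINETUNE_FLAGS = {
    "regression": "True",
    "finetune": "True",
    "exp_name": "surv",
    "reg_type": "all",
    "act_type": "None",
    "disease": "pancancer_indl_cancers",
    "task": "surv",
    "gpu_ids": "0",
    "lr": "0.000040189177087",
    "weight_decay": "0.35",
    "dropout_rate": "0.35",
    "batch_size": "16",
    "niter_decay": "8",
    "dataroot": "<path_to_data>",                          # placeholder
    "checkpoints_dir": "<path_to_checkpoints>",            # placeholder
    "pretrained_model_dir": "<path_to_pretrained_model>",  # placeholder
    "frozen_layers": "0",
    "input_size_omic": "80697",
}


def finetune_commands(cohorts, flags=FINETUNE_FLAGS):
    """Yield one argv list per cohort, varying only --cancer (hypothetical helper)."""
    for cohort in cohorts:
        run_flags = {**flags, "cancer": cohort}
        argv = ["python", "SeNMo_Training.py"]
        for name, value in run_flags.items():
            argv += [f"--{name}", value]
        yield argv


# Only the cohort named in this README; extend with the cohorts present in your data.
cohorts = ["CPTAC-LUSC"]
commands = list(finetune_commands(cohorts))

for argv in commands:
    # Dry run: print each command for review rather than executing it.
    print(shlex.join(argv))
```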
To test the SeNMo model with the specified parameters, use the following command:
python SeNMo_Testing.py \
--regression True \
--exp_name surv \
--reg_type all \
--act_type None \
--disease pancancer_combined \
--task surv \
--gpu_ids 0 \
--lr 0.0005811726189177087 \
--weight_decay 0.005978947728252338 \
--dropout_rate 0.10583716299176746 \
--batch_size 1 \
--niter_decay 8 \
--dataroot <path_to_data> \
--checkpoints_dir <path_to_checkpoints> \
--pretrained_model_dir <path_to_pretrained_model> \
--frozen_layers 0 \
--input_size_omic 80697
To perform ensemble testing with the SeNMo model, use the following command:
python SeNMo_Ensemble.py \
--regression True \
--exp_name surv \
--reg_type all \
--act_type None \
--disease pancancer_combined \
--task surv \
--gpu_ids 0 \
--lr 0.0005811726189177087 \
--weight_decay 0.005978947728252338 \
--dropout_rate 0.10583716299176746 \
--batch_size 1 \
--dataroot <path_to_data> \
--checkpoints_dir <path_to_checkpoints> \
--pretrained_model_dir <path_to_pretrained_model> \
--input_size_omic 80697
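This README does not describe how `SeNMo_Ensemble.py` combines its models. As a conceptual illustration only, one common form of ensembling averages the per-sample predictions of several checkpoints. The sketch below assumes that scheme; the function name and the toy scores are hypothetical and are not taken from the repository.

```python
from statistics import mean


def average_predictions(per_model_scores):
    """Average per-sample scores across models.

    per_model_scores: a list with one entry per checkpoint, each entry being
    a list of scores for the same samples in the same order.
    """
    if not per_model_scores:
        raise ValueError("need predictions from at least one model")
    n_samples = len(per_model_scores[0])
    if any(len(scores) != n_samples for scores in per_model_scores):
        raise ValueError("every model must score the same samples")
    # zip(*...) regroups the scores so each tuple holds one sample's
    # predictions from all models.
    return [mean(sample_scores) for sample_scores in zip(*per_model_scores)]


# Toy scores for three samples from two hypothetical checkpoints.
scores = [
    [0.25, 1.0, -0.5],
    [0.75, 0.0, -0.5],
]
ensemble = average_predictions(scores)
print(ensemble)
```

Averaging is only one option; the actual script may weight checkpoints or combine them differently.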
The pre-trained model checkpoints are available here, and separately in two parts: part-1 and part-2.
Access the TCGA molecular embeddings here, and separately here.
For detailed information, refer to the arXiv paper.
In progress: creating class packages for CLI pre-processing and for generating embeddings at inference time.
- Link to MINDS dataset
- Link to HoneyBee