SeNMo: Self-Normalizing Foundation Model for Enhanced Multi-Omics Data Analysis in Oncology

Overview

SeNMo is a deep learning model designed to enhance the analysis of multi-omics data in oncology. This repository contains the code and instructions for training, fine-tuning, testing, and ensemble testing of the SeNMo model.

Requirements

Ensure you have all the necessary dependencies installed. You can install them using the requirements.txt file:

pip install -r requirements.txt
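
To confirm that the install succeeded, one option is to compare requirements.txt against the packages visible to the current interpreter. The following is a minimal sketch using only the Python standard library. The helper names `requirement_names` and `missing_packages` are hypothetical and not part of this repository, and the parsing only handles simple `name==version` style lines.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(text):
    """Extract bare package names from the text of a requirements file."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e, ...)
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.exists():
        missing = missing_packages(requirement_names(req_file.read_text()))
        print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it from the repository root, in the same environment used for `pip install`. It checks only that each package is present, not that the installed version satisfies the pinned constraint.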

Running the SeNMo Training

To train the SeNMo model with the specified parameters, use the following command:

python SeNMo_Training.py \
    --regression True \
    --finetune False \
    --exp_name surv \
    --act_type None \
    --reg_type all \
    --disease pancancer_combined \
    --task surv \
    --gpu_ids 0 \
    --lr 0.0005811726189177087 \
    --weight_decay 0.005978947728252338 \
    --dropout_rate 0.10583716299176746 \
    --batch_size 256 \
    --dataroot <path_to_data> \
    --checkpoints_dir <path_to_checkpoints> \
    --input_size_omic 80697
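
For scripted or repeated runs, the same command can be assembled and launched from Python. This is a sketch only: the flag names and values are copied from the training command above, while `TRAIN_ARGS`, `build_command`, and `launch` are hypothetical helpers that do not exist in this repository. It also assumes the script does not depend on the order in which flags are passed.

```python
import shlex
import subprocess
import sys

# Flag values copied verbatim from the training command in this README.
TRAIN_ARGS = {
    "regression": "True",
    "finetune": "False",
    "exp_name": "surv",
    "act_type": "None",
    "reg_type": "all",
    "disease": "pancancer_combined",
    "task": "surv",
    "gpu_ids": "0",
    "lr": "0.0005811726189177087",
    "weight_decay": "0.005978947728252338",
    "dropout_rate": "0.10583716299176746",
    "batch_size": "256",
    "input_size_omic": "80697",
}


def build_command(script, args, dataroot, checkpoints_dir):
    """Turn a dict of flags into an argv list for the given script."""
    cmd = [sys.executable, script]
    merged = {**args, "dataroot": dataroot, "checkpoints_dir": checkpoints_dir}
    for flag, value in merged.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


def launch(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


if __name__ == "__main__":
    # Replace the placeholders with real paths before setting dry_run=False.
    command = build_command(
        "SeNMo_Training.py", TRAIN_ARGS, "<path_to_data>", "<path_to_checkpoints>"
    )
    launch(command, dry_run=True)
```

Keeping the hyperparameters in a dictionary makes it easy to log them alongside each run. The dry-run default prints the exact command for inspection before anything is executed.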

Running the SeNMo Finetuning

To fine-tune the SeNMo model from a pretrained checkpoint on a single cohort (CPTAC-LUSC in this example), use the following command:

python SeNMo_Training.py \
    --regression True \
    --finetune True \
    --exp_name surv \
    --reg_type all \
    --act_type None \
    --disease pancancer_indl_cancers \
    --task surv \
    --gpu_ids 0 \
    --lr 0.000040189177087 \
    --weight_decay 0.35 \
    --dropout_rate 0.35 \
    --batch_size 16 \
    --niter_decay 8 \
    --dataroot <path_to_data> \
    --checkpoints_dir <path_to_checkpoints> \
    --pretrained_model_dir <path_to_pretrained_model> \
    --cancer 'CPTAC-LUSC' \
    --frozen_layers 0 \
    --input_size_omic 80697

Running the SeNMo Testing

To test the SeNMo model with the specified parameters, use the following command:

python SeNMo_Testing.py \
    --regression True \
    --exp_name surv \
    --reg_type all \
    --act_type None \
    --disease pancancer_combined \
    --task surv \
    --gpu_ids 0 \
    --lr 0.0005811726189177087 \
    --weight_decay 0.005978947728252338 \
    --dropout_rate 0.10583716299176746 \
    --batch_size 1 \
    --niter_decay 8 \
    --dataroot <path_to_data> \
    --checkpoints_dir <path_to_checkpoints> \
    --pretrained_model_dir <path_to_pretrained_model> \
    --frozen_layers 0 \
    --input_size_omic 80697

Running the SeNMo Ensemble Testing

To perform ensemble testing with the SeNMo model, use the following command:

python SeNMo_Ensemble.py \
    --regression True \
    --exp_name surv \
    --reg_type all \
    --act_type None \
    --disease pancancer_combined \
    --task surv \
    --gpu_ids 0 \
    --lr 0.0005811726189177087 \
    --weight_decay 0.005978947728252338 \
    --dropout_rate 0.10583716299176746 \
    --batch_size 1 \
    --dataroot <path_to_data> \
    --checkpoints_dir <path_to_checkpoints> \
    --pretrained_model_dir <path_to_pretrained_model> \
    --input_size_omic 80697

Checkpoints

The pre-trained model checkpoints are available here, and also as a two-part download: part-1 and part-2.

Embeddings

Access the TCGA molecular embeddings here, and separately here.

Paper

For detailed information, refer to the arXiv paper.

Future Work

In progress: creating class packages for CLI pre-processing and for generating embeddings at inference time.

Link to MINDS dataset
Link to HoneyBee
