Pallatom is an innovative protein generation model that produces protein structures with all-atom coordinates. By learning and modeling the joint distribution
To set up the environment for running Pallatom, follow these steps:
- Create and activate a conda environment:
    conda create --name pallatom python=3.7.16
    conda activate pallatom
- Install JAX. First, install the specific version of JAX needed for this project:
    pip install jax==0.3.25
    pip install "jax[cuda]"==0.3.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
- Install other dependencies. Finally, install the additional required packages from requirements.txt:
    pip install -r requirements.txt
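After installation, it can help to confirm that JAX imports cleanly and sees a GPU. The snippet below is a minimal sketch, not part of the Pallatom repository; the helper name describe_jax_backend is hypothetical, and it relies only on jax.devices() and jax.default_backend().

```python
import importlib


def describe_jax_backend():
    """Return a one-line summary of the JAX install, or the reason it is unusable."""
    try:
        jax = importlib.import_module("jax")
    except (ImportError, RuntimeError) as err:
        # Missing package, or a jax/jaxlib version mismatch raised at import time.
        return "jax import failed: {}".format(err)
    try:
        devices = jax.devices()  # devices of the default backend
    except RuntimeError as err:
        return "jax {} found, but no backend initialised: {}".format(
            jax.__version__, err
        )
    return "jax {} on backend '{}' with {} device(s)".format(
        jax.__version__, jax.default_backend(), len(devices)
    )


if __name__ == "__main__":
    print(describe_jax_backend())
```

If the reported backend is cpu rather than gpu, the CUDA build of jaxlib was most likely not picked up.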
If JAX 0.3.25 and Python 3.7 are incompatible with a newer CUDA version on your system, use the following alternative setup with Python 3.10 and JAX on CUDA 12.6:
- Create and activate a conda environment:
    conda create --name pallatom python=3.10
    conda activate pallatom
- Install basic dependencies:
    pip install biopython==1.79 dm-tree==0.1.8 chex==0.1.86 dm-haiku==0.0.12 immutabledict==2.0.0 ml-collections==0.1.0 numpy==1.24.3 pandas==2.0.3 scipy==1.11.1 tensorflow-cpu==2.16.1 rdkit einops tqdm
- Install JAX with CUDA support:
    pip install "jax[cuda]"==0.4.34 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
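For the Python 3.10 setup, the pinned versions can be compared against what is actually installed. The following is a minimal sketch under the assumption that the pins in the pip commands above are the intended ones; find_mismatches and PINNED are hypothetical names, and importlib.metadata requires Python 3.8 or later, so this does not apply to the Python 3.7 environment.

```python
from importlib import metadata  # standard library from Python 3.8 onward

# Pins copied from the pip commands above; unpinned packages are not checked.
PINNED = {
    "biopython": "1.79",
    "dm-tree": "0.1.8",
    "chex": "0.1.86",
    "dm-haiku": "0.0.12",
    "immutabledict": "2.0.0",
    "ml-collections": "0.1.0",
    "numpy": "1.24.3",
    "pandas": "2.0.3",
    "scipy": "1.11.1",
    "tensorflow-cpu": "2.16.1",
    "jax": "0.4.34",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print("{}: expected {}, found {}".format(name, expected, installed))
```

An empty result means every pinned package matches; the get_version parameter exists only so the check can be exercised without touching the real environment.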
To run the Pallatom model sampling process, use the pallatom.py script. Below is an example of how to use the script with command-line arguments:
python pallatom.py --savepath ./results --L 100 --cuda_devices 0 --t_min 0.01 --t_max 1.0 --gamma 0.2 --step_scale 2.25 --T 200 --rounds 10
Arguments:
- data_dir: Directory where model parameters are stored (default: ./)
- model_name: Name of the model to use (default: Pallatom)
- savepath: Directory where results will be saved (default: ./results)
- L: Length of the sequence to sample (default: 120)
- batch_num: Number of batches to run (default: 4)
- cuda_devices: CUDA visible device (default: 0)
- t_min: Minimum noise level for add_noise_level (default: 0.01)
- t_max: Maximum noise level for add_noise_level (default: 1.0)
- gamma: Gamma value for add_noise_level (default: 0.2)
- step_scale: Scale of the step (default: 2.25)
- T: Number of steps for the sampling process (default: 200)
- rounds: Number of rounds to run (default: 1)
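The arguments and defaults listed above can be mirrored in an argparse parser, which is handy when wrapping the script or validating settings before a long run. This is an illustrative sketch, not the parser from pallatom.py: the argument types, and the `--` flag spelling for options that do not appear in the example command (data_dir, model_name, batch_num), are assumptions.

```python
import argparse


def build_parser():
    """Sketch of a parser mirroring the documented options; types are assumed."""
    p = argparse.ArgumentParser(description="Pallatom sampling (illustrative sketch)")
    p.add_argument("--data_dir", default="./", help="directory with model parameters")
    p.add_argument("--model_name", default="Pallatom", help="name of the model")
    p.add_argument("--savepath", default="./results", help="output directory")
    p.add_argument("--L", type=int, default=120, help="sequence length to sample")
    p.add_argument("--batch_num", type=int, default=4, help="number of batches")
    p.add_argument("--cuda_devices", default="0", help="CUDA visible device")
    p.add_argument("--t_min", type=float, default=0.01, help="minimum noise level")
    p.add_argument("--t_max", type=float, default=1.0, help="maximum noise level")
    p.add_argument("--gamma", type=float, default=0.2, help="gamma value")
    p.add_argument("--step_scale", type=float, default=2.25, help="step scale")
    p.add_argument("--T", type=int, default=200, help="number of sampling steps")
    p.add_argument("--rounds", type=int, default=1, help="number of rounds")
    return p


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Parsing the example command with this sketch yields L=100 and rounds=10, while unspecified options such as batch_num fall back to the documented defaults.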
The results, including the generated sequences in FASTA format and protein structures in PDB format, will be saved in the specified savepath directory.
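To gather the outputs for downstream analysis, the files in the savepath directory can be grouped by type. The sketch below assumes that structures end in .pdb and sequences in .fasta or .fa; the exact file names and suffixes written by pallatom.py are not documented here, so adjust them as needed. The helper name collect_outputs is hypothetical.

```python
from pathlib import Path


def collect_outputs(savepath, fasta_suffixes=(".fasta", ".fa"), pdb_suffix=".pdb"):
    """Group files under savepath into sequences and structures by suffix."""
    root = Path(savepath)
    # Search recursively in case results are written into subdirectories.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return {
        "fasta": [p for p in files if p.suffix.lower() in fasta_suffixes],
        "pdb": [p for p in files if p.suffix.lower() == pdb_suffix],
    }


if __name__ == "__main__":
    outputs = collect_outputs("./results")
    print("{} FASTA file(s), {} PDB file(s)".format(
        len(outputs["fasta"]), len(outputs["pdb"])
    ))
```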
If you find Pallatom useful in your research, please consider citing our work:
@article {Qu2024.08.16.608235,
author = {Qu, Wei and Guan, Jiawei and Ma, Rui and Zhai, Ke and Wu, Weikun and Wang, Haobo},
title = {P(all-atom) Is Unlocking New Path For Protein Design},
year = {2024},
doi = {10.1101/2024.08.16.608235},
journal = {bioRxiv}
}
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.