A neural network model for predicting amino-acid probabilities from a protein backbone structure.
- pytorch
- numpy
- pandas
- tqdm
To install gcndesign through pip:
pip install gcndesign
from gcndesign.predictor import Predictor
gcndes = Predictor(device='cpu') # 'cuda' can also be used
gcndes.predict(pdb='pdb-file-path') # returns a list of amino-acid probabilities
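The per-residue probabilities can be reduced to a single most-likely sequence. The following is a minimal sketch, not part of gcndesign: it assumes each site is given as a plain sequence of 20 probabilities, and the amino-acid ordering in `AMINO_ACIDS` is a hypothetical choice made for illustration. The actual return format and ordering of `Predictor.predict` should be checked before adapting it.

```python
# Assumption: the 20 standard amino acids in alphabetical one-letter order.
# The ordering actually used by gcndesign may differ; verify before relying on it.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def most_probable_sequence(probabilities, alphabet=AMINO_ACIDS):
    """Build a sequence from the highest-probability amino acid at each site.

    `probabilities` is assumed to be a list with one entry per residue site,
    each entry holding one probability per letter of `alphabet`.
    """
    sequence = []
    for site in probabilities:
        if len(site) != len(alphabet):
            raise ValueError("each site needs one probability per amino acid")
        # Index of the largest probability at this site.
        best = max(range(len(site)), key=lambda i: site[i])
        sequence.append(alphabet[best])
    return "".join(sequence)


# Toy input: two sites, with all probability mass on 'A' and then on 'W'.
toy = [[0.0] * 20 for _ in range(2)]
toy[0][0] = 1.0
toy[1][18] = 1.0
print(most_probable_sequence(toy))
```

Taking the per-site maximum ignores correlations between sites, so it is only a quick way to inspect a prediction rather than a design protocol.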
gcndesign_predict.py
To predict amino-acid probabilities for each residue site
gcndesign_predict.py YOUR_BACKBONE_STR.pdb
gcndesign_autodesign.py
To design 20 sequences fully automatically
gcndesign_autodesign.py YOUR_BACKBONE_STR.pdb -n 20
For more detailed usage, please run the following command
gcndesign_autodesign.py -h
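To run the automatic design over several backbones, the command shown above can be wrapped in a small script. This is a sketch under stated assumptions: it uses only the `-n` option documented here, assumes `gcndesign_autodesign.py` is on the `PATH`, and the helper names `autodesign_command` and `run_autodesign` are hypothetical.

```python
import subprocess
from pathlib import Path


def autodesign_command(pdb_path, n_designs=20):
    """Build the argument list for one gcndesign_autodesign.py run."""
    return ["gcndesign_autodesign.py", str(pdb_path), "-n", str(n_designs)]


def run_autodesign(pdb_dir, n_designs=20, dry_run=True):
    """Run (or, with dry_run=True, only print) the command for every .pdb file in pdb_dir."""
    pdb_files = sorted(Path(pdb_dir).glob("*.pdb"))
    commands = [autodesign_command(p, n_designs) for p in pdb_files]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the script exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_autodesign("backbones", n_designs=20)` only prints the commands; pass `dry_run=False` to execute them, which requires PyRosetta to be installed.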
Note
The gcndesign_autodesign script requires PyRosetta. Installation and use of PyRosetta must comply with its license.
- gcndesign_autodesign.py: PyRosetta
Note
A critical issue has been fixed and the parameters were re-trained on a new dataset (CATH v4.3 S40 dataset). This change has stabilized the predictions but is not yet reflected in the document above, so its description and figures contain inaccuracies.
The dataset used for training GCNdesign is available here
- dataset.tar.gz: Training/T500/TS50 dataset
- dataset_cath40.tar.bz2: CATH-v4.3 S40 dataset (used for the latest parameter training)
Distributed under MIT license.
The author was supported by a Grant-in-Aid for JSPS Research Fellows (PD, 17J02339). The Koga Laboratory at the Institute for Molecular Science (NINS, Japan) provided part of the computational resources. Koya Sakuma (yakomaxa) contributed a critical idea for the neural network architecture over many in-depth discussions. Naoya Kobayashi (naokob) created excellent applications that serve broader needs: ColabGCNdesign and FolditStandalone_Sequence_Design.