UPDATES |
---|
2024.8.15 : chiLife 1.0 released! |
2024.8.15 : chiLife can now be used to create arbitrary peptides with natural and NCAAs. Checkout the make_peptide function. |
2024.8.15 : chiLife can make NCAA structures from smiles. Checkout smiles2residue . Note: Requires RDKit to be installed. |
chiLife now supports arbitrary backbone attachments including DNA and RNA labels and more! |
chiLife is a Python package for modeling non-canonical amino acid side chain ensembles and using those ensembles to
predict experimental observables. Currently, it is focused primarily on site-directed spin labels (SDSLs). The goal of
chiLife is to provide a simple, flexible and interoperable Python interface to protein side chain ensemble modeling,
allowing for rapid development of custom analysis and modeling pipelines. This is facilitated by the use of RotamerEnsemble
and SpinLabel
objects with standard interfaces for all supported side chain types, side chain modeling methods and
protein modeling methods. Flexibility is achieved by allowing users to create and use custom RotamerEnsemble
and
SpinLabel
objects as well as custom side chain modeling methods. Interoperability sought by interactions with other
Python-based molecular modeling packages. This enables the use of experimental data, like double electron-electron
resonance (DEER), in other standalone protein modeling applications that allow user-defined restraints, such as
PyRosetta and NIH-Xplor.
Stable distributions of chiLife can be installed using pip
.
pip install chiLife
Alternatively, the development version can be installed by downloading and unpacking the GitHub repository, or using
git clone
followed by a standard Python setuptools installation.
git clone https://github.com/StollLab/chiLife.git
cd chiLife
pip install -e . # Install as editable and update using `git pull origin main`
The central entity of chiLife is the SpinLabel
object, which inherits from the more abstract RotamerEnsemble
object. While most people will primarily use SpinLabel
objects, be aware that most properties and functions
discussed are also functional on RotamerLibrary
objects as well. SpinLabel
objects can be created and "attached" to
protein models easily and quickly, allowing for scriptable analysis or on-the-fly simulation of distance distributions while modeling.
Attaching a SpinLabel
to a protein does not alter the protein in any way, allowing the protein model to retain the native amino acid.
import numpy as np
import matplotlib.pyplot as plt
import chilife as xl
# Download protein structure from PDB
MBP = xl.fetch('1omp', save=True)
# Create Spin lables
SL1 = xl.SpinLabel('R1M', site=20, chain='A', protein=MBP)
SL2 = xl.SpinLabel('R1M', site=238, chain='A', protein=MBP)
# Calculate distribution
r = np.linspace(0, 100, 256)
P = xl.distance_distribution(SL1, SL2, r=r)
# Plot distribution
fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(r, P)
ax.set_yticks([])
ax.set_xlabel('Distance ($\AA$)')
for spine in ['left', 'top', 'right']:
ax.spines[spine].set_visible(False)
plt.show()
The side chain ensembles can then be saved using a simple save
function that accepts an arbitrary number of RotamerEnsemble
, SpinLabel
, MDAnalyisis.Universe
and MDAnalyiss.AtomGroup
objects. Because
RotamerEnsemble
/SpinLabel
objects do not mutate the underlying protein, they are saved as separate multi-state
objects and can be visualized with applications like PyMOL. If you do wish to permanently alter the underlying protein
structure, you can use the mutate
function described below.
# Save structure
xl.save('MBP_L20R1_S238R1.pdb', SL1, SL2, MBP)
One of the benefits of chiLife is the variety and customizable nature of spin label modeling methods. This includes
methods to sample side chain conformations that deviate from canonical dihedral angles and fixed rotamer libraries
(off-rotamer sampling) and methods to repack a SpinLabel
and its neighboring amino acids.
import chilife as xl
MBP = xl.fetch('1omp')
# Create a SpinLabel object using the MTSSLWizard 'Accessible Volume' Approach
SL1 = xl.SpinLabel.from_wizard('R1M', site=20, chain='A', protein=MBP)
# Create a SpinLabel object by sampling off-rotamer dihedral conformations using the rotamer library as a prior
SL2 = xl.SpinLabel('R1M', site=238, chain='A', sample=2000, protein=MBP)
# Create a SpinLabel object from a ProEPR.repack trajectory
traj, de = xl.repack(SL1, SL2, protein=MBP)
The repack
function performs a Markov chain Monte Carlo sampling (MCMC) repack of the spin labels, SL1
and SL2
and
neighboring side chains, returning an MDAnalysis.Universe
object containing all accepted structures of the MCMC
trajectory, the energy function changes at each acceptance step and new SpinLabel objects attached to the lowest-energy
structure of the trajectory.
SpinLabel objects and neighboring side chains can be repacked using off-rotamer sampling by using the off_rotamer=True
option. In the event off-rotamer sampling is being used for repacking, it is likely that the desired SpinLabel
object is
not the default rotamer ensemble attached to the lowest-energy structure, but instead the ensemble of side chains
created in the MCMC sampling trajectory. This can be done using the from_trajectory
class method.
# Create a SpinLabel object from a xl.repack trajectory with off-rotamer sampling
traj, de = xl.repack(SL1, SL2, protein=MBP, off_rotamer=True)
SL1 = xl.SpinLabel.from_trajectory(traj, site=238)
Note: if you are creating a SpinLabel object from a label that is unknown to chilife you will have to specify which atoms the spin density primarily resides on. this is done with the
spin_atoms
kwarg, e.g.SL1 = xl.SpinLabel.from_trajectory(traj, site=238, spin_atoms=['N1', 'O1'])
When repacking, off-rotamer sampling can be controlled for each dihedral angle separately by passing a list of bools to
the off_rotamer keyword. For example, passing off_rotamer = [False, False, False, True, True]
will allow for off-rotamer
sampling of only χ4 and χ5.
Sometimes you don't want an entire rotamer ensemble, but just a protein structure mutated at a particular site with
the most probable spin label conformer. This can be done easily with the mutate
function.
import chilife as xl
MBP = xl.fetch('1omp')
SL = xl.SpinLabel('R1M', 238, protein=MBP)
MBP_S238R1 = xl.mutate(MBP, SL)
xl.save('MBP_S238R1.pdb', MBP_S238R1)
chiLife can mutate several sites at once, and it can mutate canonical amino acids as well.
SL1 = xl.SpinLabel('R1M', 20, protein=MBP)
SL2 = xl.SpinLabel('R1M', 238, protein=MBP)
L284V = xl.RotamerEnsemble('VAL', 284, protein=MBP)
Mutating adjacent sites is best done with the repack
function to avoid clashes between SpinLabels
/RotamerEnsembles
.
This returns a trajectory which can be used to pick the last or lowest-energy frame as the mutated protein.
MBP_L284V_L20R1_S238R1, _, _ = xl.repack(SL1, SL2, L284V, protein=MBP)
Site-directed spin labels, and other non-canonical amino acids, are constantly being developed. Additionally, rotamer libraries for existing labels continuously undergo incremental improvements or modification to suit particular needs, e.g. a rotamer library specifically for transmembrane residues. In fact chiLife itself can be used to develop new and improved, or application-specific rotamer libraries. To this end, chiLife makes it easy to create user-defined spin labels and custom rotamer libraries. To create a custom rotamer library, all that is needed is (1) a pdb file of the spin label, (2) a list of the rotatable dihedral bonds, and (3) a list of the atoms that carry spin density.
xl.create_library(name='TRT_1.0',
resname='TRT',
pdb='test_data/trt.pdb',
dihedral_atoms=[['N', 'CA', 'CB', 'SG'],
['CA', 'CB', 'SG', 'SD'],
['CB', 'SG', 'SD', 'CAD'],
['SG', 'SD', 'CAD', 'CAE'],
['SD', 'CAD', 'CAE', 'OAC']],
spin_atoms='CAQ')
This function creates a portable TRT_1.0_rotlib.npz
file that can be provided when creating a SpinLabel
object.
xl.SpinLabel('TRT', site=238, protein=MBP, rotlib='TRT_1.0', sample=5000)
Thus, the file can be easily shared with coworkers, collaborators or with other chiLife users.
NOTE: In the above example, the
rotlib
keyword is only used for demonstration purposes. chiLife always searches the current working directory for rotamer library files first. If there is aXYZ_rotlib.npz
in the working directory and you specifyxl.SpinLabel('XYZ', ...)
, chiLife will assume you want to use theXYZ_rotlib.npz
rotamer library.
User-defined labels can be constructed from a single-state pdb file or a multi-state PDB file. If constructed from a
single-state pdb file, a list of dihedral angles and weights can be passed via the dihedrals
and weigts
keyword
arguments. For each set of dihedral angles, chiLife create a rotamer and store the whole library using the specified
name. Alternatively, using a multi-state PDB file can add some additional information, such as isomeric heterogeneity of
the rotamer library.
For more information on how to use chiLife see examples and the workshop repository.
When you are using chiLife in your work, please cite:
Tessmer, M.H.; Stoll, S. chiLife: An open-source Python package for in silico spin labeling and integrative protein modeling. Plos Comput Biol. 2023, 19:e1010834. https://doi.org/10.1371/journal.pcbi.1010834
If you are using off-rotamer sampling, please cite:
Tessmer, M.H.; Canarie, E.R.; Stoll, S. Comparative evaluation of spin label modeling methods for protein structural studies. Biophys J. 2022, 121, 3508-3519. https://doi.org/10.1016/j.bpj.2022.08.002
When using bifunctional label modeling, please cite:
Tessmer, M.H.; Stoll, S. A Rotamer Library Approach to Modeling Side Chain Ensembles of the Bifunctional Spin Label RX. Appl. Magn. Reson. 2023, 55, 127–140. https://doi.org/10.1007/s00723-023-01576-1
Hasanbasri, Z.; Tessmer, M.H.; Stoll, S.; Saxena, S. Modeling of Cu(ii)-based protein spin labels using rotamer libraries. Phys. Chem. Chem. Phys. 2024, 26, 6806-6816. https://doi.org/10.1039/D3CP05951K
Note than many rotamer libraries have their own references. Please use the chilfe.rotlib_info()
function
on the rotamer libraries to check if there are any additional citations that should be referenced.