Paralog matching method described in “Pairing interacting protein sequences using masked language modeling” (Lupo, Sgarbossa, and Bitbol, 2023). The MSA Transformer model used here was introduced by Rao et al. (2021).
Clone this repository on your local machine by running
git clone git@github.com:Bitbol-Lab/DiffPALM.git
and move inside the root folder. We recommend creating and activating a dedicated conda or virtualenv Python virtual environment. Then, make an editable install of the package:
python -m pip install -e .
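As a minimal sketch of the environment setup recommended above, the commands below create and activate a virtual environment. They assume a Unix-like shell and use Python's built-in `venv` module in place of conda or virtualenv; the directory name `.venv` is an arbitrary choice, not something the repository prescribes. Run them from the root folder before making the editable install:

```shell
# Create a virtual environment in ./.venv (the directory name is an assumption)
python3 -m venv .venv

# Activate it in the current shell session (POSIX-compatible form of `source`)
. .venv/bin/activate
```

Once the environment is active, `python` and `pip` resolve to the copies inside `.venv`, so the editable install stays isolated from the system Python.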
See the _example_prokaryotic.ipynb notebook for an example of paired MSA optimization in the case of well-known prokaryotic datasets, for which ground-truth matchings are given by genome proximity.
Our work can be cited using the following BibTeX entry:
@article{lupo2023pairing,
title={Pairing interacting protein sequences using masked language modeling},
author={Lupo, Umberto and Sgarbossa, Damiano and Bitbol, Anne-Florence},
year={2023},
journal={bioRxiv},
doi={10.1101/2023.08.14.553209}
}
This project has been developed using nbdev.