Scaffold constrained generation

Implementation of the methods described in:

We build on existing Recurrent Neural Network models for SMILES to perform scaffold constrained generation. Scaffold constrained generation and optimization is not a well studied problem, yet when dealing with drug discovery projects (especially in late stage optimization of compounds) it is the problem we are trying to solve.

Reproducing the results of the paper

There are four different notebooks:

Distribution_learning_benchmarks.ipynb to reproduce results for distribution learning related benchmarks, on SureChEMBL and on DRD2
Focused_learning_experiments.ipynb to reproduce the figure showing the results of focused learning
MMP12_experiments.ipynb that shows the results on the MMP 12 experiments
Minimal_working_example.ipynb is a starting point for using the model with a given scaffold

To reproduce the results of a given notebook, simply launch the notebook and run all cells.

Why build on REINVENT's codebase?

This repository is a fork from the Reinvent's codebase (https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0235-x). One of the major advantages of the algorithms proposed in our work on scaffold constrained generation is that we build on existing models. To go through with this idea of building on existing solutions rather than designing a totally new system, we provide code for our method that builds on an existing, popular codebase: https://github.com/MarcusOlivecrona/REINVENT. This way people who are already familiar with this codebase can get started very easily instead of having to learn how a totally new repository works. We thank the authors of the original paper for providing a clear codebase for reproducing their work and building on it.

Requirements

This package requires:

Python 3.6
PyTorch >= 0.4.1
RDkit
Scikit-Learn (for QSAR scoring function)
tqdm (for training Prior)

Name		Name	Last commit message	Last commit date
Latest commit History 73 Commits
Vizard		Vizard
__pycache__		__pycache__
data		data
images		images
licenses		licenses
.gitignore		.gitignore
Distribution_learning_benchmarks.ipynb		Distribution_learning_benchmarks.ipynb
Focused_learning_experiments.ipynb		Focused_learning_experiments.ipynb
MMP12_experiments.ipynb		MMP12_experiments.ipynb
Minimal_working_example.ipynb		Minimal_working_example.ipynb
README.md		README.md
data_structs.py		data_structs.py
fpscores.pkl.gz		fpscores.pkl.gz
hill_climbing.py		hill_climbing.py
main.py		main.py
model.py		model.py
multiprocess.py		multiprocess.py
scaffold_constrained_model.py		scaffold_constrained_model.py
scoring_functions.py		scoring_functions.py
train_agent.py		train_agent.py
train_prior.py		train_prior.py
utils.py		utils.py
vizard_logger.py		vizard_logger.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Scaffold constrained generation

Reproducing the results of the paper

Why build on REINVENT's codebase?

Requirements

About

Releases

Packages

Languages

maxime-langevin/scaffold-constrained-generation

Folders and files

Latest commit

History

Repository files navigation

Scaffold constrained generation

Reproducing the results of the paper

Why build on REINVENT's codebase?

Requirements

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages