The BiGMeC pipeline makes a draft reconstruction of the metabolic pathway associated with a non-ribosomal peptide synthetase (NRPS) or polyketide synthase (PKS) biosynthetic gene cluster. The pipeline takes an individual region GenBank (.gbk) file produced by antiSMASH and produces a JSON file that can be readily incorporated into a genome-scale metabolic model (GEM) using available software such as cobrapy or COBRA Toolbox. The pipeline leverages the GEM of S. coelicolor (Sco-GEM) as a database for reactions and metabolites. Read more about BiGMeC in the research paper published in BMC Bioinformatics, February 2021.
- Python 3 (>=3.5).
- conda or virtualenv, if you want to run the pipeline in a virtual environment (recommended).
- pip package manager. Necessary to install the required Python packages.
The BiGMeC pipeline doesn't require any installation: simply clone the repository, install a few Python packages, and run the program.
In your command line interface, change to your preferred directory and run:
git clone https://github.com/AlmaasLab/BiGMeC.git
It is recommended to run the pipeline in a virtual environment. You can create this environment with either virtualenv or conda, but here we only show how to do it with conda. From the BiGMeC repository, create the environment by running:
conda create -n bigmec python=3.6
Activate the new environment:
conda activate bigmec
Finally, use pip to install the required packages listed in requirements.txt:
pip install -r requirements.txt
To test the pipeline you can simply run:
python bigmec.py
This will run the pipeline for the MIBiG gene cluster BGC0000001 using the corresponding antiSMASH result file Data/mibig/1.gbk. The result will be stored in Data/constructed_pathways/1.json.
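To take a quick look at a constructed pathway without loading a full model, the JSON file can be read with the Python standard library. The sketch below is only an illustration and not part of BiGMeC: the helper name `summarize_pathway` is hypothetical, and it assumes the output follows the cobrapy JSON model layout, with top-level `reactions` and `metabolites` lists whose entries carry an `id` field.

```python
import json
from pathlib import Path


def summarize_pathway(json_path):
    """Return (number of reactions, number of metabolites, reaction ids).

    Assumption: the file uses the cobrapy JSON model layout, i.e. top-level
    "reactions" and "metabolites" lists of objects with an "id" key.
    Missing keys are treated as empty lists.
    """
    with open(json_path) as handle:
        data = json.load(handle)
    reactions = data.get("reactions", [])
    metabolites = data.get("metabolites", [])
    reaction_ids = [reaction.get("id") for reaction in reactions]
    return len(reactions), len(metabolites), reaction_ids


if __name__ == "__main__":
    # Path of the test-run result mentioned above.
    result = Path("Data/constructed_pathways/1.json")
    if result.exists():
        n_reactions, n_metabolites, ids = summarize_pathway(result)
        print(n_reactions, "reactions,", n_metabolites, "metabolites")
        print("Reaction ids:", ", ".join(str(i) for i in ids))
```

If the file layout differs from this assumption, the function simply reports zero reactions and metabolites rather than failing.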
To use your own antiSMASH result and store the results in a specified folder, run:
python bigmec.py -f antismash_data_filename_or_folder -o output_folder
Note that antiSMASH provides both complete and region-specific GenBank files; the region-specific files are the ones that should be used as input for the BiGMeC pipeline. Further information is provided by running:
python bigmec.py -h
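When an antiSMASH output folder contains both kinds of GenBank files, it can help to pick out the region-specific ones before running the pipeline. The following is a minimal sketch, not part of BiGMeC. It assumes the naming used by recent antiSMASH versions, where region files end in `.region` followed by digits and `.gbk` (for example `record.region001.gbk`); the helper name `find_region_files` is hypothetical.

```python
import re
from pathlib import Path

# Assumption: region-specific files end in ".region<digits>.gbk".
# Files named differently (such as the bundled Data/mibig/1.gbk) do not match.
REGION_PATTERN = re.compile(r"\.region\d+\.gbk$")


def find_region_files(folder):
    """Return the region-specific GenBank files directly inside `folder`, sorted."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and REGION_PATTERN.search(path.name)
    )
```

The selected files could then be copied to a separate folder and passed to `-f`. Because the match relies only on file names, treat it as a heuristic and check the result by eye.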
The constructed pathways can be added to a GEM immediately by using the --add-to-model option, specifying the path to the GEM (in SBML format) that the pathway should be appended to. Note that the model must use the BiGG namespace for this to work. For example, if you want to add one pathway (or all pathways in the folder) to the E. coli model iML1515, you can do so by running:
python bigmec.py -f antismash_data_filename_or_folder -o output_folder --add-to-model path/to/iML1515.xml
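For scripted or repeated runs, the same command line can be assembled from Python. This is a hedged sketch rather than an interface provided by BiGMeC: the function names `build_bigmec_command` and `run_bigmec` are hypothetical, and only the `-f`, `-o` and `--add-to-model` options described above are used.

```python
import subprocess
import sys


def build_bigmec_command(input_path, output_folder, model_path=None):
    """Build the argument list for one bigmec.py run."""
    # sys.executable keeps the run inside the currently active environment.
    command = [sys.executable, "bigmec.py", "-f", str(input_path), "-o", str(output_folder)]
    if model_path is not None:
        # Optional: append the constructed pathway(s) to an SBML model.
        command += ["--add-to-model", str(model_path)]
    return command


def run_bigmec(input_path, output_folder, model_path=None, repo_dir="."):
    """Run bigmec.py from the repository folder; raises CalledProcessError on failure."""
    command = build_bigmec_command(input_path, output_folder, model_path)
    return subprocess.run(command, cwd=repo_dir, check=True)
```

For example, `run_bigmec("antismash_data_filename_or_folder", "output_folder", "path/to/iML1515.xml")` corresponds to the command shown above, provided `repo_dir` points to the cloned BiGMeC repository.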
The current version of BiGMeC uses the S. coelicolor model Sco-GEM as a library of metabolites and reactions. The pipeline can in principle use another GEM as the reference, but this feature has not yet been properly tested. Note that you don't need a different reference model to construct pathways for use with a GEM other than Sco-GEM.
- Snorre Sulheim (@sulheim), SINTEF Industry, Norway / Norwegian University of Science and Technology, Norway
- Fredrik Aunaas Fossheim (@FredrikFossheim), Norwegian University of Science and Technology, Norway
Contributions are very welcome, either by raising issues or through pull requests.
If you use the BiGMeC software, please cite us:
Sulheim, S., Fossheim, F.A., Wentzel, A. & Almaas, E. Automatic reconstruction of metabolic pathways from identified biosynthetic gene clusters. BMC Bioinformatics 22, 81 (2021). https://doi.org/10.1186/s12859-021-03985-0