This repository contains the infrastructure needed to add and update docker and singularity images for models and model groups in the Kipoi model zoo. Each image comes with a pre-activated, compatible conda environment in which all model (group) specific dependencies are installed.
Example usage of kipoi
kipoi env create Basset
source activate kipoi-Basset
kipoi predict Basset \
--dataloader_args='{"intervals_file": "example/intervals_file", "fasta_file": "example/fasta_file"}' \
-o 'Basset.example_pred.tsv'
Kipoi uses conda for creating model specific environments.
- It is impossible to guarantee that kipoi env create Basset resolves on every operating system, since conda is not operating system agnostic.
- It is cumbersome, labor intensive and error prone to pin all model dependencies across the 31 (and counting) model groups in Kipoi.
- Even if the dependencies resolve now, there is no guarantee that they will continue to resolve in the future, since the universe of python dependencies is ever changing.
Software containers were invented to handle exactly these problems by making a snapshot of a working system. We provide both docker and singularity containers so that they are as widely usable as possible while remaining friendly to high performance computing clusters.
kipoi predict Basset \
--dataloader_args='{"intervals_file": "example/intervals_file", "fasta_file": "example/fasta_file"}' \
-o 'Basset.example_pred.tsv' \
--singularity
Note: There is no need to create a separate environment as the container comes pre-installed with the model specific conda environment.
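When these calls are scripted, the quoting of --dataloader_args is the easiest part to get wrong. The following is a minimal sketch, assuming a hypothetical helper build_predict_command (it is not part of kipoi), that assembles the same command line as above and leaves the JSON encoding to json.dumps:

```python
import json
import shlex


def build_predict_command(model, dataloader_args, output, singularity=False):
    """Assemble the argument list for `kipoi predict` (hypothetical helper)."""
    cmd = [
        "kipoi", "predict", model,
        # json.dumps emits double-quoted keys and values, i.e. valid JSON
        "--dataloader_args=" + json.dumps(dataloader_args),
        "-o", output,
    ]
    if singularity:
        # Same flag as in the example above: run inside the singularity container
        cmd.append("--singularity")
    return cmd


cmd = build_predict_command(
    "Basset",
    {"intervals_file": "example/intervals_file", "fasta_file": "example/fasta_file"},
    "Basset.example_pred.tsv",
    singularity=True,
)
# shlex.join quotes each argument so the printed line can be pasted into a shell
print(shlex.join(cmd))
```

The returned list can also be handed to subprocess.run(cmd, check=True) directly, which avoids shell quoting altogether.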
# Mount the local app/ folder at /app/ inside the container, then run the prediction there
docker run -v $PWD/app/:/app/ kipoi/kipoi-docker:sharedpy3keras2tf2 \
kipoi predict Basset \
--dataloader_args='{"intervals_file": "/app/intervals.bed", "fasta_file": "/app/ref.fa"}' \
-o '/app/Basset.example_pred.tsv'
- Docker images are hosted on dockerhub.
- Singularity images are hosted on zenodo.
- Model specific docker and singularity image information and example usage are located under the docker and singularity tabs of each model's webpage on the Kipoi website.
- python>=3.9
- Install docker
- Install singularity
- Install kipoi_containers using pip install -e .
- DOCKER_USERNAME, DOCKER_PASSWORD
  - Only required for pushing images to kipoi/kipoi-docker
  - Obtained by creating a dockerhub account
- ZENODO_ACCESS_TOKEN
  - Required for updating and pushing singularity images to zenodo using its REST API
  - Create a personal access token in your zenodo account settings. Make sure to check deposit:actions and deposit:write
- GITHUB_TOKEN
  - Required for syncing with the Kipoi model zoo
  - Create a personal access token in your github account settings. Make sure to add both read and write access
- SINGULARITY_PULL_FOLDER (Optional)
  - If specified, singularity images will be downloaded to, built in and pushed from this folder. Otherwise, the current working directory is used.
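Since a missing variable typically only surfaces halfway through a push, it can help to check the environment up front. A minimal sketch of such a check follows; the task names and both helper functions are hypothetical and only group the variables listed above:

```python
import os

# Variables named in the list above, grouped by the task that needs them.
# The task names are illustrative and do not come from kipoi_containers.
REQUIRED_BY_TASK = {
    "push_docker": ["DOCKER_USERNAME", "DOCKER_PASSWORD"],
    "push_singularity": ["ZENODO_ACCESS_TOKEN"],
    "sync_model_repo": ["GITHUB_TOKEN"],
}


def missing_variables(tasks, environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    missing = []
    for task in tasks:
        for name in REQUIRED_BY_TASK[task]:
            if not environ.get(name):
                missing.append(name)
    return missing


def singularity_pull_folder(environ=None):
    """Fall back to the current working directory when the variable is unset."""
    environ = os.environ if environ is None else environ
    return environ.get("SINGULARITY_PULL_FOLDER") or os.getcwd()


if __name__ == "__main__":
    print(missing_variables(["push_docker", "push_singularity", "sync_model_repo"]))
    print(singularity_pull_folder())
```

Passing a plain dict as environ keeps the check easy to exercise without touching the real environment.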
- Docker
  - This maps models (groups) to docker images. Each value refers to a dockerhub image which can be pulled using the docker cli/api.
- Singularity
  - Each entry has three keys
    - url: A globally accessible url for the image
    - name: Name of the image without any extension
    - md5: An md5 checksum used to ensure integrity during download
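The md5 key lets a downloaded image be checked before it is used. A minimal sketch of that check using only the standard library; the entry below is a placeholder with the three keys described above, not a real image record, and the helper names are hypothetical:

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_intact(path, entry):
    """Compare a downloaded image against the md5 key of its entry."""
    return md5_of_file(path) == entry["md5"]


# Stand-in file and entry: the url and name are placeholders, not real images.
with tempfile.NamedTemporaryFile(delete=False, suffix=".sif") as tmp:
    tmp.write(b"not a real singularity image")
    image_path = tmp.name

entry = {
    "url": "https://example.org/placeholder.sif",
    "name": "placeholder",
    "md5": md5_of_file(image_path),
}
print(image_is_intact(image_path, entry))  # True, since md5 was computed from this file
os.remove(image_path)
```

Reading in chunks matters here because container images are often several gigabytes in size.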
As models get added and updated in the model repo, the respective docker and singularity containers should also be added and updated, along with the various json files in kipoi_containers/container-info and the github workflows in .github/workflows. Run the update as follows:
python kipoi_containers/updateoradd.py
If everything is successful, kipoi_containers/kipoi-model-repo-hash will be updated to the most recent commit on the master branch of the model repo.
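The update-or-add decision can be pictured as a comparison between the model groups that changed in the model repo and the groups that already have a container. The following is a minimal sketch under that assumption; plan_container_updates and its inputs are hypothetical and are not taken from kipoi_containers/updateoradd.py:

```python
def plan_container_updates(changed_groups, groups_with_containers):
    """Split changed model groups into ones to update and ones to add.

    changed_groups: model groups touched since the stored commit hash.
    groups_with_containers: model groups that already map to an image.
    """
    known = set(groups_with_containers)
    to_update = sorted(g for g in changed_groups if g in known)
    to_add = sorted(g for g in changed_groups if g not in known)
    return {"update": to_update, "add": to_add}


plan = plan_container_updates(
    changed_groups=["Basset", "NewModelGroup"],
    groups_with_containers=["Basset", "HAL", "MMSplice"],
)
print(plan)  # {'update': ['Basset'], 'add': ['NewModelGroup']}
```

Groups in the "update" list would get their existing images rebuilt, while groups in the "add" list would additionally need new entries in the json files and workflows.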
docker pull kipoi/kipoi-docker:mmsplice
pytest test-docker test-singularity test-containers/test_update_all_singularity_images.py
Currently, there are two ways to test the docker and singularity images along with the models.
- Test one or more models (or model groups, if a group contains only one model) within their respective docker and singularity containers:
  pytest test-containers/test_models_from_command_line.py --model=KipoiSplice/4,Basenji
- Test a docker image, either against all compatible models or against a specific model group:
  - pytest test-containers/test_containers_from_command_line.py --image=kipoi/kipoi-docker:sharedpy3keras2tf1
  - pytest test-containers/test_containers_from_command_line.py --image=kipoi/kipoi-docker:sharedpy3keras2tf2 --modelgroup=HAL
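The --model option above takes a comma-separated list. A small sketch of how such a value can be split into model names and how a model group might be derived from a name follows. Treating the part before the first slash as the group is an assumption made here for illustration, and the test suite may resolve groups differently:

```python
import argparse


def parse_models(value):
    """Split 'KipoiSplice/4,Basenji' into ['KipoiSplice/4', 'Basenji']."""
    return [name.strip() for name in value.split(",") if name.strip()]


def model_group(model):
    """Assumed convention: the model group is the first path component."""
    return model.split("/", 1)[0]


parser = argparse.ArgumentParser()
parser.add_argument("--model", type=parse_models, default=[])
args = parser.parse_args(["--model=KipoiSplice/4,Basenji"])

for model in args.model:
    print(model, "->", model_group(model))
```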
There are three different workflows in .github/workflows, each of which serves a different purpose. The necessary secrets and workflows are described below.
For a quick howto on adding secrets to a repository, see the github documentation on encrypted secrets.
- DOCKERUSERNAME and DOCKERPASSWORD
  - Correspond to the values of the env variables DOCKER_USERNAME and DOCKER_PASSWORD respectively
- ZENODOACCESSTOKEN
  - Corresponds to the value of the env variable ZENODO_ACCESS_TOKEN
- GITHUBPAT
  - Corresponds to the value of the env variable GITHUB_TOKEN
- Continuous integration
  - Which
    - .github/workflows/test-images.yml
  - When
    - Push to any branch and pr to the main branch in this repo
  - Why
    - The kipoi_containers package is tested by this workflow
  - How
    - The package is built from scratch and the tests specified in the Tests section get executed. Additionally, one model from every model group gets tested within its docker and singularity containers.
- Sync with Kipoi model repo
  - Which
    - .github/workflows/sync-with-model-repo.yml
  - When
    - On demand and when a pull request is merged to the master branch of the model repo
  - Why
    - Keep the docker and singularity images up to date with the model definitions in the model repo
  - How
    - Update existing images on dockerhub and zenodo if the model definition has been updated
    - Add new images if a new model has been added to the model repo
    - Create a new branch in the model repo named target-json if it does not already exist
    - Update shared/containers/model-to-singularity.json in branch target-json of the model repo if a singularity image has been updated in zenodo
    - Update jsons in kipoi_containers/container-info/ and in shared/containers/ in branch target-json of the model repo in case a new model has been added
    - Update workflows in .github/workflows in case a new model has been added
    - Update kipoi_containers/kipoi-model-repo-hash
    - A pr is created automatically, which then needs to be reviewed and merged.
- Build, test and push all docker and singularity images
  - Which
    - .github/workflows/release-workflow.yml
  - When
    - On demand and when a new package of kipoi is released to pypi
  - Why
    - Re-build, test and push the docker and singularity images. Some example scenarios:
      - The kipoi pypi package has been updated
      - A new version has been released for continuumio/miniconda3:latest
  - How
    - Re-build, test and push the dockerhub images. The docker cli is used for this purpose.
    - A new version of the singularity image is built based on the new docker image. A new version of the existing deposition on zenodo is created and the modified image is uploaded there. Finally, the new deposition is pushed and a url is returned.
    - The new url is updated in shared/containers/model-to-singularity.json in branch target-json of the model repo
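The zenodo part of the release workflow (new version, upload, push) can be sketched as three HTTP requests. This is only an illustration: the endpoint paths follow zenodo's deposit REST API as documented at the time of writing and should be checked against the current documentation, the ids and token are placeholders, and the helper names are hypothetical rather than taken from kipoi_containers. The requests are only constructed here, not sent:

```python
import urllib.request

ZENODO_API = "https://zenodo.org/api"


def _auth(token):
    # zenodo accepts the access token as a Bearer token in the Authorization header
    return {"Authorization": f"Bearer {token}"}


def new_version_request(deposition_id, token):
    """Request that opens a new draft version of an existing deposition."""
    url = f"{ZENODO_API}/deposit/depositions/{deposition_id}/actions/newversion"
    return urllib.request.Request(url, method="POST", headers=_auth(token))


def upload_request(bucket_url, filename, payload, token):
    """Request that uploads the image bytes into the draft's file bucket."""
    url = f"{bucket_url}/{filename}"
    return urllib.request.Request(url, data=payload, method="PUT", headers=_auth(token))


def publish_request(draft_id, token):
    """Request that publishes the draft, after which its url becomes public."""
    url = f"{ZENODO_API}/deposit/depositions/{draft_id}/actions/publish"
    return urllib.request.Request(url, method="POST", headers=_auth(token))


# Placeholder values for illustration only
request = new_version_request("1234567", "placeholder-token")
print(request.get_method(), request.full_url)
```

Each request would be sent with urllib.request.urlopen. The draft id and the bucket url are not known in advance and have to be read from the JSON responses of the preceding calls.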