This algorithm is part of the vantage6 solution. Vantage6 allows computations to be executed on federated datasets. This repository provides a boilerplate for new algorithms.
First clone the repository.
# Clone this repository
git clone https://github.com/jaspersnel/v6-boilerplate-py
Rename the contained v6_boilerplate_py directory to something that fits your algorithm; we use the convention v6_{name}_{language}. Then you can edit the following files:
In the Dockerfile, update ARG PKG_NAME=... to the name of your algorithm (preferably the same as the directory name).
Determine which license suits your project.
For the Docker image to find the methods, the algorithm needs to be installable. Make sure the name matches the ARG PKG_NAME in the Dockerfile. This is also where any additional packages to install into the resulting image can be specified.
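A mismatch between ARG PKG_NAME and the package directory is easy to overlook after renaming. The sketch below is a small, hypothetical helper (not part of the boilerplate) that reads the value from the Dockerfile text and compares it with a directory name. The function names and the regular expression are assumptions made for illustration.

```python
import re
from typing import Optional

# Matches a Dockerfile line such as `ARG PKG_NAME=v6_boilerplate_py`,
# with or without quotes around the value.
_PKG_NAME_RE = re.compile(
    r"^\s*ARG\s+PKG_NAME\s*=\s*[\"']?([^\"'\s]+)[\"']?\s*$",
    re.MULTILINE,
)


def extract_pkg_name(dockerfile_text: str) -> Optional[str]:
    """Return the value of ARG PKG_NAME in the Dockerfile text, or None."""
    match = _PKG_NAME_RE.search(dockerfile_text)
    return match.group(1) if match else None


def names_match(dockerfile_text: str, package_dir: str) -> bool:
    """True when ARG PKG_NAME equals the name of the package directory."""
    return extract_pkg_name(dockerfile_text) == package_dir
```

In practice the Dockerfile text could be read with pathlib's Path("Dockerfile").read_text() and compared against the renamed directory, for example a hypothetical v6_average_py.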
The actual algorithms are written in {algorithm_name}/methods.py. Beforehand, it is useful to install the necessary vantage6 packages (including pandas, which is used in the example here and is likely to appear in most algorithms):
# (OPTIONAL) set up a virtual environment and activate it
python3 -m venv venv
source venv/bin/activate
# Install the requirements
pip install -r requirements.txt
methods.py contains all the methods that can be called at the nodes. All regular definitions in this file that have the prefix RPC_ are callable by an external party. If you define a master method, it should not have this prefix. Master and regular definitions each have their own signature: master definitions have a client and a data argument (and possibly some other arguments), while regular definitions have the data argument but no client argument. The data argument is a pandas DataFrame and the client argument is a ClientContainerProtocol or ClientMockProtocol from the vantage6-toolkit. Examples of the master and regular methods are provided in the methods.py file; their signatures should look like this:
def some_master_name(client, data, *args, **kwargs):
    # do something
    pass

def RPC_some_regular_method(data, *args, **kwargs):
    # do something
    pass
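To make the two roles more concrete, here is a minimal sketch of a federated average. It is an illustration only and not the example shipped in methods.py. RPC_partial_sum is a regular method that returns aggregates for one node, and combine_partials (a hypothetical helper name) is the final step a master method could perform once it has collected the per-node results through its client. The column argument and the result keys are assumptions.

```python
def RPC_partial_sum(data, column):
    """Regular method: runs at a node and sees only that node's data.

    `data` is expected to be a pandas DataFrame. Only the aggregate
    values are returned, not the rows themselves.
    """
    values = data[column].dropna()  # ignore missing observations
    return {"sum": float(values.sum()), "count": int(values.count())}


def combine_partials(partials):
    """Combine per-node results into one global average.

    `partials` is a list of dicts as returned by RPC_partial_sum.
    """
    total = sum(p["sum"] for p in partials)
    count = sum(p["count"] for p in partials)
    if count == 0:
        raise ValueError("no observations in any partial result")
    return {"average": total / count}
```

Sending only sums and counts keeps individual records at the node, which is the point of splitting the work between regular and master methods.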
After writing a new algorithm, it is useful to test it. The test.py script is provided for this purpose. Edit line 3 of this file to point to two different CSV files and change the algorithm name. This serves as a preliminary test of the algorithm.
python test.py
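The test needs two different CSV files. If no suitable data is at hand, toy files can be generated first. The sketch below uses only the standard library. The helper name and the column names (age, weight) are made up for illustration and should be replaced by whatever columns your algorithm expects.

```python
import csv
import random
from pathlib import Path


def write_toy_csv(path, n_rows, seed):
    """Write a small CSV with made-up `age` and `weight` columns."""
    rng = random.Random(seed)  # seeded so the file is reproducible
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["age", "weight"])
        for _ in range(n_rows):
            age = rng.randint(18, 90)
            weight = round(rng.uniform(50.0, 110.0), 1)
            writer.writerow([age, weight])
    return Path(path)
```

Calling the helper twice with different paths, row counts and seeds gives two distinct files that test.py can be pointed at.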
If everything has been entered correctly in the setup stage, you should only have to build the image and push it to Docker Hub to be able to use it:
docker build -t your_username/algorithm_name .
docker push your_username/algorithm_name
At this stage much depends on how the infrastructure has been set up: the correct URLs have to be entered and the login details need to be correct. As an example, the run.py file has been provided. It was taken from this repository, which also contains an example infrastructure that can be used to simulate a real-world scenario (it also works with the example methods.py file provided). The details in run.py are set up for this sample infrastructure and have to be changed for a production environment. The main changes that will have to be made are the image name and any (kw)args. Then simply run the file using:
python run.py
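To see how the (kw)args chosen in run.py relate to the method signatures, the following sketch mimics the dispatch step in plain Python. It is not the actual vantage6 wrapper code. The call_method helper and the input shape (keys method, master, args, kwargs) are assumptions made for illustration.

```python
def call_method(namespace, input_, data, client=None):
    """Resolve a task input to a function in `namespace` and call it.

    Assumed input shape (an illustration, not taken from run.py):
    {"method": str, "master": bool, "args": list, "kwargs": dict}
    """
    name = input_["method"]
    args = input_.get("args", [])
    kwargs = input_.get("kwargs", {})
    if input_.get("master", False):
        # Master methods carry no prefix and receive the client first.
        return namespace[name](client, data, *args, **kwargs)
    # Regular methods are looked up under the RPC_ prefix.
    return namespace["RPC_" + name](data, *args, **kwargs)
```

Under these assumptions, an args or kwargs entry that does not match the signature of the target method would only fail when the method is called, so it is worth checking them against methods.py before submitting a task.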
See the documentation for detailed instructions on how to install and use the server and nodes.