Updates to CLI and deployment docs. Closes #1517 #1613

Merged: 18 commits merged on Apr 9, 2020
45 changes: 0 additions & 45 deletions docs/deployment/dockerized.rst

This file was deleted.

28 changes: 1 addition & 27 deletions docs/deployment/native.rst
@@ -55,17 +55,7 @@ By default, DeepForge will start on `http://localhost:8888`. However, the port c

Worker
~~~~~~
The DeepForge worker can be started with

.. code-block:: bash

deepforge start --worker

To connect to a remote deepforge instance, add the url of the DeepForge server:

.. code-block:: bash

deepforge start --worker http://myaddress.com:1234
The DeepForge worker (used with WebGME compute) enables users to connect their own machines for any required computation. It can be installed from `https://github.com/deepforge-dev/worker`. It is recommended to install `Conda <https://conda.io/en/latest/>`_ on the worker machine so any dependencies can be installed automatically.

Updating
~~~~~~~~
@@ -109,22 +99,6 @@ and navigate to `http://localhost:8888` to start using DeepForge!

Alternatively, if jobs are going to be executed on an external worker, run `./bin/deepforge start -s` locally and navigate to `http://localhost:8888`.

DeepForge Worker
~~~~~~~~~~~~~~~~
If you are using `./bin/deepforge start -s` you will need to set up a DeepForge worker (`./bin/deepforge start` starts a local worker for you!). DeepForge workers are slave machines connected to DeepForge which execute the provided jobs. This allows the jobs to access the GPU, etc., and provides a number of benefits over trying to perform deep learning tasks in the browser.

Once DeepForge is installed on the worker, start it with

.. code-block:: bash

./bin/deepforge start -w

Note: If you are running the worker on a different machine, put the address of the DeepForge server as an argument to the command. For example:

.. code-block:: bash

./bin/deepforge start -w http://myaddress.com:1234

Updating
~~~~~~~~
Updating can be done the same as any other git project; that is, by running `git pull` from the project root. Sometimes, the dependencies need to be updated so it is recommended to run `npm install` following `git pull`.
11 changes: 4 additions & 7 deletions docs/deployment/overview.rst
@@ -5,20 +5,17 @@ DeepForge Component Overview
----------------------------
DeepForge is composed of four main elements:

- *Server*: Main component hosting all the project information and is connected to by the clients.
- *Database*: MongoDB database containing DeepForge, job queue for the workers, etc.
- *Worker*: Slave machine performing the actual machine learning computation.
- *Client*: The connected browsers working on DeepForge projects.

Of course, only the *Server*, *Database* (MongoDB) and *Worker* need to be installed. If you are not going to execute any machine learning pipelines, installing the *Worker* can be skipped.
- *Server*: Main component hosting all the project information and is connected to by the clients.
- *Compute*: Connected computational resources used for executing pipelines.
- *Storage*: Connected storage resources used for storing project data artifacts such as datasets or trained model weights.

Component Dependencies
----------------------
The following dependencies are required for each component:

- *Server* (NodeJS v8.11.3)
- *Server* (NodeJS LTS)
- *Database* (MongoDB v3.0.7)
- *Worker*: NodeJS v8.11.3 (used for job management logic) and Python 3. If you are using the deepforge-keras extension, you will also need Keras and `TensorFlow <https://tensorflow.org>`_ installed.
- *Client*: We recommend using Google Chrome and do not support other browsers (for now). In other words, other browsers can be used at your own risk.

Configuration
19 changes: 19 additions & 0 deletions docs/deployment/quick_start.rst
@@ -0,0 +1,19 @@
Quick Start
===========
The recommended (and easiest) way to get started with DeepForge is using docker-compose. First, install `docker <https://docs.docker.com/engine/installation/>`_ and `docker-compose <https://docs.docker.com/compose/install/>`_.

Next, download the docker-compose file for DeepForge:

.. code-block:: bash

wget https://raw.githubusercontent.com/deepforge-dev/deepforge/master/docker/docker-compose.yml

Then start DeepForge using docker-compose:

.. code-block:: bash

docker-compose up

and now DeepForge can be used by opening a browser to `http://localhost:8888 <http://localhost:8888>`_!

For detailed instructions about deployment installations, check out our `deployment installation instructions <../getting_started/configuration.rst>`_. An example of customizing a deployment using docker-compose can be found `here <https://github.com/deepforge-dev/deepforge/tree/master/.deployment>`_.
133 changes: 96 additions & 37 deletions docs/fundamentals/custom_operations.rst
@@ -9,68 +9,127 @@ Operations are used in pipelines and have named inputs and outputs. When creatin

.. figure:: operation_editor.png
:align: center
:scale: 45 %

Editing the "Train" operation from the "CIFAR10" example
Editing the "TrainValidate" operation from the "redshift" example

The interface editor is provided on the left and presents the interface as a diagram showing the input data and output data as objects flowing into or out of the given operation. Selecting the operation node in the operation interface editor will expand the node and allow the user to add or edit attributes for the given operation. These attributes are exposed when using this operation in a pipeline and can be set at design time - that is, these are set when creating the given pipeline. The interface diagram may also contain light blue nodes flowing into the operation. These nodes represent "references" that the operation accepts as input before running. When using the operation, references will appear alongside the attributes but will allow the user to select from a list of all possible targets when clicked.
The interface editor is provided on the right and presents the interface as a diagram showing the input data and output data as objects flowing into or out of the given operation. Selecting the operation node in the operation interface editor will expand the node and allow the user to add or edit attributes for the given operation. These attributes are exposed when using this operation in a pipeline and can be set at design time - that is, these are set when creating the given pipeline. The interface diagram may also contain light blue nodes flowing into the operation. These nodes represent "references" that the operation accepts as input before running. When using the operation, references will appear alongside the attributes but will allow the user to select from a list of all possible targets when clicked.

.. figure:: operation_interface.png
:align: center
:scale: 85 %

The train operation accepts training data, a model and attributes for shuffling data, setting the batch size, and the number of epochs.
The TrainValidate operation accepts training data, a model and attributes for setting the batch size, and the number of epochs.

On the right of the operation editor is the implementation editor. The implementation editor is a code editor specially tailored for programming the implementations of operations in DeepForge. It also is synchronized with the interface editor. A section of the implementation is shown below:
The operation editor also provides an interface to specify operation Python dependencies. DeepForge uses
:code:`conda` to manage Python dependencies for an operation. This pairs well with the integration of the various compute platforms available to the user; the only requirement is to have Conda installed on the compute platform. You can specify operation dependencies using a conda environment `file <https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#create-env-file-manually>`_ as shown in the diagram below:


.. figure:: operation_environment.png
:align: center

The operation environment contains python dependencies for the given operation.

To the left of the operation editor is the implementation editor. The implementation editor is a code editor specially tailored for programming the implementations of operations in DeepForge. It also is synchronized with the interface editor. A section of the implementation is shown below:

.. code:: python

    import numpy as np
    from sklearn.model_selection import train_test_split
    import keras
    import time
    from matplotlib import pyplot as plt
    import tensorflow as tf

    class Train():
        def __init__(self, model, shuffle=True, epochs=100, batch_size=32):
            self.model = model
            self.epochs = epochs
            self.shuffle = shuffle
            self.batch_size = batch_size
            return

        def execute(self, training_data):
            (x_train, y_train) = training_data
            opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)
            self.model.compile(loss='categorical_crossentropy',
                               optimizer=opt,
                               metrics=['accuracy'])
            plot_losses = PlotLosses()
            self.model.fit(x_train, y_train,
                           self.batch_size,
                           epochs=self.epochs,
                           callbacks=[plot_losses],
                           shuffle=self.shuffle)
            model = self.model
            return model

    config = tf.compat.v1.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.compat.v1.Session(config=config)

The "Train" operation uses capabilities from the :code:`keras` package to train the neural network. This operation sets all the parameters using values provided to the operation as either attributes or references. In the implementation, attributes are provided as arguments to the constructor making the user defined attributes accessible from within the implementation. References are treated similarly to operation inputs and are also arguments to the constructor. This can be seen with the :code:`model` constructor argument. Finally, operations return their outputs in the :code:`execute` method; in this example, it returns a single output named :code:`model`, that is, the trained neural network.
    class TrainValidate():
        def __init__(self, model, epochs=10, batch_size=32):
            self.model = model
            self.batch_size = batch_size
            self.epochs = epochs
            np.random.seed(32)
            return

After defining the interface and implementation, we can now use the "Train" operation in our pipelines! An example is shown below.
        def execute(self, dataset):
            model = self.model
            model.summary()
            model.compile(optimizer='adam',
                          loss='sparse_categorical_crossentropy',
                          metrics=['sparse_categorical_accuracy'])
            X = dataset['X']
            y = dataset['y']
            y_cats = self.to_categorical(y)
            model.fit(X, y_cats,
                      epochs=self.epochs,
                      batch_size=self.batch_size,
                      validation_split=0.15,
                      callbacks=[PlotLosses()])
            return model.get_weights()

        def to_categorical(self, y, max_y=0.4, num_possible_classes=32):
            one_step = max_y / num_possible_classes
            y_cats = []
            for values in y:
                y_cats.append(int(values[0] / one_step))
            return y_cats

        def datagen(self, X, y):
            # Generates a batch of data
            X1, y1 = list(), list()
            n = 0
            while 1:
                for sample, label in zip(X, y):
                    n += 1
                    X1.append(sample)
                    y1.append(label)
                    if n == self.batch_size:
                        yield [[np.array(X1)], y1]
                        n = 0
                        X1, y1 = list(), list()


    class PlotLosses(keras.callbacks.Callback):
        def on_train_begin(self, logs={}):
            self.i = 0
            self.x = []
            self.losses = []

        def on_epoch_end(self, epoch, logs={}):
            self.x.append(self.i)
            self.losses.append(logs.get('loss'))
            self.i += 1
            self.update()

        def update(self):
            plt.clf()
            plt.title("Training Loss")
            plt.ylabel("CrossEntropy Loss")
            plt.xlabel("Epochs")
            plt.plot(self.x, self.losses, label="loss")
            plt.legend()
            plt.show()

The "TrainValidate" operation uses capabilities from the :code:`keras` package to train the neural network. This operation sets all the parameters using values provided to the operation as either attributes or references. In the implementation, attributes are provided as arguments to the constructor making the user defined attributes accessible from within the implementation. References are treated similarly to operation inputs and are also arguments to the constructor. This can be seen with the :code:`model` constructor argument. Finally, operations return their outputs in the :code:`execute` method; in this example, it returns a single output named :code:`model`, that is, the trained neural network.

After defining the interface and implementation, we can now use the "TrainValidate" operation in our pipelines! An example is shown below.

.. figure:: train_operation.png
:align: center
:scale: 85 %

Using the "Train" operation in a pipeline
Using the "TrainValidate" operation in a pipeline

Operation feedback
Operation Feedback
------------------
Operations in DeepForge can generate metadata about its execution. This metadata is generated during the execution and provided back to the user in real-time. An example of this includes providing real-time plotting feedback. When implementing an operation in DeepForge, this metadata can be created using the :code:`matplotlib` plotting capabilities.
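
As an example of this idea, the following minimal sketch (a hypothetical operation, not taken from the example above) produces a matplotlib figure from within :code:`execute`; much like the :code:`PlotLosses` callback shown earlier, the generated plot is surfaced to the user as operation feedback.

.. code:: python

    from matplotlib import pyplot as plt

    class ReportLoss():
        def execute(self, losses):
            # plotting inside an operation produces metadata that is shown to the user in real time
            plt.plot(range(len(losses)), losses, label='loss')
            plt.xlabel('Epoch')
            plt.ylabel('Loss')
            plt.legend()
            plt.show()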

.. figure:: graph_example.png
.. figure:: plotloss.png
:align: center
:scale: 75 %

An example graph of the loss function while training a neural network

Detailed information about the available operation metadata types can be found in the `reference <reference/feedback_mechanisms.rst>`_.
An example graph of the loss function while training a neural network.
25 changes: 25 additions & 0 deletions docs/fundamentals/integration.rst
@@ -0,0 +1,25 @@
Storage and Compute Adapters
============================
DeepForge is designed to integrate with existing computational and storage resources and is not intended to be a competitor to existing HPC or object storage frameworks.
This integration is made possible through the use of compute and storage adapters. This section provides a brief description of these adapters as well as currently supported integrations.

Storage Adapters
----------------
Projects in DeepForge may contain artifacts which reference datasets, trained model weights, or other associated binary data. Although the project code, pipelines, and models are stored in MongoDB, this associated data is stored using a storage adapter. Storage adapters enable DeepForge to store this associated data using an appropriate storage resource, such as an object store with an S3-compatible API.
This also enables users to "bring their own storage" as they can connect their existing cyberinfrastructure to a public deployment of DeepForge.
Currently, DeepForge supports 3 different storage adapters:

1. S3 Storage: Object storage with an S3-compatible API such as `minio <https://play.min.io>`_ or `AWS S3 <https://aws.amazon.com/s3/>`_
2. SciServer Files Service: Files service from `SciServer <https://sciserver.org>`_
3. WebGME Blob Server: Blob storage provided by `WebGME <https://webgme.org/>`_
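
As a purely conceptual sketch (the class and method names below are hypothetical and not DeepForge's actual adapter API), a storage adapter can be thought of as a small object that knows how to read and write artifact data against one particular backend:

.. code:: python

    from abc import ABC, abstractmethod

    class StorageBackend(ABC):
        """Illustrative only: each backend (S3, SciServer Files, WebGME blob) provides its own implementation."""

        @abstractmethod
        def put_file(self, path, data):
            """Store binary artifact data under the given path."""

        @abstractmethod
        def get_file(self, path):
            """Retrieve the binary artifact data stored under the given path."""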

Compute Adapters
----------------
Similar to storage adapters, compute adapters enable DeepForge to integrate with existing cyberinfrastructure used for executing some computation or workflow. This is designed to allow users to leverage their existing HPC or other computational resources with DeepForge. Compute adapters provide an interface through which DeepForge is able to execute workflows (e.g., training a neural network) on external machines.

Currently, the following compute adapters are available:

1. WebGME Worker: A worker machine which polls for jobs via the `WebGME Executor Framework <https://github.com/webgme/webgme/wiki/GME-Executor-Framework>`_. Registered users can connect their own compute machines enabling them to use their personal desktops with DeepForge.
2. SciServer-Compute: Compute service offered by `SciServer <https://sciserver.org>`_
3. Server Compute: Execute the job on the server machine. This is similar to the execution model used by Jupyter notebook servers.

Binary file modified docs/fundamentals/operation_editor.png
Binary file added docs/fundamentals/operation_environment.png
Binary file modified docs/fundamentals/operation_interface.png
Binary file added docs/fundamentals/plotloss.png
Binary file modified docs/fundamentals/train_operation.png