DARTS documentation (#1180)

* README for DARTS
* Fix docs
kubeflow · May 8, 2020 · 99de8f8 · 99de8f8
1 parent 581562a
commit 99de8f8
Showing 2 changed files with 87 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -97,6 +97,7 @@ Currently Katib supports the following exploration algorithms:
 #### Neural Architecture Search
 
 * [Efficient Neural Architecture Search (ENAS)](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1alpha3/nas/enas)
+* [Differentiable Architecture Search (DARTS)](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1alpha3/nas/darts)
 
 
 ## Components in Katib

diff --git a/pkg/suggestion/v1alpha3/nas/darts/README.md b/pkg/suggestion/v1alpha3/nas/darts/README.md
@@ -0,0 +1,86 @@
+# About the Differentiable Architecture Search
+
+The algorithm follows the idea proposed in _DARTS: Differentiable Architecture Search_ by Hanxiao Liu, Karen Simonyan, Yiming Yang (https://arxiv.org/abs/1806.09055).
+
+The implementation is based on the [official GitHub implementation](https://github.com/quark0/darts) and a [popular repository](https://github.com/khanrc/pt.darts).
+
+The algorithm addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. It is based on continuous relaxation and gradient descent in the search space. It is able to efficiently design high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2).
+
+## Katib implementation
+
+To support DARTS within Katib's current functionality, the implementation works as follows:
+
+1. The DARTS Suggestion service creates a set of primitive operations from the Experiment search space. For example: `['separable_convolution_3x3', 'dilated_convolution_3x3', 'dilated_convolution_5x5', 'avg_pooling_3x3', 'max_pooling_3x3', 'skip_connection']`.
+
+2. The Suggestion service returns the algorithm settings, the number of layers, and the set of primitives to the Katib controller.
+
+3. The Katib controller starts the training container with the appropriate settings and all possible operations.
+
+4. The training container runs the DARTS algorithm.
+
+5. The metrics collector saves the Best Genotype from the training container log.
+
+You can find an example Experiment [here](https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/nas/darts-example-gpu.yaml).
+The source code of the DARTS Suggestion service is [here](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1alpha3/nas/darts), and the DARTS training container implementation is [here](https://github.com/kubeflow/katib/tree/master/examples/v1alpha3/nas/darts-cnn-cifar10).
+
+### Best Genotype representation
+
+The Best Genotype is the best cell found for each type of neural network layer. The cells are generated by the DARTS algorithm.
+Here is an example of a Best Genotype:
+
+```
+Genotype(
+  normal=[
+    [('max_pooling_3x3',0),('max_pooling_3x3',1)],
+    [('max_pooling_3x3',0),('max_pooling_3x3',1)],
+    [('max_pooling_3x3',0),('dilated_convolution_3x3',3)],
+    [('max_pooling_3x3',0),('max_pooling_3x3',1)]
+  ],
+  normal_concat=range(2,6),
+  reduce=[
+    [('dilated_convolution_5x5',1),('separable_convolution_3x3',0)],
+    [('max_pooling_3x3',2),('dilated_convolution_5x5',1)],
+    [('dilated_convolution_5x5',3),('dilated_convolution_5x5',2)],
+    [('dilated_convolution_5x5',3),('dilated_convolution_5x5',4)]
+  ],
+  reduce_concat=range(2,6)
+)
+```
+
+In this example, each cell has 4 DARTS intermediate nodes, with indexes 2, 3, 4, and 5.
+
+The `reduce` parameter describes the cells located at 1/3 and 2/3 of the total depth of the neural network. These are reduction cells, in which all operations adjacent to the input nodes have stride two.
+
+The `normal` parameter describes the cells located at all remaining layers of the neural network. These are normal cells.
+
+In a CNN, all intermediate nodes of the reduce and normal cells are concatenated, and each node has 2 edges.
+
+Each element of the `normal` array is a node with 2 edges. For each edge, the first element is the operation on the edge and the second element is the index of the node that the edge comes from. Note that index 0 is the `C_{k-2}` node and index 1 is the `C_{k-1}` node.
+
+For example, `[('max_pooling_3x3',0),('max_pooling_3x3',1)]` means that the `C_{k-2}` node connects to the first intermediate node with the `max_pooling_3x3` operation (max pooling with filter size 3) and the `C_{k-1}` node connects to the first intermediate node with the `max_pooling_3x3` operation.
+
+The `reduce` array follows the same format as the `normal` array.
+
+`normal_concat` and `reduce_concat` list the intermediate nodes whose outputs are concatenated.
+
+Currently, the implementation supports only single-GPU training and only the second-order approximation, which produced better results than the first-order approximation.
+
+## TODO list
+
+- Extend algorithm settings with algorithm parameters.
+
+- Integrate an E2E test into CI. Create a simple example that can run on CPU.
+
+- Add validation to Suggestion service.
+
+- Support multi-GPU training. Add functionality to select the GPU for training.
+
+- Support DARTS in Katib UI.
+
+- Think about a better representation of the Best Genotype.
+
+- Add more datasets for CNN. Currently, only CIFAR-10 is supported.
+
+- Support RNN in addition to CNN.
+
+- Support micro mode, which means searching for a particular neural network cell.
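
The Best Genotype format described in the new README can be easier to follow with a small decoder. The following is a minimal Python sketch, not part of this commit or of Katib. It copies the `normal` cell from the README example as plain data and prints each edge as "source node, operation, target node". The names `describe_cell`, `node_name`, and `reduction_layers` are hypothetical helpers introduced here for illustration. The rule in `reduction_layers` (zero-based layer indexes with integer division) is an assumption about how "1/3 and 2/3 of the total layers" is computed.

```python
from typing import List, Tuple

Edge = Tuple[str, int]  # (operation name, index of the source node)

# The `normal` cell from the README example, copied as plain Python data.
NORMAL: List[List[Edge]] = [
    [("max_pooling_3x3", 0), ("max_pooling_3x3", 1)],
    [("max_pooling_3x3", 0), ("max_pooling_3x3", 1)],
    [("max_pooling_3x3", 0), ("dilated_convolution_3x3", 3)],
    [("max_pooling_3x3", 0), ("max_pooling_3x3", 1)],
]


def node_name(index: int) -> str:
    """Map a node index to a readable name (0 and 1 are the cell inputs)."""
    if index == 0:
        return "C_{k-2}"
    if index == 1:
        return "C_{k-1}"
    return f"node {index}"


def describe_cell(cell: List[List[Edge]]) -> List[str]:
    """Return one line per edge: source node, operation, target node."""
    lines = []
    # Intermediate nodes are numbered from 2, after the two input nodes.
    for target, edges in enumerate(cell, start=2):
        for operation, source in edges:
            lines.append(
                f"{node_name(source)} --{operation}--> {node_name(target)}"
            )
    return lines


def reduction_layers(num_layers: int) -> List[int]:
    """Assumed rule: reduction cells sit at zero-based layers
    num_layers // 3 and 2 * num_layers // 3."""
    return [num_layers // 3, 2 * num_layers // 3]


if __name__ == "__main__":
    for line in describe_cell(NORMAL):
        print(line)
    print("reduction layers for 8 layers:", reduction_layers(8))
```

Each of the 4 intermediate nodes contributes 2 edges, so the sketch lists 8 edges for the example cell. The same function works on the `reduce` array, since it uses the same format.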