# About the Differentiable Architecture Search

The algorithm follows the idea proposed in _DARTS: Differentiable Architecture Search_ by Hanxiao Liu, Karen Simonyan, and Yiming Yang (https://arxiv.org/abs/1806.09055).

The implementation is based on the [official GitHub implementation](https://github.com/quark0/darts) and on the popular [pt.darts repository](https://github.com/khanrc/pt.darts).

DARTS addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Instead of searching over a discrete set of candidate architectures, it relaxes the search space to be continuous and optimizes the architecture with gradient descent. According to the paper, this approach efficiently designs high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2).
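
The continuous relaxation can be illustrated with a minimal sketch. It is not the Katib training code: the scalar lambdas stand in for real convolution and pooling operations, and the names `softmax`, `mixed_op`, and `discretize` are hypothetical helpers chosen for this example. Each edge computes a softmax-weighted sum of all candidate operations, and after the search the operation with the largest weight is kept.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over a list of architecture weights."""
    largest = max(alphas)
    exps = [math.exp(a - largest) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, ops, alphas):
    """Continuous relaxation of one edge: softmax-weighted sum of all candidate ops."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))


def discretize(alphas):
    """Index of the operation kept on the edge once the search is finished."""
    return max(range(len(alphas)), key=lambda i: alphas[i])


# Toy scalar "operations" (assumption: placeholders for convolutions and pooling).
ops = [lambda x: x, lambda x: 2.0 * x, lambda x: 0.0]

# With equal weights every operation contributes one third of its output.
mixed_output = mixed_op(3.0, ops, [0.0, 0.0, 0.0])

# After some hypothetical search steps the second operation dominates.
chosen = discretize([0.1, 2.0, -1.0])
```
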

## Katib implementation

To support DARTS with the current Katib functionality, the implementation works as follows:

1. The DARTS Suggestion service creates a set of primitive operations from the Experiment search space. For example: `['separable_convolution_3x3', 'dilated_convolution_3x3', 'dilated_convolution_5x5', 'avg_pooling_3x3', 'max_pooling_3x3', 'skip_connection']`.

2. The Suggestion service returns the algorithm settings, the number of layers, and the set of primitives to the Katib controller.

3. The Katib controller starts the training container with the appropriate settings and all possible operations.

4. The training container runs the DARTS algorithm.

5. The metrics collector saves the Best Genotype from the training container log.
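
Step 1 above can be sketched as follows. This is only an illustration under stated assumptions: `build_primitives` is a hypothetical helper, and the search space is simplified to pairs of an operation type and a list of filter sizes; the real Suggestion service reads the operations from the Experiment spec and may differ in detail.

```python
def build_primitives(operations):
    """Expand (operation_type, filter_sizes) pairs into DARTS primitive names.

    `operations` is a hypothetical, simplified view of the Experiment
    search space used only for this example.
    """
    primitives = []
    for op_type, filter_sizes in operations:
        if not filter_sizes:
            # Operations without a filter size, such as a skip connection.
            primitives.append(op_type)
            continue
        for size in filter_sizes:
            primitives.append(f"{op_type}_{size}x{size}")
    return primitives


# Assumed search space that reproduces the primitive list shown in step 1.
search_space = [
    ("separable_convolution", [3]),
    ("dilated_convolution", [3, 5]),
    ("avg_pooling", [3]),
    ("max_pooling", [3]),
    ("skip_connection", []),
]

primitives = build_primitives(search_space)
```
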

You can find an Experiment example [here](https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/nas/darts-example-gpu.yaml).
The DARTS Suggestion service source code is available [here](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1alpha3/nas/darts), and the DARTS training container implementation is [here](https://github.com/kubeflow/katib/tree/master/examples/v1alpha3/nas/darts-cnn-cifar10).

### Best Genotype representation

The Best Genotype describes the best cell for each neural network layer. Cells are generated by the DARTS algorithm.
Here is an example of the Best Genotype:

```
Genotype(
    normal=[
        [('max_pooling_3x3',0),('max_pooling_3x3',1)],
        [('max_pooling_3x3',0),('max_pooling_3x3',1)],
        [('max_pooling_3x3',0),('dilated_convolution_3x3',3)],
        [('max_pooling_3x3',0),('max_pooling_3x3',1)]
    ],
    normal_concat=range(2,6),
    reduce=[
        [('dilated_convolution_5x5',1),('separable_convolution_3x3',0)],
        [('max_pooling_3x3',2),('dilated_convolution_5x5',1)],
        [('dilated_convolution_5x5',3),('dilated_convolution_5x5',2)],
        [('dilated_convolution_5x5',3),('dilated_convolution_5x5',4)]
    ],
    reduce_concat=range(2,6)
)
```

In this example you can see 4 intermediate DARTS nodes with indexes 2, 3, 4, and 5.

The `reduce` parameter describes the cells located at 1/3 and 2/3 of the total neural network depth. These are reduction cells, in which all operations adjacent to the input nodes have stride two.
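
As a rough illustration of where reduction cells sit, the sketch below marks each layer as `reduce` or `normal`. The helper names are hypothetical, and the use of 0-based layer indexes with floor division is an assumption; the exact rounding in the training container may differ.

```python
def reduction_layers(num_layers):
    """Layer indexes that hold reduction cells.

    Assumption: 0-based indexes and floor division for the 1/3 and 2/3 positions.
    """
    return [num_layers // 3, 2 * num_layers // 3]


def cell_types(num_layers):
    """Label every layer as a 'reduce' or a 'normal' cell."""
    reduce_at = set(reduction_layers(num_layers))
    return ["reduce" if i in reduce_at else "normal" for i in range(num_layers)]


# Example: a network with 8 layers has reduction cells at layers 2 and 5.
layout = cell_types(8)
```
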

The `normal` parameter describes the cells located at the remaining neural network layers. These are normal cells.

In the CNN, all intermediate nodes of the reduce and normal cells are concatenated, and each node has 2 edges.

Each element in the `normal` array is a node with 2 edges. For each edge, the first element is the operation on the edge and the second element is the index of the node that the edge comes from. Note that index 0 is the `C_{k-2}` node and index 1 is the `C_{k-1}` node.

For example, `[('max_pooling_3x3',0),('max_pooling_3x3',1)]` means that the `C_{k-2}` node connects to the first intermediate node with the `max_pooling_3x3` operation (max pooling with filter size 3) and the `C_{k-1}` node connects to the first intermediate node with the `max_pooling_3x3` operation.

The `reduce` array follows the same format as the `normal` array.

`normal_concat` and `reduce_concat` list the intermediate nodes whose outputs are concatenated to form the cell output.
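
To make the format concrete, the sketch below loads the example genotype into a Python `namedtuple` and lists every edge of a cell. The `Genotype` field names come from the example above; `describe_cell` and `INPUT_NAMES` are hypothetical helpers written for this illustration, not part of the Katib code.

```python
from collections import namedtuple

Genotype = namedtuple("Genotype", "normal normal_concat reduce reduce_concat")

# Indexes 0 and 1 are the two cell inputs; intermediate nodes start at index 2.
INPUT_NAMES = {0: "C_{k-2}", 1: "C_{k-1}"}


def describe_cell(nodes, first_index=2):
    """Return (source, operation, target) triples for every edge in a cell."""
    edges = []
    for offset, node in enumerate(nodes):
        target = first_index + offset
        for operation, source in node:
            source_name = INPUT_NAMES.get(source, f"node {source}")
            edges.append((source_name, operation, f"node {target}"))
    return edges


genotype = Genotype(
    normal=[
        [("max_pooling_3x3", 0), ("max_pooling_3x3", 1)],
        [("max_pooling_3x3", 0), ("max_pooling_3x3", 1)],
        [("max_pooling_3x3", 0), ("dilated_convolution_3x3", 3)],
        [("max_pooling_3x3", 0), ("max_pooling_3x3", 1)],
    ],
    normal_concat=range(2, 6),
    reduce=[
        [("dilated_convolution_5x5", 1), ("separable_convolution_3x3", 0)],
        [("max_pooling_3x3", 2), ("dilated_convolution_5x5", 1)],
        [("dilated_convolution_5x5", 3), ("dilated_convolution_5x5", 2)],
        [("dilated_convolution_5x5", 3), ("dilated_convolution_5x5", 4)],
    ],
    reduce_concat=range(2, 6),
)

normal_edges = describe_cell(genotype.normal)
reduce_edges = describe_cell(genotype.reduce)
# Intermediate nodes whose outputs are concatenated into the cell output.
concat_nodes = list(genotype.normal_concat)
```
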

Currently, the implementation supports running only on a single GPU and uses the second-order approximation, which produced better results than the first-order one.
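
For reference, the DARTS paper approximates the architecture gradient with a single virtual training step on the network weights. In the notation of the paper, `\alpha` is the architecture, `w` the current weights, and `\xi` the learning rate of the virtual step:

```latex
\nabla_{\alpha} \mathcal{L}_{val}\big(w^{*}(\alpha), \alpha\big)
\approx
\nabla_{\alpha} \mathcal{L}_{val}\big(w - \xi \nabla_{w} \mathcal{L}_{train}(w, \alpha), \alpha\big)
```

Setting `\xi = 0` gives the first-order approximation, while `\xi > 0` gives the second-order approximation mentioned above.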

## TODO list

- Extend the algorithm settings with algorithm parameters.

- Integrate an E2E test in CI. Create a simple example that can run on CPU.

- Add validation to the Suggestion service.

- Support multi-GPU training. Add functionality to select the GPU for training.

- Support DARTS in the Katib UI.

- Think about a better representation of the Best Genotype.

- Add more datasets for the CNN. Currently, only CIFAR-10 is supported.

- Support RNN in addition to CNN.

- Support micro mode, which means searching for a particular neural network cell.