
ENAS and DARTS search space zoo #2589

Merged: 47 commits, merged on Jul 27, 2020
Changes from 32 commits

Commits (47):
673cf3d
add darts cell and search space
tabVersion Jun 23, 2020
72f9f12
move to search_space_zoo
tabVersion Jun 23, 2020
99c841b
accept a cell to build full model
tabVersion Jun 24, 2020
e0e9e2c
fix compile error
tabVersion Jun 24, 2020
b55c6cd
bug fix
tabVersion Jun 24, 2020
3e162f4
change DartsCell signiture
tabVersion Jun 27, 2020
181e4c1
format code
tabVersion Jun 27, 2020
e35ff4b
change signature & inherit sequencial
tabVersion Jun 29, 2020
7896cb4
add search space example
tabVersion Jun 29, 2020
cd4eb1f
structure adjust & comment change
tabVersion Jun 29, 2020
736d196
clearify darts search space doc
tabVersion Jun 29, 2020
8c4f0bc
move dartsStackCells to example
tabVersion Jun 30, 2020
cf720c9
update docs
tabVersion Jul 3, 2020
823b0be
Merge branch 'master' into darts
tabVersion Jul 3, 2020
5d41f19
doc missing fix
tabVersion Jul 3, 2020
510dc38
Merge branch 'darts' of https://github.com/tabVersion/nni into darts
tabVersion Jul 3, 2020
03f7a28
doc fix
tabVersion Jul 6, 2020
40c517a
change code to fix doc
tabVersion Jul 6, 2020
8696f96
enas test
tabVersion Jul 6, 2020
1a46bf0
enas test
tabVersion Jul 6, 2020
40ab64a
enas test
tabVersion Jul 6, 2020
473e247
enas micro
tabVersion Jul 6, 2020
94f3eba
code format & doc fix & add example
tabVersion Jul 6, 2020
9efdb8a
refine doc
tabVersion Jul 7, 2020
5e2ed66
code format
tabVersion Jul 7, 2020
4cfdb10
add enas micro doc
tabVersion Jul 8, 2020
6a7a6ba
fix trailing whitespace
tabVersion Jul 8, 2020
8ddf8f1
add enas macro
tabVersion Jul 9, 2020
c0aecff
format doc
tabVersion Jul 9, 2020
c785024
fix doc
tabVersion Jul 9, 2020
0316d31
fix systax
tabVersion Jul 9, 2020
f6e9565
fix
tabVersion Jul 9, 2020
1b0c398
refine doc
tabVersion Jul 11, 2020
5b3dc94
refine doc
tabVersion Jul 13, 2020
f12df2c
update
tabVersion Jul 13, 2020
6cdfc5c
refine
tabVersion Jul 14, 2020
ec6ac2b
refine doc
tabVersion Jul 15, 2020
7ef03a6
refine doc
tabVersion Jul 16, 2020
199efb7
doc refine
tabVersion Jul 20, 2020
5b9c3ae
change sketch
tabVersion Jul 22, 2020
d5f63e2
change illustration
tabVersion Jul 24, 2020
05875f0
resolution fix
tabVersion Jul 24, 2020
2a5c434
update doc
tabVersion Jul 24, 2020
1d12bec
update doc
tabVersion Jul 24, 2020
de282c5
update doc
tabVersion Jul 24, 2020
63b20ce
doc
tabVersion Jul 24, 2020
8eb2afa
adjust menu sequence
tabVersion Jul 27, 2020
11 changes: 11 additions & 0 deletions docs/en_US/NAS/Overview.md
@@ -54,6 +54,17 @@ Please refer to [here](NasGuide.md) for the usage of one-shot NAS algorithms.
One-shot NAS can be visualized with our visualization tool. Learn more details [here](./Visualization.md).



## Search Space Zoo

NNI provides some predefined search spaces that can be easily reused. By stacking the extracted cells, users can quickly reproduce these NAS models (see the sketch after the list below).

Search Space Zoo contains the following NAS cells:

* [DartsCell](./SearchSpaceZoo.md#DartsCell)
* [ENAS micro](./SearchSpaceZoo.md#ENASMicroLayer)
* [ENAS macro](./SearchSpaceZoo.md#ENASMacroLayer)
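
A minimal usage sketch, based on the `darts_example.py` added in this PR. `DartsStackedCells` is the example stacking network defined in the accompanying example code, not a library class:

```python
# Sketch based on examples/nas/search_space_zoo/darts_example.py from this PR.
from nni.nas.pytorch.search_space_zoo import DartsCell
from darts_stack_cells import DartsStackedCells  # example stacking network from this PR

# 3 input channels, 16 initial channels, 10 classes, 8 stacked cells
model = DartsStackedCells(3, 16, 10, 8, DartsCell)
```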

## Using NNI API to Write Your Search Space

The programming interface of designing and searching a model is often demanded in two scenarios.
160 changes: 160 additions & 0 deletions docs/en_US/NAS/SearchSpaceZoo.md
@@ -0,0 +1,160 @@
# Search Space Zoo

## DartsCell

DartsCell is extracted from the [CNN model](./DARTS.md) designed in this repo. The [operations](#darts-predefined-operations) connecting the nodes contained in the cell structure are fixed.

The predefined operations are shown as follows:

* MaxPool: calls `torch.nn.MaxPool2d`. This operation applies a 2D max pooling over all input channels. Its parameters `kernel_size=3` and `padding=1` are fixed.
* AvgPool: calls `torch.nn.AvgPool2d`. This operation applies a 2D average pooling over all input channels. Its parameters `kernel_size=3` and `padding=1` are fixed.
* Skip Connect: there is no operation between the two nodes; `torch.nn.Identity` forwards the input unchanged to the output.
* SepConv3x3: composed of two [DilConvs](#DilConv) with fixed `kernel_size=3`, applied sequentially.
* SepConv5x5: the same operation as the previous one, but with the kernel size set to 5.
* <a name="DilConv"></a>DilConv3x3: (dilated) depthwise separable Conv. It first calls `torch.nn.Conv2d` with fixed `kernel_size=3` and `groups=C_in` to convolve each input channel separately, then applies a 1x1 convolution to produce `C_out` output channels. This extracts features on each channel independently and reduces the number of parameters (see the sketch after this list).
Contributor: "It makes extracting features on every channel separately possible". Are you sure this is a correct sentence?

Contributor: I believe this is not DilConv? Maybe we should rename it. Or we can state that "we follow the convention in NAS papers to name it DilConv".

Contributor Author: Maybe DepthwiseSepConv?
* DilConv5x5: the same operation as the previous one, but with the kernel size set to 5.
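
The depthwise separable pattern described for DilConv above can be sketched as follows. This is a minimal illustration of the described pattern, not the exact zoo implementation; the class name and default parameters are assumptions:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Sketch of the DilConv pattern: a grouped (depthwise) conv, then a 1x1 conv."""
    def __init__(self, c_in, c_out, kernel_size=3, stride=1, padding=1, dilation=1):
        super().__init__()
        self.op = nn.Sequential(
            # depthwise step: groups=c_in convolves each input channel separately
            nn.Conv2d(c_in, c_in, kernel_size, stride=stride, padding=padding,
                      dilation=dilation, groups=c_in, bias=False),
            # pointwise step: a 1x1 conv mixes channels to produce c_out outputs
            nn.Conv2d(c_in, c_out, kernel_size=1, bias=False),
        )

    def forward(self, x):
        return self.op(x)
```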

```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.DartsCell
    :members:
```

### Example Code

[example code](https://github.com/microsoft/nni/tree/master/examples/nas/search_space_zoo/darts_example.py)

```bash
git clone https://github.com/Microsoft/nni.git
cd nni/examples/nas/search_space_zoo
# search the best structure
python3 darts_example.py
```

<a name="darts-predefined-operations"></a>

### DARTS predefined operations

* MaxPool / AvgPool

MaxPool / AvgPool with `kernel_size=3` and `padding=1`, followed by BatchNorm2d (see the sketch after this list)
```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.darts_ops.PoolBN
```
* Skip Connection

There is no operation between the two nodes; the input is forwarded unchanged.
* DilConv3x3 / DilConv5x5

Dilated Conv with `kernel_size=3` or `kernel_size=5` and `padding=1`
```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.darts_ops.DilConv
```
* SepConv3x3 / SepConv5x5

Depthwise separable Conv with `kernel_size=3` or `kernel_size=5` and `padding=1`
```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.darts_ops.SepConv
```
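
The MaxPool / AvgPool entry above describes a pool followed by BatchNorm2d; a minimal sketch of that composition (the actual `PoolBN` signature may differ, and `stride` is an assumed parameter):

```python
import torch.nn as nn

def pool_bn(pool_type, channels, stride=1):
    # MaxPool / AvgPool with kernel_size=3 and padding=1, followed by BatchNorm2d
    pool_cls = nn.MaxPool2d if pool_type == "max" else nn.AvgPool2d
    return nn.Sequential(
        pool_cls(kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(channels),
    )
```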

## ENASMicroLayer

This layer is extracted from the model designed [here](./ENAS.md). A model contains several blocks that share the same architecture. A block is made up of some `ENASMicroLayer`s and one `ENASReductionLayer`. The only difference between the two layers is that `ENASReductionLayer` applies all operations with `stride=2`.

An `ENASMicroLayer` contains `num_nodes` nodes and searches the topology among them. The first two nodes in a layer stand for the outputs of the previous layer and of the layer before that, respectively. Each following node chooses two previous nodes as inputs, applies one operation from the [predefined set](#predefined-operations-enas) to each, and adds the two results as its output. For example, if Node 4 chooses Node 1 and Node 3 as inputs and applies `MaxPool` and `AvgPool` respectively, the output of Node 4 is `MaxPool(Node 1) + AvgPool(Node 3)`. Nodes that do not serve as input to any other node are viewed as outputs of the layer. If there are multiple output nodes, the model concatenates them along the channel dimension to form the layer output (see the sketch below).
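
Conceptually, each searched node computes the following. This is a schematic sketch with hypothetical names; the real layer expresses these choices with NNI mutables so the controller can search them:

```python
# prev_nodes: outputs of all earlier nodes in the layer (including the two
# layer inputs); idx_a/idx_b and op_a/op_b are the searched choices.
def node_output(prev_nodes, idx_a, idx_b, op_a, op_b):
    # pick two previous nodes, apply one predefined op to each, then sum
    return op_a(prev_nodes[idx_a]) + op_b(prev_nodes[idx_b])
```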

The predefined operations are listed as follows. Details can be seen [here](#predefined-operations-enas).

* MaxPool: calls `torch.nn.MaxPool2d`. This operation applies a 2D max pooling over all input channels. Its parameters are fixed to `kernel_size=3`, `stride=1` and `padding=1`.
* AvgPool: calls `torch.nn.AvgPool2d`. This operation applies a 2D average pooling over all input channels. Its parameters are fixed to `kernel_size=3`, `stride=1` and `padding=1`.
* SepConvBN3x3: ReLU followed by a [DilConv](#DilConv) and BatchNorm. Convolution parameters are `kernel_size=3`, `stride=1` and `padding=1`.
* SepConvBN5x5: the same operation as the previous one, but with the kernel size set to 5.
* Skip Connect: there is no operation between the two nodes; `torch.nn.Identity` forwards the input unchanged to the output.

```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.ENASMicroLayer
    :members:
```

The reduction layer is made up of two Conv operations; each outputs `C_out//2` channels, and the two results are concatenated along the channel dimension as the layer output. The Convs have `kernel_size=1` and `stride=2`, and they perform alternate sampling on the input so as to reduce the resolution without losing information (see the sketch below).

Contributor: Can you merge the code of MicroLayer and ReductionLayer maybe?

Contributor Author: If we do so, we must pass in a moduleList from MicroNetwork, which is irrelevant to search space.
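
The alternate sampling can be sketched as follows: one `stride=2` convolution reads the input as-is (even positions) while the other reads it shifted by one pixel (odd positions), so together they cover every spatial location. This is a sketch under the stated parameters, not the exact zoo code:

```python
import torch
import torch.nn as nn

class FactorizedReduceSketch(nn.Module):
    """Halve the resolution and emit c_out channels without discarding pixels."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv1 = nn.Conv2d(c_in, c_out // 2, kernel_size=1, stride=2, bias=False)
        self.conv2 = nn.Conv2d(c_in, c_out // 2, kernel_size=1, stride=2, bias=False)

    def forward(self, x):
        # conv1 samples even offsets; conv2 samples odd offsets (shifted input)
        return torch.cat([self.conv1(x), self.conv2(x[:, :, 1:, 1:])], dim=1)
```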

```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.ENASReductionLayer
    :members:
```

### Example Code

[example code](https://github.com/microsoft/nni/tree/master/examples/nas/search_space_zoo/enas_micro_example.py)

```bash
git clone https://github.com/Microsoft/nni.git
cd nni/examples/nas/search_space_zoo
# search the best cell structure
python3 enas_micro_example.py
```

<a name="predefined-operations-enas"></a>

### ENAS Micro predefined operations

* MaxPool / AvgPool

MaxPool / AvgPool with `kernel_size=3`, `stride=1` and `padding=1`, followed by BatchNorm2d
```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.enas_ops.Pool
```

* SepConv

ReLU followed by a depthwise separable Conv ([DilConv](#DilConv)) and BatchNorm2d, with `kernel_size=3` or `kernel_size=5`
```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.enas_ops.SepConvBN
```

* Skip Connection

There is no operation between the two nodes; the input is forwarded unchanged.

## ENASMacroLayer

In macro search, the controller makes two decisions for each layer: i) the [operation](#macro-operations) to perform on the output of the previous layer, and ii) the previous layer to connect to for skip connections. NNI provides [predefined operations](#macro-operations) for macro search, listed as follows (a schematic sketch follows the list):

* Conv3x3 (separable and non-separable): Conv parameters are fixed to `kernel_size=3`, `padding=1` and `stride=1`. If `separable=True`, the Conv is replaced with a [DilConv](#DilConv).
* Conv5x5 (separable and non-separable): the same operation as the previous one, but with the kernel size set to 5.
* AvgPool: calls `torch.nn.AvgPool2d`. This operation applies a 2D average pooling over all input channels. Its parameters are fixed to `kernel_size=3`, `stride=1` and `padding=1`.
* MaxPool: calls `torch.nn.MaxPool2d`. This operation applies a 2D max pooling over all input channels. Its parameters are fixed to `kernel_size=3`, `stride=1` and `padding=1`.
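
Schematically, a macro layer's forward pass combines the two decisions as follows. This is plain-PyTorch pseudocode with hypothetical names; the actual layer uses NNI mutables to make the choices searchable:

```python
# ops: candidate operations for this layer; op_index: decision i)
# prev_outputs: outputs of earlier layers; skip_mask: decision ii)
def macro_layer_forward(x, ops, op_index, prev_outputs, skip_mask):
    out = ops[op_index](x)                # which operation to apply
    for prev, use in zip(prev_outputs, skip_mask):
        if use:                           # which earlier layers to skip-connect
            out = out + prev
    return out
```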

```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.ENASMacroLayer
    :members:
```

### Example Code

[example code](https://github.com/microsoft/nni/tree/master/examples/nas/search_space_zoo/enas_macro_example.py)

```bash
git clone https://github.com/Microsoft/nni.git
cd nni/examples/nas/search_space_zoo
# search the best cell structure
python3 enas_macro_example.py
```

<a name="macro-operations"></a>

### ENAS Macro predefined operations

* ConvBranch

```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.enas_ops.ConvBranch
```
* PoolBranch

```eval_rst
.. autoclass:: nni.nas.pytorch.search_space_zoo.enas_ops.PoolBranch
```
1 change: 1 addition & 0 deletions docs/en_US/nas.rst
@@ -25,3 +25,4 @@ For details, please refer to the following tutorials:
NAS Visualization <NAS/Visualization>
NAS Benchmarks <NAS/Benchmarks>
API Reference <NAS/NasReference>
Search Space Zoo <NAS/SearchSpaceZoo>
2 changes: 1 addition & 1 deletion examples/nas/enas/search.py
@@ -23,7 +23,7 @@
parser = ArgumentParser("enas")
parser.add_argument("--batch-size", default=128, type=int)
parser.add_argument("--log-frequency", default=10, type=int)
parser.add_argument("--search-for", choices=["macro", "micro"], default="macro")
# parser.add_argument("--search-for", choices=["macro", "micro"], default="macro")
parser.add_argument("--epochs", default=None, type=int, help="Number of epochs (default: macro 310, micro 150)")
parser.add_argument("--visualization", default=False, action="store_true")
args = parser.parse_args()
53 changes: 53 additions & 0 deletions examples/nas/search_space_zoo/darts_example.py
@@ -0,0 +1,53 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import logging
import time
from argparse import ArgumentParser

import torch
import torch.nn as nn

import datasets
from nni.nas.pytorch.callbacks import ArchitectureCheckpoint, LRSchedulerCallback
from nni.nas.pytorch.darts import DartsTrainer
from utils import accuracy

from nni.nas.pytorch.search_space_zoo import DartsCell
from darts_stack_cells import DartsStackedCells

logger = logging.getLogger('nni')

if __name__ == "__main__":
    parser = ArgumentParser("darts")
    parser.add_argument("--layers", default=8, type=int)
    parser.add_argument("--batch-size", default=64, type=int)
    parser.add_argument("--log-frequency", default=10, type=int)
    parser.add_argument("--epochs", default=50, type=int)
    parser.add_argument("--channels", default=16, type=int)
    parser.add_argument("--unrolled", default=False, action="store_true")
    parser.add_argument("--visualization", default=False, action="store_true")
    args = parser.parse_args()

    dataset_train, dataset_valid = datasets.get_dataset("cifar10")

    model = DartsStackedCells(3, args.channels, 10, args.layers, DartsCell)
    criterion = nn.CrossEntropyLoss()

    optim = torch.optim.SGD(model.parameters(), 0.025, momentum=0.9, weight_decay=3.0E-4)
    lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, args.epochs, eta_min=0.001)

    trainer = DartsTrainer(model,
                           loss=criterion,
                           metrics=lambda output, target: accuracy(output, target, topk=(1,)),
                           optimizer=optim,
                           num_epochs=args.epochs,
                           dataset_train=dataset_train,
                           dataset_valid=dataset_valid,
                           batch_size=args.batch_size,
                           log_frequency=args.log_frequency,
                           unrolled=args.unrolled,
                           callbacks=[LRSchedulerCallback(lr_scheduler), ArchitectureCheckpoint("./checkpoints")])
    if args.visualization:
        trainer.enable_visualization()
    trainer.train()
83 changes: 83 additions & 0 deletions examples/nas/search_space_zoo/darts_stack_cells.py
@@ -0,0 +1,83 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import torch.nn as nn
import ops


class DartsStackedCells(nn.Module):
    """
    Built-in DARTS search space.
    Compared to the DARTS example, DartsStackedCells removes the auxiliary head,
    which is considered a trick rather than part of the model.

    Attributes
    ----------
    in_channels: int
        the number of input channels
    channels: int
        the number of initial channels expected
    n_classes: int
        the number of classes for final classification
    n_layers: int
        the number of cells contained in this network
    factory_func: function
        returns a callable instance for the demanded cell structure.
        Users should pass in the ``__init__`` of the cell class with required
        parameters (see nni.nas.DartsCell for details)
    n_nodes: int
        the number of nodes contained in each cell
    stem_multiplier: int
        channel multiplication coefficient applied by the stem
    """

    def __init__(self, in_channels, channels, n_classes, n_layers, factory_func, n_nodes=4,
                 stem_multiplier=3):
        super().__init__()
        self.in_channels = in_channels
        self.channels = channels
        self.n_classes = n_classes
        self.n_layers = n_layers

        c_cur = stem_multiplier * self.channels
        self.stem = nn.Sequential(
            nn.Conv2d(in_channels, c_cur, 3, 1, 1, bias=False),
            nn.BatchNorm2d(c_cur)
        )

        # for the first cell, stem is used for both s0 and s1
        # [!] channels_pp and channels_p are output channel sizes, but c_cur is the input channel size
        channels_pp, channels_p, c_cur = c_cur, c_cur, channels

        self.cells = nn.ModuleList()
        reduction_p, reduction = False, False
        for i in range(n_layers):
            reduction_p, reduction = reduction, False
            # reduce feature map size and double channels at 1/3 and 2/3 of the layers
            if i in [n_layers // 3, 2 * n_layers // 3]:
                c_cur *= 2
                reduction = True

            cell = factory_func(n_nodes, channels_pp, channels_p, c_cur, reduction_p, reduction)
            self.cells.append(cell)
            c_cur_out = c_cur * n_nodes
            channels_pp, channels_p = channels_p, c_cur_out

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.linear = nn.Linear(channels_p, n_classes)

    def forward(self, x):
        s0 = s1 = self.stem(x)

        for cell in self.cells:
            s0, s1 = s1, cell(s0, s1)

        out = self.gap(s1)
        out = out.view(out.size(0), -1)  # flatten
        logits = self.linear(out)

        return logits

    def drop_path_prob(self, p):
        for module in self.modules():
            if isinstance(module, ops.DropPath):
                module.p = p
56 changes: 56 additions & 0 deletions examples/nas/search_space_zoo/datasets.py
@@ -0,0 +1,56 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import numpy as np
import torch
from torchvision import transforms
from torchvision.datasets import CIFAR10


class Cutout(object):
    def __init__(self, length):
        self.length = length

    def __call__(self, img):
        h, w = img.size(1), img.size(2)
        mask = np.ones((h, w), np.float32)
        y = np.random.randint(h)
        x = np.random.randint(w)

        y1 = np.clip(y - self.length // 2, 0, h)
        y2 = np.clip(y + self.length // 2, 0, h)
        x1 = np.clip(x - self.length // 2, 0, w)
        x2 = np.clip(x + self.length // 2, 0, w)

        mask[y1: y2, x1: x2] = 0.
        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img *= mask

        return img


def get_dataset(cls, cutout_length=0):
    MEAN = [0.49139968, 0.48215827, 0.44653124]
    STD = [0.24703233, 0.24348505, 0.26158768]
    transf = [
        transforms.RandomCrop(32, padding=4),
        transforms.RandomHorizontalFlip()
    ]
    normalize = [
        transforms.ToTensor(),
        transforms.Normalize(MEAN, STD)
    ]
    cutout = []
    if cutout_length > 0:
        cutout.append(Cutout(cutout_length))

    train_transform = transforms.Compose(transf + normalize + cutout)
    valid_transform = transforms.Compose(normalize)

    if cls == "cifar10":
        dataset_train = CIFAR10(root="./data", train=True, download=True, transform=train_transform)
        dataset_valid = CIFAR10(root="./data", train=False, download=True, transform=valid_transform)
    else:
        raise NotImplementedError
    return dataset_train, dataset_valid
Loading