Commit

Merge pull request #72 from microsoft/master

pull code

chicm-ms authored Feb 25, 2020
2 parents 0856813 + ff2728c commit 9e97bed
Showing 12 changed files with 120 additions and 24 deletions.
16 changes: 9 additions & 7 deletions README.md
@@ -124,10 +124,12 @@ Within the following table, we summarized the current NNI capabilities, we are g
<a href="docs/en_US/NAS/Overview.md">Neural Architecture Search</a>
<ul>
<ul>
-<li><a href="docs/en_US/NAS/Overview.md#enas">ENAS</a></li>
-<li><a href="docs/en_US/NAS/Overview.md#darts">DARTS</a></li>
-<li><a href="docs/en_US/NAS/Overview.md#p-darts">P-DARTS</a></li>
-<li><a href="docs/en_US/NAS/Overview.md#cdarts">CDARTS</a></li>
+<li><a href="docs/en_US/NAS/ENAS.md">ENAS</a></li>
+<li><a href="docs/en_US/NAS/DARTS.md">DARTS</a></li>
+<li><a href="docs/en_US/NAS/PDARTS.md">P-DARTS</a></li>
+<li><a href="docs/en_US/NAS/CDARTS.md">CDARTS</a></li>
+<li><a href="docs/en_US/NAS/SPOS.md">SPOS</a></li>
+<li><a href="docs/en_US/NAS/Proxylessnas.md">ProxylessNAS</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a> </li>
</ul>
</ul>
@@ -224,7 +226,7 @@ Note:

* If there is any privilege issue, add `--user` to install NNI in the user directory.
* Currently NNI on Windows supports local, remote and pai mode. Anaconda or Miniconda is highly recommended to install NNI on Windows.
-* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/Tutorial/FAQ.md). For FAQ on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWindows.md).
+* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/Tutorial/FAQ.md). For FAQ on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/InstallationWin.md#faq).

### **Verify installation**

@@ -288,7 +290,7 @@ You can use these commands to get more information about the experiment
## **Documentation**
* To learn about what's NNI, read the [NNI Overview](https://nni.readthedocs.io/en/latest/Overview.html).
* To get yourself familiar with how to use NNI, read the [documentation](https://nni.readthedocs.io/en/latest/index.html).
-* To get started and install NNI on your system, please refer to [Install NNI](docs/en_US/Tutorial/Installation.md).
+* To get started and install NNI on your system, please refer to [Install NNI](https://nni.readthedocs.io/en/latest/installation.html).

## **Contributing**
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
@@ -304,7 +306,7 @@ After getting familiar with contribution agreements, you are ready to create you
* If you have any questions on usage, review [FAQ](https://github.com/microsoft/nni/blob/master/docs/en_US/Tutorial/FAQ.md) first, if there are no relevant issues and answers to your question, try contact NNI dev team and users in [Gitter](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) or [File an issue](https://github.com/microsoft/nni/issues/new/choose) on GitHub.
* [Customize your own Tuner](docs/en_US/Tuner/CustomizeTuner.md)
* [Implement customized TrainingService](docs/en_US/TrainingService/HowToImplementTrainingService.md)
-* [Implement a new NAS trainer on NNI](https://github.com/microsoft/nni/blob/master/docs/en_US/NAS/NasInterface.md#implement-a-new-nas-trainer-on-nni)
+* [Implement a new NAS trainer on NNI](docs/en_US/NAS/Advanced.md)
* [Customize your own Advisor](docs/en_US/Tuner/CustomizeAdvisor.md)

## **External Repositories and References**
4 changes: 2 additions & 2 deletions docs/en_US/Compressor/ModelSpeedup.md
@@ -14,7 +14,7 @@ There are two types of pruning. One is fine-grained pruning, it does not change
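A quick illustrative sketch of the distinction in the hunk context above (editor's addition, not part of the NNI doc): a fine-grained mask zeroes individual weights and leaves tensor shapes unchanged, while a coarse-grained mask zeroes whole channels, which is what makes a physically smaller replacement layer possible.

import torch

# Conv2d weight layout: (out_channels, in_channels, kH, kW)
weight = torch.randn(8, 4, 3, 3)

# Fine-grained mask: zero individual weights; shape stays (8, 4, 3, 3).
fine_mask = (torch.rand_like(weight) > 0.5).float()
fine_pruned = weight * fine_mask

# Coarse-grained mask: zero entire output channels (here channels 0 and 3).
coarse_mask = torch.ones(8)
coarse_mask[[0, 3]] = 0
coarse_pruned = weight * coarse_mask.view(-1, 1, 1, 1)

# Only the coarse case can be sped up by physically dropping channels:
smaller_weight = weight[coarse_mask.bool()]   # shape becomes (6, 4, 3, 3)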

## Design and Implementation

-To speed up a model, the pruned layers should be replaced, either replaced with smaller layer for coarse-grained mask, or replaced with sparse kernel for fine-grained mask. Coarse-grained mask usually changes the shape of weights or input/output tensors, thus, we should do shape inference to check are there other unpruned layers should be replaced as well due to shape change. Therefore, in our design, there are two main steps: first, do shape inference to find out all the modules that should be replaced; second, replace the modules. The first step requires topology (i.e., connections) of the model, we use `jit.trace` to obtain the model grpah for PyTorch.
+To speed up a model, the pruned layers should be replaced, either replaced with smaller layer for coarse-grained mask, or replaced with sparse kernel for fine-grained mask. Coarse-grained mask usually changes the shape of weights or input/output tensors, thus, we should do shape inference to check are there other unpruned layers should be replaced as well due to shape change. Therefore, in our design, there are two main steps: first, do shape inference to find out all the modules that should be replaced; second, replace the modules. The first step requires topology (i.e., connections) of the model, we use `jit.trace` to obtain the model graph for PyTorch.
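The `jit.trace` step can be illustrated standalone (a minimal editor's sketch; NNI wraps this in its own graph utilities):

import torch
import torchvision.models as models

model = models.resnet18().eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Tracing records the executed graph; layer connectivity (which module
# feeds which) can then be read off for shape inference.
traced = torch.jit.trace(model, dummy_input)
print(traced.graph)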

For each module, we should prepare four functions, three for shape inference and one for module replacement. The three shape inference functions are: given weight shape infer input/output shape, given input shape infer weight/output shape, given output shape infer weight/input shape. The module replacement function returns a newly created module which is smaller.
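A hedged sketch of that per-module contract for `Conv2d` under channel pruning (the function names below are hypothetical, for illustration only; NNI's actual internal interface differs):

import torch.nn as nn

# Given the (possibly pruned) weight shape, infer input/output channel counts.
def infer_io_from_weight(weight_shape):
    out_ch, in_ch, _, _ = weight_shape
    return in_ch, out_ch

# Given a pruned input channel count, infer the new weight shape
# (output side unchanged).
def infer_weight_from_input(in_ch, old_weight_shape):
    out_ch, _, kh, kw = old_weight_shape
    return (out_ch, in_ch, kh, kw)

# Given a pruned output channel count, infer the new weight shape
# (input side unchanged).
def infer_weight_from_output(out_ch, old_weight_shape):
    _, in_ch, kh, kw = old_weight_shape
    return (out_ch, in_ch, kh, kw)

# Module replacement: build a smaller Conv2d, keeping the other hyperparameters.
def replace_conv2d(old: nn.Conv2d, in_ch: int, out_ch: int) -> nn.Conv2d:
    return nn.Conv2d(in_ch, out_ch,
                     kernel_size=old.kernel_size, stride=old.stride,
                     padding=old.padding, dilation=old.dilation,
                     bias=old.bias is not None)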

@@ -102,4 +102,4 @@ input tensor: `torch.randn(64, 3, 32, 32)`
| 4 | 0.02521 | 0.014008 |
| 8 | 0.03386 | 0.023923 |
| 16 | 0.06042 | 0.046183 |
-| 32 | 0.12421 | 0.087113 |
\ No newline at end of file
+| 32 | 0.12421 | 0.087113 |
2 changes: 1 addition & 1 deletion examples/model_compress/APoZ_torch_cifar10.py
@@ -41,7 +41,7 @@ def test(model, device, test_loader):

def main():
    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('./data.cifar10', train=True, download=True,
                         transform=transforms.Compose([
2 changes: 1 addition & 1 deletion examples/model_compress/BNN_quantizer_cifar10.py
@@ -105,7 +105,7 @@ def adjust_learning_rate(optimizer, epoch):

def main():
    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('./data.cifar10', train=True, download=True,
                         transform=transforms.Compose([
89 changes: 89 additions & 0 deletions examples/model_compress/DoReFaQuantizer_torch_mnist.py
@@ -0,0 +1,89 @@
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms
from nni.compression.torch import DoReFaQuantizer


class Mnist(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(1, 20, 5, 1)
        self.conv2 = torch.nn.Conv2d(20, 50, 5, 1)
        self.fc1 = torch.nn.Linear(4 * 4 * 50, 500)
        self.fc2 = torch.nn.Linear(500, 10)
        self.relu1 = torch.nn.ReLU6()
        self.relu2 = torch.nn.ReLU6()
        self.relu3 = torch.nn.ReLU6()

    def forward(self, x):
        x = self.relu1(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = self.relu2(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


def train(model, quantizer, device, train_loader, optimizer):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('{:2.0f}% Loss {}'.format(100 * batch_idx / len(train_loader), loss.item()))

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)

    print('Loss: {} Accuracy: {}%\n'.format(
        test_loss, 100 * correct / len(test_loader.dataset)))

def main():
    torch.manual_seed(0)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=True, download=True, transform=trans),
        batch_size=64, shuffle=True)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=False, transform=trans),
        batch_size=1000, shuffle=True)

    model = Mnist()
    model = model.to(device)
    configure_list = [{
        'quant_types': ['weight'],
        'quant_bits': {
            'weight': 8,
        },  # a plain `int` also works here when all `quant_types` share the same bit width
        'op_types': ['Conv2d', 'Linear']
    }]
    quantizer = DoReFaQuantizer(model, configure_list)
    quantizer.compress()

    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.5)
    for epoch in range(10):
        print('# Epoch {} #'.format(epoch))
        train(model, quantizer, device, train_loader, optimizer)
        test(model, device, test_loader)


if __name__ == '__main__':
    main()
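Per the inline comment in `configure_list` above, the per-type dict can apparently be collapsed to a plain `int` when every entry in `quant_types` uses the same bit width; a minimal variant of the same config (based on that comment, not independently verified against this NNI release):

# Hypothetical shorthand based on the inline comment above:
# a single int applies to every type listed in 'quant_types'.
configure_list = [{
    'quant_types': ['weight'],
    'quant_bits': 8,
    'op_types': ['Conv2d', 'Linear']
}]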
2 changes: 1 addition & 1 deletion examples/model_compress/L1_torch_cifar10.py
@@ -41,7 +41,7 @@ def test(model, device, test_loader):

def main():
    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('./data.cifar10', train=True, download=True,
                         transform=transforms.Compose([
8 changes: 5 additions & 3 deletions examples/model_compress/MeanActivation_torch_cifar10.py
@@ -1,4 +1,5 @@
import math
+import os
import argparse
import torch
import torch.nn as nn
@@ -48,7 +49,7 @@ def main():

    args = parser.parse_args()
    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('./data.cifar10', train=True, download=True,
                         transform=transforms.Compose([
@@ -79,10 +80,11 @@ def main():
        test(model, device, test_loader)
        lr_scheduler.step(epoch)
        torch.save(model.state_dict(), 'vgg16_cifar10.pth')
+
    else:
+        assert os.path.isfile('vgg16_cifar10.pth'), "can not find checkpoint 'vgg16_cifar10.pth'"
        model.load_state_dict(torch.load('vgg16_cifar10.pth'))
-    # Test base model accuracy
+    print('=' * 10 + 'Test on the original model' + '=' * 10)
-    model.load_state_dict(torch.load('vgg16_cifar10.pth'))
    test(model, device, test_loader)
    # top1 = 93.51%

4 changes: 2 additions & 2 deletions examples/model_compress/QAT_torch_quantizer.py
@@ -56,7 +56,7 @@ def test(model, device, test_loader):

def main():
    torch.manual_seed(0)
-    device = torch.device('cpu')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
    train_loader = torch.utils.data.DataLoader(
@@ -67,7 +67,6 @@ def main():
        batch_size=1000, shuffle=True)

    model = Mnist()
-
    '''you can change this to DoReFaQuantizer to implement it
    DoReFaQuantizer(configure_list).compress(model)
    '''
@@ -86,6 +85,7 @@ def main():
    quantizer = QAT_Quantizer(model, configure_list)
    quantizer.compress()

+    model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
    for epoch in range(10):
        print('# Epoch {} #'.format(epoch))
5 changes: 3 additions & 2 deletions examples/model_compress/fpgm_torch_mnist.py
@@ -72,7 +72,7 @@ def test(model, device, test_loader):

def main():
    torch.manual_seed(0)
-    device = torch.device('cpu')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
    train_loader = torch.utils.data.DataLoader(
@@ -83,6 +83,7 @@ def main():
        batch_size=1000, shuffle=True)

    model = Mnist()
+    model.to(device)
    model.print_conv_filter_sparsity()

    configure_list = [{
@@ -92,7 +93,7 @@ def main():

    pruner = FPGMPruner(model, configure_list)
    pruner.compress()
-
+    model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
    for epoch in range(10):
        pruner.update_epoch(epoch)
2 changes: 1 addition & 1 deletion examples/model_compress/main_torch_pruner.py
@@ -55,7 +55,7 @@ def test(model, device, test_loader):

def main():
    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
    train_loader = torch.utils.data.DataLoader(
2 changes: 1 addition & 1 deletion examples/model_compress/pruning_kd.py
@@ -49,7 +49,7 @@ def test(model, device, test_loader):

def main():
    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('./data.cifar10', train=True, download=True,
                         transform=transforms.Compose([
8 changes: 5 additions & 3 deletions examples/model_compress/slim_torch_cifar10.py
@@ -1,4 +1,5 @@
import math
+import os
import argparse
import torch
import torch.nn as nn
@@ -57,7 +58,7 @@ def main():
    args = parser.parse_args()

    torch.manual_seed(0)
-    device = torch.device('cuda')
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('./data.cifar10', train=True, download=True,
                         transform=transforms.Compose([
@@ -90,10 +91,11 @@ def main():
        train(model, device, train_loader, optimizer, True)
        test(model, device, test_loader)
        torch.save(model.state_dict(), 'vgg19_cifar10.pth')
+
    else:
+        assert os.path.isfile('vgg19_cifar10.pth'), "can not find checkpoint 'vgg19_cifar10.pth'"
        model.load_state_dict(torch.load('vgg19_cifar10.pth'))
-    # Test base model accuracy
+    print('=' * 10 + 'Test the original model' + '=' * 10)
-    model.load_state_dict(torch.load('vgg19_cifar10.pth'))
    test(model, device, test_loader)
    # top1 = 93.60%

