
Commit

Merge pull request #37 from microsoft/master
pull code
chicm-ms authored Oct 23, 2019
2 parents 5a0e9c9 + bc80058 commit e7df061
Showing 58 changed files with 1,086 additions and 584 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -18,7 +18,7 @@ NNI (Neural Network Intelligence) is a toolkit to help users run automated machi
The tool dispatches and runs trial jobs generated by tuning algorithms to search for the best neural architecture and/or hyper-parameters in different environments like local machines, remote servers, and the cloud.


### **NNI [v1.0](https://github.com/Microsoft/nni/blob/master/docs/en_US/Release_v1.0.md) has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
### **NNI v1.1 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**

<p align="center">
<a href="#nni-has-been-released"><img src="docs/img/overview.svg" /></a>
@@ -211,7 +211,7 @@ Linux and MacOS
* Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.

```bash
git clone -b v1.0 https://github.com/Microsoft/nni.git
git clone -b v1.1 https://github.com/Microsoft/nni.git
cd nni
source install.sh
```
@@ -221,7 +221,7 @@ Windows
* Run the following commands in an environment that has `python >=3.5`, `git` and `PowerShell`

```bash
git clone -b v1.0 https://github.com/Microsoft/nni.git
git clone -b v1.1 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```
@@ -232,12 +232,12 @@ For NNI on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWi

**Verify install**

The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow installed** before running it.
The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow 1.x installed** before running it. Note that **TensorFlow 2.0 is currently NOT supported**.

* Download the examples by cloning the source code.

```bash
git clone -b v1.0 https://github.com/Microsoft/nni.git
git clone -b v1.1 https://github.com/Microsoft/nni.git
```

Linux and MacOS
2 changes: 1 addition & 1 deletion deployment/pypi/setup.py
@@ -73,7 +73,7 @@
'requests',
'astor',
'PythonWebHDFS',
'hyperopt',
'hyperopt==0.1.2',
'json_tricks',
'numpy',
'scipy',
2 changes: 1 addition & 1 deletion docs/en_US/AdvancedFeature/AdvancedNas.md
@@ -63,7 +63,7 @@ sudo mount -t nfs 10.10.10.10:/tmp/nni/shared /mnt/nfs/nni
```
where `10.10.10.10` should be replaced by the real IP of NFS server machine in practice.

## Asynchornous Dispatcher Mode for trial dependency control
## Asynchronous Dispatcher Mode for trial dependency control
The weight sharing feature enables trials from different machines, for which **read after write** consistency must usually be assured: the child model should not load the parent model before the parent trial finishes training. To deal with this, users can enable **asynchronous dispatcher mode** with `multiThread: true` in NNI's `config.yml`, where the dispatcher assigns a tuner thread each time a `NEW_TRIAL` request comes in, and the tuner thread can decide when to submit a new trial by blocking and unblocking itself. For example:
```python
def generate_parameters(self, parameter_id):
    ...
```
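
To make the blocking/unblocking pattern concrete, here is a minimal sketch (not part of this commit) of a customized tuner that only releases a child trial after its parent has reported a result. The `WeightSharingTuner` class, the single `threading.Event`, and the returned parameters are all hypothetical simplifications; check the NNI customized-tuner documentation for the exact base-class signatures.

```python
import threading

from nni.tuner import Tuner  # customized-tuner base class (assumed import path)

class WeightSharingTuner(Tuner):
    """Hypothetical sketch: under asynchronous dispatcher mode (multiThread: true),
    each generate_parameters call runs in its own tuner thread."""

    def __init__(self):
        super().__init__()
        self._parent_done = threading.Event()

    def update_search_space(self, search_space):
        self._search_space = search_space

    def generate_parameters(self, parameter_id):
        if parameter_id > 0:
            # block this tuner thread until the parent trial has finished
            self._parent_done.wait()
        return {'parent_id': parameter_id - 1}  # illustrative parameters only

    def receive_trial_result(self, parameter_id, parameters, value):
        # the parent trial reported its result; unblock waiting child trials
        self._parent_done.set()
```
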
117 changes: 116 additions & 1 deletion docs/en_US/Compressor/AutoCompression.md
@@ -1,3 +1,118 @@
# Automatic Model Compression on NNI

TBD.
It is convenient to implement automatic model compression by combining NNI's compression toolkit with NNI tuners.

## First, model compression with NNI

You can easily compress a model with NNI compression. Taking pruning as an example, you can prune a pretrained model with LevelPruner like this:

```python
from nni.compression.torch import LevelPruner

# prune all weighted layers to 80% sparsity
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = LevelPruner(config_list)
pruner(model)  # wrap the model so that pruning masks are applied
```

`{ 'sparsity': 0.8, 'op_types': 'default' }` means that **all weighted layers will be compressed with the same 0.8 sparsity**. When `pruner(model)` is called, the model is compressed with masks; after that you can fine-tune it normally, and the **pruned (masked) weights won't be updated**.
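
As a rough sketch (not part of this diff) of the fine-tuning step, assuming a standard PyTorch setup in which `train_loader`, `optimizer`, and `criterion` already exist:

```python
# Fine-tune the compressed model as usual; weights that have been masked
# by the pruner stay pruned during these updates.
model.train()
for epoch in range(5):
    for data, target in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(data), target)
        loss.backward()
        optimizer.step()
```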

## Then, make this automatic

The previous example manually chose LevelPruner and pruned all layers with the same sparsity. This is obviously sub-optimal, because different layers may have different amounts of redundancy. Layer sparsity should be carefully tuned to minimize model performance degradation, and this can be done with NNI tuners.

The first thing we need to do is design a search space. Here we use a nested search space that both chooses the pruning algorithm and optimizes each layer's sparsity.

```json
{
    "prune_method": {
        "_type": "choice",
        "_value": [
            {
                "_name": "agp",
                "conv0_sparsity": {
                    "_type": "uniform",
                    "_value": [
                        0.1,
                        0.9
                    ]
                },
                "conv1_sparsity": {
                    "_type": "uniform",
                    "_value": [
                        0.1,
                        0.9
                    ]
                }
            },
            {
                "_name": "level",
                "conv0_sparsity": {
                    "_type": "uniform",
                    "_value": [
                        0.1,
                        0.9
                    ]
                },
                "conv1_sparsity": {
                    "_type": "uniform",
                    "_value": [
                        0.01,
                        0.9
                    ]
                }
            }
        ]
    }
}
```

Then we need to modify a few lines of our trial code:

```python
import nni
from nni.compression.torch import *

params = nni.get_parameters()
conv0_sparsity = params['prune_method']['conv0_sparsity']
conv1_sparsity = params['prune_method']['conv1_sparsity']
# these raw sparsities should be scaled if you need the total sparsity constrained
config_list_level = [{ 'sparsity': conv0_sparsity, 'op_name': 'conv0' },
                     { 'sparsity': conv1_sparsity, 'op_name': 'conv1' }]
config_list_agp = [{ 'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1, 'op_name': 'conv0' },
                   { 'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1, 'op_name': 'conv1' }]
PRUNERS = {'level': LevelPruner(config_list_level), 'agp': AGP_Pruner(config_list_agp)}
pruner = PRUNERS[params['prune_method']['_name']]  # look up the chosen pruning algorithm
pruner(model)
...  # fine tuning
acc = evaluate(model)  # evaluation
nni.report_final_result(acc)
```

Last, define our experiment configuration so that NNI automatically tunes the pruning method together with the layer sparsities:

```yaml
authorName: default
experimentName: Auto_Compression
trialConcurrency: 2
maxExecDuration: 100h
maxTrialNum: 500
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: False
searchSpacePath: search_space.json
tuner:
  #choice: TPE, Random, Anneal...
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: bash run_prune.sh
  codeDir: .
  gpuNum: 1

```
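
With `search_space.json` and the trial code in place, the experiment can then be started in the usual way, e.g. with `nnictl create --config config.yml` (assuming the YAML above is saved as `config.yml`); NNI will then tune the pruning method together with the per-layer sparsities.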

12 changes: 7 additions & 5 deletions docs/en_US/Compressor/Overview.md
@@ -1,14 +1,16 @@
# Compressor

We are glad to announce the alpha release of the model compression toolkit on top of NNI. It is still in the experimental phase and may evolve based on usage feedback. We'd like to invite you to use it, give feedback, and even contribute.

NNI provides an easy-to-use toolkit to help users design and use compression algorithms. It supports TensorFlow and PyTorch with a unified interface. To compress their models, users only need to add several lines to their code. Some popular model compression algorithms are built into NNI. Users can further use NNI's auto-tuning power to find the best compressed model, which is detailed in [Auto Model Compression](./AutoCompression.md). On the other hand, users can easily customize their own new compression algorithms using NNI's interface; refer to the tutorial [here](#customize-new-compression-algorithms).

## Supported algorithms
We have provided two naive compression algorithms and four popular ones for users, including three pruning algorithms and three quantization algorithms:
We have provided two naive compression algorithms and three popular ones for users, including two pruning algorithms and three quantization algorithms:

|Name|Brief Introduction of Algorithm|
|---|---|
| [Level Pruner](./Pruner.md#level-pruner) | Pruning the specified ratio on each weight based on absolute values of weights |
| [AGP Pruner](./Pruner.md#agp-pruner) | To prune, or not to prune: exploring the efficacy of pruning for model compression. [Reference Paper](https://arxiv.org/abs/1710.01878)|
| [Sensitivity Pruner](./Pruner.md#sensitivity-pruner) | Learning both Weights and Connections for Efficient Neural Networks. [Reference Paper](https://arxiv.org/abs/1506.02626)|
| [AGP Pruner](./Pruner.md#agp-pruner) | Automated gradual pruning (To prune, or not to prune: exploring the efficacy of pruning for model compression) [Reference Paper](https://arxiv.org/abs/1710.01878)|
| [Naive Quantizer](./Quantizer.md#naive-quantizer) | Quantize weights to default 8 bits |
| [QAT Quantizer](./Quantizer.md#qat-quantizer) | Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. [Reference Paper](http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf)|
| [DoReFa Quantizer](./Quantizer.md#dorefa-quantizer) | DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. [Reference Paper](https://arxiv.org/abs/1606.06160)|
@@ -72,7 +74,7 @@ It means following the algorithm's default setting for compressed operations wit

### Other APIs

Some compression algorithms use epochs to control the progress of compression, and some algorithms need to do something after every minibatch. Therefore, we provide another two APIs for users to invoke. One is `update_epoch`, you can use it as follows:
Some compression algorithms use epochs to control the progress of compression (e.g. [AGP](./Pruner.md#agp-pruner)), and some algorithms need to do something after every minibatch. Therefore, we provide another two APIs for users to invoke. One is `update_epoch`, you can use it as follows:

Tensorflow code
```python
...
```

@@ -138,7 +140,7 @@ Some algorithms may want global information for generating masks, for example, a

The interface for customizing a quantization algorithm is similar to that of pruning algorithms. The only difference is that `calc_mask` is replaced with `quantize_weight`. `quantize_weight` directly returns the quantized weights rather than a mask, because for quantization the quantized weights cannot be obtained by applying a mask.

```
```python
# This is writing a Quantizer in tensorflow.
# For writing a Quantizer in PyTorch, you can simply replace
# nni.compression.tensorflow.Quantizer with
# ...
```
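
For illustration only (not part of this commit), a custom PyTorch quantizer following the pattern described above might look roughly like the sketch below; the `RoundingQuantizer` class is hypothetical and the exact `quantize_weight` signature can differ between NNI versions.

```python
import torch

from nni.compression.torch import Quantizer  # assumed base-class import path

class RoundingQuantizer(Quantizer):
    """Hypothetical example: symmetric 8-bit rounding of weights."""

    def quantize_weight(self, weight, config, **kwargs):
        # scale weights into the int8 range, round, and scale back;
        # return the quantized weights directly (no mask is involved)
        scale = weight.abs().max().clamp(min=1e-8) / 127
        return torch.round(weight / scale) * scale
```
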
48 changes: 5 additions & 43 deletions docs/en_US/Compressor/Pruner.md
@@ -38,7 +38,7 @@ In [To prune, or not to prune: exploring the efficacy of pruning for model compr
>The binary weight masks are updated every ∆t steps as the network is trained to gradually increase the sparsity of the network while allowing the network training steps to recover from any pruning-induced loss in accuracy. In our experience, varying the pruning frequency ∆t between 100 and 1000 training steps had a negligible impact on the final model quality. Once the model achieves the target sparsity sf , the weight masks are no longer updated. The intuition behind this sparsity function in equation
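
For reference, the gradual-sparsity schedule that this quote refers to (the equation image is not reproduced in this diff) is

$$ s_t = s_f + (s_i - s_f)\left(1 - \frac{t - t_0}{n\Delta t}\right)^3, \qquad t \in \{t_0,\ t_0 + \Delta t,\ \dots,\ t_0 + n\Delta t\} $$

where $s_i$ and $s_f$ correspond to `initial_sparsity` and `final_sparsity` in the configuration shown under Usage.
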
### Usage
You can prune all weight from %0 to 80% sparsity in 10 epoch with the code below.
You can prune all weights from 0% to 80% sparsity within 10 epochs with the code below.

First, you should import the pruner and add a mask to the model.

@@ -48,7 +48,7 @@ from nni.compression.tensorflow import AGP_Pruner
config_list = [{
'initial_sparsity': 0,
'final_sparsity': 0.8,
'start_epoch': 1,
'start_epoch': 0,
'end_epoch': 10,
'frequency': 1,
'op_types': 'default'
@@ -62,7 +62,7 @@ from nni.compression.torch import AGP_Pruner
config_list = [{
'initial_sparsity': 0,
'final_sparsity': 0.8,
'start_epoch': 1,
'start_epoch': 0,
'end_epoch': 10,
'frequency': 1,
'op_types': 'default'
@@ -86,47 +86,9 @@ You can view example for more information
#### User configuration for AGP Pruner
* **initial_sparsity:** This specifies the sparsity when the compressor starts compressing
* **final_sparsity:** This specifies the sparsity when the compressor finishes compressing
* **start_epoch:** This specifies the epoch at which the compressor starts compressing
* **start_epoch:** This specifies the epoch at which the compressor starts compressing (0 by default)
* **end_epoch:** This specifies the epoch at which the compressor finishes compressing
* **frequency:** This specifies that the compressor compresses once every *frequency* epochs
* **frequency:** This specifies that the compressor compresses once every *frequency* epochs (default: 1)

***

## Sensitivity Pruner
In [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626), Song Han et al. provide an algorithm to find the sensitivity of each layer and set a pruning threshold for each layer.

>We used the sensitivity results to find each layer’s threshold: for example, the smallest threshold was applied to the most sensitive layer, which is the first convolutional layer... The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer’s weights
### Usage
You can prune weight step by step and reach one target sparsity by Sensitivity Pruner with the code below.

Tensorflow code
```python
from nni.compression.tensorflow import SensitivityPruner
config_list = [{ 'sparsity':0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(tf.get_default_graph())
```
PyTorch code
```python
from nni.compression.torch import SensitivityPruner
config_list = [{ 'sparsity':0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(model)
```
Like AGP Pruner, you should update mask information every epoch by adding code below

Tensorflow code
```python
pruner.update_epoch(epoch, sess)
```
PyTorch code
```python
pruner.update_epoch(epoch)
```
You can view example for more information

#### User configuration for Sensitivity Pruner
* **sparsity:** This is to specify the sparsity operations to be compressed to

***
