Conversation
suiguoxin
commented
Jun 21, 2019
- add a pure GP tuner with Matern kernel
- add related docs
…rs; fix link err in HpoComparision.md
…thedocs and github docs
@@ -19,7 +19,7 @@ Currently we support the following algorithms:
|[__Network Morphism__](#NetworkMorphism)|Network Morphism provides functions to automatically search for the architecture of deep learning models. Every child network inherits the knowledge from its parent network and morphs into diverse types of networks, including changes in depth, width, and skip-connections. Next, it estimates the value of a child network using historic architecture and metric pairs. Then it selects the most promising one to train. [Reference Paper](https://arxiv.org/abs/1806.10282)|
|[__Metis Tuner__](#MetisTuner)|Metis offers the following benefits when it comes to tuning parameters: while most tools only predict the optimal configuration, Metis gives you two outputs: (a) the current prediction of the optimal configuration, and (b) a suggestion for the next trial. No more guesswork. While most tools assume training datasets do not contain noisy data, Metis actually tells you if you need to re-sample a particular hyper-parameter. [Reference Paper](https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/)|
|[__BOHB__](#BOHB)|BOHB is a follow-up work of Hyperband. It addresses Hyperband's weakness that new configurations are generated randomly without leveraging finished trials. In the name BOHB, HB means Hyperband and BO means Bayesian Optimization. BOHB leverages finished trials by building multiple TPE models; a proportion of new configurations are generated through these models. [Reference Paper](https://arxiv.org/abs/1807.01774)|
Please book a meeting to review your code.
Also, could you give some experiment results in HPO.md, so that we could compare with other tuners?
* **optimize_mode** (*'maximize' or 'minimize', optional, default = 'maximize'*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
* **utility** (*'ei', 'ucb' or 'poi', optional, default = 'ei'*) - The kind of utility function. 'ei', 'ucb' and 'poi' correspond to 'Expected Improvement', 'Upper Confidence Bound' and 'Probability of Improvement' respectively.
* **kappa** (*float, optional, default = 5*) - Used by the 'ucb' utility function. The larger `kappa` is, the more exploratory the tuner will be.
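To illustrate what the `kappa` classArg controls, here is a minimal sketch of the 'ucb' utility (this mirrors the standard Upper Confidence Bound formula, not necessarily the tuner's exact code; the candidate values are made up):

```python
import numpy as np

def ucb(mean, std, kappa):
    # Upper Confidence Bound: exploitation (mean) plus kappa-weighted exploration (std).
    return mean + kappa * std

# Two candidate points: one with a high predicted metric but low uncertainty,
# one with a lower prediction but high uncertainty.
mean = np.array([0.9, 0.5])
std = np.array([0.01, 0.30])

print(np.argmax(ucb(mean, std, kappa=1.0)))  # small kappa exploits: picks index 0
print(np.argmax(ucb(mean, std, kappa=5.0)))  # large kappa explores: picks index 1
```

With the default `kappa = 5` the tuner leans toward exploration; lowering it makes the search greedier.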
I think some of them are optional classArgs (if they have default values)...
All of them are optional. Normally there is no need to change them, except for "optimize_mode".
docs/en_US/BuiltinTuner.md
Outdated
> Builtin Tuner Name: **GPTuner**

Note that the only acceptable types of search space are `choice`, `quniform`, `uniform` and `randint`.
Why can't other types be supported?
`loguniform` and `qloguniform` added. Types like `normal` are not supported here since a bounded range is needed.
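For illustration, a search space restricted to the bounded types the tuner accepts could look like the following (the parameter names and ranges here are made up; this follows NNI's `_type`/`_value` search-space convention):

```python
import json

# Hypothetical search space using only bounded types; "normal" is excluded
# because it has no finite bounds for the GP to work with.
search_space = {
    "optimizer":     {"_type": "choice",     "_value": ["SGD", "Adam"]},
    "learning_rate": {"_type": "loguniform", "_value": [1e-4, 1e-1]},
    "dropout_rate":  {"_type": "uniform",    "_value": [0.1, 0.5]},
    "batch_size":    {"_type": "quniform",   "_value": [16, 128, 16]},
    "num_layers":    {"_type": "randint",    "_value": [2, 8]},
}

# This is the shape NNI expects in search_space.json.
print(json.dumps(search_space, indent=2))
```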
docs/en_US/BuiltinTuner.md
Outdated
**Suggested scenario**

GP Tuner uses a proxy optimization problem (finding the maximum of the acquisition function) that, albeit still a hard problem, is computationally cheaper and can be tackled with common tools. Therefore GP Tuner is best suited for situations where the function to be optimized is very expensive to sample. GP Tuner has a computational cost that grows as *O(N^3)* due to the requirement of inverting the Gram matrix. [Detailed Description](./GPTuner.md)
"Therefore GP Tuner is most adequate for situations where sampling the function to be optimized is a very expensive endeavor", can you explain more?
Explanation added.
Bayesian optimization works by constructing a posterior distribution of functions (a Gaussian Process here) that best describes the function you want to optimize. As the number of observations grows, the posterior distribution improves, and the algorithm becomes more certain about which regions in parameter space are worth exploring and which are not.

GP Tuner is designed to minimize/maximize the number of steps required to find a combination of parameters close to the optimal combination. To do so, this method uses a proxy optimization problem (finding the maximum of the acquisition function) that, albeit still a hard problem, is computationally cheaper and can be tackled with common tools. Therefore Bayesian Optimization is best suited for situations where the function to be optimized is very expensive to sample.
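The "proxy optimization" idea above can be sketched as follows: instead of sampling the expensive objective, the next trial is chosen by maximizing a cheap acquisition function over many candidate points. The surrogate functions below are toy stand-ins for the GP's predictions, not the tuner's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy surrogates: stand-ins for what gp.predict(x, return_std=True) would return.
mean_fn = lambda x: -(x - 0.3) ** 2        # predicted objective
std_fn = lambda x: 0.1 * np.abs(x - 0.5)   # predicted uncertainty

def acquisition(x, kappa=2.0):
    # UCB-style acquisition: cheap to evaluate everywhere.
    return mean_fn(x) + kappa * std_fn(x)

# Evaluating the acquisition on 1000 random candidates costs almost nothing,
# whereas 1000 evaluations of the real objective (e.g. full training runs) would not.
candidates = rng.uniform(0.0, 1.0, size=1000)
next_trial = candidates[np.argmax(acquisition(candidates))]
print(float(next_trial))
```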
Any reference paper?
paper link added
src/nni_manager/yarn.lock
Outdated
dependencies:
  tsutils "^2.12.1"
  tsutils "^2.27.2 <2.29.0"
Why was this changed?
I've undone the commit of this file and added it to .gitignore.
src/webui/yarn.lock
Outdated
@@ -1898,6 +1898,13 @@ copy-descriptor@^0.1.0:
  version "0.1.1"
  resolved "https://registry.yarnpkg.com/copy-descriptor/-/copy-descriptor-0.1.1.tgz#676f6eb3c39997c2ee1ac3a924fd6124748f578d"
Please add at least one config-test for a new tuner.
Please discard the changes to yarn.lock if no specific dependencies are added.
config-test added
warnings.simplefilter("ignore")
mean, std = gp.predict(x, return_std=True)

z = (mean - y_max - xi)/std
Is it possible that std == 0.0 here?
I tried predicting a known configuration with the GP regressor; the std is not zero.
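Even so, a defensive guard is cheap. Here is a sketch of the EI computation quoted above with the std clamped away from zero; it follows the standard Expected Improvement formula and is not necessarily the merged code:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mean, std, y_max, xi=0.0):
    # Guard against std == 0.0 before dividing; a tiny floor changes nothing
    # when std is genuinely positive.
    std = np.maximum(std, 1e-9)
    z = (mean - y_max - xi) / std
    return (mean - y_max - xi) * norm.cdf(z) + std * norm.pdf(z)

# Near-zero EI where the model is certain and below the incumbent y_max;
# positive EI where the predicted mean beats it.
print(expected_improvement(np.array([0.1]), np.array([0.0]), y_max=0.5))
print(expected_improvement(np.array([0.9]), np.array([0.1]), y_max=0.5))
```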
docs/en_US/BuiltinTuner.md
Outdated
**Requirement of classArg**

* **optimize_mode** (*'maximize' or 'minimize', optional, default = 'maximize'*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
* **utility** (*'ei', 'ucb' or 'poi', optional, default = 'ei'*) - The kind of utility function. 'ei', 'ucb' and 'poi' correspond to 'Expected Improvement', 'Upper Confidence Bound' and 'Probability of Improvement' respectively.
How should one choose among these options?
There is no fixed rule for choosing the utility function, since the black-box function to be optimized varies. Normally 'ei' is a good choice that balances exploration and exploitation well. I think it is interesting to expose these choices to users who are interested in the tuning algorithm.
pylintrc
Outdated
@@ -15,7 +15,8 @@ max-attributes=15
const-naming-style=any

disable=duplicate-code,
super-init-not-called
super-init-not-called,
cell-var-from-loop
any particular reason to add this rule?
Just to silence the pylint warning "cell variable defined in loop". The warning is not reasonable here. The code that triggers it is in sdk/pynni/nni/gp_tuner/util.py, line 41.
If you really want to get rid of this warning, please add an inline comment to disable it, not here in the pylintrc.
src/nni_manager/yarn.lock
Outdated
@@ -161,9 +161,10 @@
  version "10.5.2"
@chicm-ms please take a look. It seems we should not update this file in this commit.
@suiguoxin please drop the updates to this file.
I've undone the commit of this file and added it to .gitignore.
@@ -98,6 +98,7 @@ The total search space is 1,204,224; we set the maximum number of trials to 1000.
| HyperBand |0.414065|0.415222|0.417628|
| HyperBand |0.416807|0.417549|0.418828|
| HyperBand |0.415550|0.415977|0.417186|
| GP |0.414353|0.418563|0.420263|
It seems there are many hyper-parameters in the GP tuner; please consider adding experiments for them.
Also, please put the results of three runs here.
A config with the full set of hyper-parameters has been added to the test-config.
Results from two more runs have been added.
src/nni_manager/yarn.lock
Outdated
@@ -161,9 +161,10 @@
  version "10.5.2"
@suiguoxin please drop the updates to this file.
return mean + kappa * std

@staticmethod
def _ei(x, gp, y_max, xi):
Is there any other code duplication for calculating EI? After all, there are other tuners that rely on the calculation of EI.
src/webui/yarn.lock
Outdated
@@ -1898,6 +1898,13 @@ copy-descriptor@^0.1.0:
  version "0.1.1"
  resolved "https://registry.yarnpkg.com/copy-descriptor/-/copy-descriptor-0.1.1.tgz#676f6eb3c39997c2ee1ac3a924fd6124748f578d"
Please discard the changes to yarn.lock if no specific dependencies are added.
pylintrc
Outdated
@@ -15,7 +15,8 @@ max-attributes=15
const-naming-style=any

disable=duplicate-code,
super-init-not-called
super-init-not-called,
cell-var-from-loop
If you really want to get rid of this warning, please add an inline comment to disable it, not here in the pylintrc.
if _type == "choice":
    # Find the closest value in the array bound['_value']
    vals_new.append(
        min(bound['_value'], key=lambda x: abs(x - vals[i])))
Actually, this is a potential bug. Similar issue: https://stackoverflow.com/questions/25314547/cell-var-from-loop-warning-from-pylint. Please consider fixing it or disabling the warning with an inline comment.
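The trap the warning points at only bites when the lambda's call is deferred past its loop iteration. A sketch with made-up values, mirroring the shape of the snippet above:

```python
vals = [0.2, 7.6, 3.1]
bound_values = [1, 5, 10]  # stands in for bound['_value']

# Problematic pattern: each lambda closes over the loop variable i itself, so
# if called after the loop finishes, all of them see the final i (== 2).
deferred = [lambda x: abs(x - vals[i]) for i in range(len(vals))]
print([min(bound_values, key=f) for f in deferred])  # -> [5, 5, 5]

# Fix: bind the current value eagerly via a default argument.
deferred = [lambda x, v=vals[i]: abs(x - v) for i in range(len(vals))]
print([min(bound_values, key=f) for f in deferred])  # -> [1, 10, 5]
```

In the original snippet the `min` call happens inside the same iteration, so the behavior is correct as written; an inline `# pylint: disable=cell-var-from-loop` documents that, while the default-argument form silences the warning and removes the trap entirely.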
src/nni_manager/yarn.lock
Outdated
@@ -2948,4 +2948,4 @@ yargs@11.1.0:

yn@^2.0.0:
  version "2.0.0"
  resolved "https://registry.yarnpkg.com/yn/-/yn-2.0.0.tgz#e5adabc8acf408f6385fc76495684c88e6af689a"
  resolved "https://registry.yarnpkg.com/yn/-/yn-2.0.0.tgz#e5adabc8acf408f6385fc76495684c88e6af689a"
Could you remove this change to yarn.lock?
.gitignore
Outdated
@@ -68,4 +68,4 @@ __pycache__
build
*.egg-info

.vscode
.vscode
Could you please remove this change from this PR?
src/webui/yarn.lock
Outdated
@@ -8739,4 +8739,4 @@ yargs@~3.10.0:

zrender@4.0.4:
  version "4.0.4"
  resolved "https://registry.yarnpkg.com/zrender/-/zrender-4.0.4.tgz#910e60d888f00c9599073f23758dd23345fe48fd"
  resolved "https://registry.yarnpkg.com/zrender/-/zrender-4.0.4.tgz#910e60d888f00c9599073f23758dd23345fe48fd"
Could you please remove this change?