
Parameter Optimization #433

Merged — 11 commits merged from the ae branch into master on May 21, 2019
Conversation

ctrl-z-9000-times
Collaborator

This program finds good parameters for an HTM system.

I wrote this program a while ago, and now I'd like to contribute it.

This is a large addition, and it is not ready yet. I do not recommend reviewing the code; instead, just read the documentation at the top of the file.

This program finds good parameters for an HTM system.
@breznak
Member

breznak commented May 8, 2019

This program finds good parameters for an HTM system.

hey, this is a huge and much welcome contribution! 💯
It's needed for everyday practical use cases, #391, and Hotgym, where our TM has insufficient representation (I don't know if there's an issue for that).

Sorry, I'll get to the review later, even if just to read the doc.

Some high-level thoughts about parameter optimization/tuning:

  • I don't know if this should be in this repo, or rather a stand-alone nupic.swarming?
    • that way, other projects can also benefit
    • we'd still ensure OUR API is supported.
  • can we properly transform/describe our model parameters and leave the optimization to 3rd-party tools (scikit-learn, ...)?
  • see Numenta Anomaly Benchmark #205 for my comment about OPF support, which does the same thing.
    • I guess your code has a cleaner implementation
    • but OPF has a popular (?) API which we said we'd support?
  • if the code is working, I'd like to merge it ASAP, with potential future rework/obsoletion.

py/src/nupic/optimization/ae.py
ExperimentModule.main(parameters=default_parameters, argv=None, verbose=True)
Returns (float) performance of parameters, to be maximized.

Usage: $ ae.py [ae-arguments] ExperimentModule.py [experiment-arguments]
Member

can you provide a runnable demo example? That would be worth 1000 words for first-timers wishing to use the code.
MNIST would be an ideal candidate and you've already optimized that code for me once, so you should have the "experiment" files ready.
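For illustration, a skeleton ExperimentModule conforming to the interface quoted above might look like the sketch below. The module name, parameters, and scoring are hypothetical stand-ins, not the requested MNIST demo.

# my_experiment.py -- hypothetical ExperimentModule for ae.py.
default_parameters = {
    'sparsity': 0.02,
    'boostStrength': 3.0,
}

def main(parameters=default_parameters, argv=None, verbose=True):
    # Build and run the HTM experiment here; this stand-in just
    # scores how close 'sparsity' is to an arbitrary target.
    score = 1.0 - abs(parameters['sparsity'] - 0.02)
    if verbose:
        print('score:', score)
    return score  # (float) performance of parameters, to be maximized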

return best.parameters


class GridSearch(object):
Member

do you need to write the hyper-parameter optimization code yourself? It is available in many toolkits, e.g.:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
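For reference, delegating the search to scikit-learn looks roughly like the sketch below. The estimator and grid are stand-ins: GridSearchCV assumes an estimator with scikit-learn's fit/score API, which is not the shape of the HTM experiments today.

from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
# Exhaustively try every combination in the grid with 3-fold CV.
param_grid = {'C': [0.1, 1.0, 10.0], 'gamma': ['scale', 'auto']}
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)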

import tempfile
import multiprocessing
import resource
import signal # Probably not X-platform ...
Member

would this be a problem? But hey, worst case, the feature will be available only on certain OSes

Collaborator Author

I looked this up: this use of signal can be replaced with threading.Timer, which is X-platform.
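A minimal sketch of that replacement, assuming the signal was acting as a watchdog on a child process; the command and timeout here are hypothetical:

import subprocess
import threading

def run_with_timeout(cmd, seconds):
    proc = subprocess.Popen(cmd)
    # threading.Timer works on every platform, unlike signal.alarm.
    timer = threading.Timer(seconds, proc.kill)
    timer.start()
    try:
        return proc.wait()
    finally:
        timer.cancel()  # no-op if the timer already fired

exit_code = run_with_timeout(['python', 'experiment.py'], seconds=60)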


from .nupic.optimization.parameter_set import ParameterSet

class ParticleSwarmOptimizations:
Member

I really do like the recent changes! 👏 Extracting the particular search mechanism away, and adding the new PSO.

Collaborator Author

extracting the particular search mechanism away

This turned out to be more problematic than I expected. Each search method needs to:

  1. Handle its own command line arguments
  2. Queue experiments
  3. Collect results, asynchronously from queuing them

Currently it can only queue experiments. I'm thinking about how to do all of this in a sane way that allows new methods to plug in. I might make a new interface class to help with this; a rough sketch of what that could look like follows.
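A hypothetical sketch of such a plug-in interface; none of these names exist in the PR, they only illustrate the three requirements listed above:

import abc
import argparse

class SearchMethod(abc.ABC):
    @abc.abstractmethod
    def add_arguments(self, parser: argparse.ArgumentParser) -> None:
        """Register this method's command line arguments."""

    @abc.abstractmethod
    def suggest_parameters(self) -> dict:
        """Propose the next parameter set to queue as an experiment."""

    @abc.abstractmethod
    def collect_result(self, parameters: dict, score: float) -> None:
        """Receive a finished experiment's score, possibly out of order."""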

Member

have you considered a 3rd-party tool for param optimization/model selection, and "just" writing an enabling interface for our params & constructors?

Collaborator Author

https://www.khronos.org/nnef

I don't think this framework is useful; it's all about sharing trained deep learning networks.

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

Scikit-learn appears very limited:

  • The only search method it has is grid search; it does not do swarming.
  • I also want the parameter optimization code to save all of the outputs of the program, which scikit-learn doesn't do.

Member

agreed on NNEF

Scikit learn appears very limited.

it was only the first hit I got from memory; these "python parameter optimization frameworks" look interesting:
https://hyperopt.github.io/hyperopt/
https://optuna.org/
I do like your solution, but in the long run I hope we'll have an interface to a 3rd-party project, as there are lots of details which are out of scope for us (same as Serialization, ...)
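For a sense of how small these frameworks' APIs are, an Optuna run is only a few lines; this sketch optimizes a toy objective, not an HTM model:

import optuna

def objective(trial):
    # Suggest a value in [-10, 10] and score it; Optuna's TPE sampler
    # picks each new trial based on the results of previous ones.
    x = trial.suggest_float('x', -10, 10)
    return (x - 2) ** 2

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)
print(study.best_params)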

Collaborator Author

have an interface to a 3rd party project, as there's lots of details which are out of scope for us

I understand, I will do some more research into existing solutions.

Collaborator Author

3rd party solutions

I found a great project called nni. It is a framework for optimizing artificial neural networks (the non-biological kind). I read through it and attempted (but ultimately failed) to run the MNIST example with it.

  • It's open source (MIT license), so we could easily incorporate it into our project.
  • It's run by Microsoft, and they appear to be actively developing it. They accept external PRs and might even fix issues we have with it.
  • It is well designed, and it's got some neat features too.

All that praise said, I don't think it's ready for mainstream use. It's rough around the edges; the project is about a year old. In a year this could be the de facto standard tool, but today I'd rather keep using my own tools, which I'm familiar with.

Member

I found a great project, it's called nni.

I like NNI! Although the whole AutoML thing seems overkill for us (we don't need network architecture "evolution", as we have HTM; we "just" need a way to do param optimization).

I read through it and attempted (but ultimately failed) to run the MNIST example with it.

ok, let's go with what you have now. I would suggest not "wasting" time developing too many features, as long term we'd like to switch over to an external framework to do the work for us.

The current version will be a great improvement, so let's 👍

PS: what failed when running their MNIST? Do you think it's not ready yet, or should I try it in some later PR?

Ideally, could you keep their (NNI or another suitable) design in mind and make the format for the "experiment description", or whatever you call it, similar or the same, so we can easily switch later?

Collaborator Author

PS: what failed when running their MNIST?

I tried running our MNIST in their framework.

the format for "experiment description"

This is highly dependent on the specifics of the optimization framework.

Collaborator Author

I will push the work I did for NNI in a branch so that you can play with it.

I think the problem I had is that NNI does not start by trying the default parameters. It spent a considerable amount of time trying parameters which obviously won't work. Swarming has an advantage over TPE (the search method NNI defaults to) in that it starts near the default parameter values, which are at least sane. The NNI framework has a way to end trials/experiments early if they obviously aren't working, but I didn't use it.
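A minimal sketch of the idea described above: initializing swarm particles by jittering the known-sane defaults rather than sampling the whole search space cold. The default values and jitter scale here are hypothetical:

import random

defaults = {'boostStrength': 7.8, 'synPermConnected': 0.42}

def init_particle(defaults, jitter=0.2):
    # Perturb each default by up to +/-20% so the first evaluations
    # start from parameters that are at least sane.
    return {k: v * (1.0 + random.uniform(-jitter, jitter))
            for k, v in defaults.items()}

swarm = [init_particle(defaults) for _ in range(10)]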

@dkeeney

dkeeney commented May 13, 2019

This is really cool. 🥇

@ctrl-z-9000-times
Collaborator Author

I don't know if this should be in this repo, or rather stand-alone nupic.swarming ?
that way, other projects can also benefit

While I think that's a good idea, I don't want to deal with the logistics of having two repos and a dependency. Maybe for now we can keep this here, and reconsider the location later when we also reconsider the alternatives?

@ctrl-z-9000-times
Collaborator Author

ctrl-z-9000-times commented May 17, 2019

I finished my review. The documentation still needs work before this can merge.
EDIT: I added some more documentation.
I added a lot of TODO notes for issues/bugs and feature requests.

@ctrl-z-9000-times ctrl-z-9000-times marked this pull request as ready for review May 18, 2019 05:28
The process pool that comes with the Python standard library does not always work very well, and does not give the caller enough control to fix problems with it.
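For context, a hypothetical minimal version of managing worker processes directly, which gives the caller the control (e.g. over a stuck worker) that the commit message says the stdlib pool lacks; this is not the actual ae.py implementation:

from multiprocessing import Process, Queue

def evaluate(params):
    return sum(params)  # stand-in for running an experiment

def worker(tasks, results):
    for params in iter(tasks.get, None):  # None is the shutdown signal
        results.put((params, evaluate(params)))

if __name__ == '__main__':
    tasks, results = Queue(), Queue()
    procs = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for p in procs:
        p.start()
    for job in ([1, 2], [3, 4]):
        tasks.put(job)
    print([results.get() for _ in range(2)])
    for p in procs:
        tasks.put(None)  # tell each worker to exit
    for p in procs:
        p.join()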
Member

@breznak breznak left a comment

This looks very good now! 👍
I'm looking forward to using this new feature.

One TODO: I'd like to see an example of how this is used on either hotgym or MNIST, if that is not already part of #480?

'synPermConnected': 0.422,
'synPermInactiveDec': 0.005
'boostStrength': 7.80643753517375,
'columnDimensions': (35415,),
Member

OT: I wonder how much the scores differ? It's said HTM is/should be more resistant to param changes. These seem like rather big differences.

Collaborator Author

IIRC accuracy improved from about 90% to 96%.

self.save() # Write the updated Lab Report to file.


class Worker(Process):
Member

nit: do you want to split some of these into separate classes?

@breznak breznak merged commit b5e6d95 into master May 21, 2019
@breznak breznak deleted the ae branch May 21, 2019 12:25
@ctrl-z-9000-times
Collaborator Author

Thanks for reviewing this!
