
Learnings from NNI #126

Open
nginyc opened this issue Jun 21, 2019 · 1 comment
Labels: question (Further information is requested)

nginyc commented Jun 21, 2019

As our project is similar to NNI by Microsoft, I thought it might be good to study how they're doing things, compare it with how we're doing things, and derive some learnings.

How model's hyperparameter search space is defined

  • NNI calls a set of knob values a "Configuration" and the knob config a "Search Space"
  • In NNI, the "Search Space" is defined as JSON in a separate file
  • In Rafiki, Knob Config is defined in Python with typed "Knob" classes as part of the model code
  • My opinion:
    • Rename "knob config" -> "knob space" for clarity?
    • More flexible & powerful to configure dynamically with Python
    • Edits to search space are simpler if written in Python alongside model code in the same file
    • Submitting a separate configuration file can be more troublesome
    • Using JSON is more convenient if the hyperparameter search space is to be tweaked often, independently of model code (both styles are sketched below)
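
To make the contrast concrete, here's a minimal side-by-side sketch. The NNI half mirrors the "_type"/"_value" layout of its search_space.json (written here as the equivalent Python dict); the Rafiki half assumes knob classes along the lines of FloatKnob / CategoricalKnob in rafiki.model, so treat the exact names and import path as illustrative:

    # NNI: "Search Space", normally a separate search_space.json file
    # (shown here as the equivalent Python dict)
    nni_search_space = {
        'lr': {'_type': 'loguniform', '_value': [1e-4, 1e-1]},
        'batch_size': {'_type': 'choice', '_value': [32, 64, 128]},
    }

    # Rafiki: "Knob Config", declared in Python alongside the model code
    # (knob class names and import path are illustrative)
    from rafiki.model import FloatKnob, CategoricalKnob

    def get_knob_config():
        return {
            'lr': FloatKnob(1e-4, 1e-1),
            'batch_size': CategoricalKnob([32, 64, 128]),
        }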

How model developers configure the AutoML algorithm

  • In NNI, model developers need to configure an "Experiment" in a YAML file
    • Configuration includes: which "Tuner" to use, configuration for that chosen tuner, max no. of trials, max training duration, no. of GPUs, and the platform to train on (e.g. local machine, Kubernetes)
    • A single model is trained for each experiment
  • In NNI, pointers to datasets are hard-coded in model code, and there's no concept of "task"
  • In Rafiki, application developers configure a train job by simply submitting a task, a budget, datasets and optionally model IDs in Python
    • Rafiki matches the task to a set of models, and trains these models concurrently
    • Rafiki manages provisioning of the training platform & GPUs
    • Rafiki automatically selects & configures which advisor to use based on the hyperparameter search space
  • Due to differences between designs of Rafiki and NNI:
    • In Rafiki, a non-expert application developer initiates training instead of a model developer, so configuring training should be non-technical and should, as much as possible, abstract away the complexity of model selection & tuning configuration
    • Rafiki is designed to be more end-to-end, as an ML-as-a-service
  • My opinion:
    • As with NNI, should model developers in Rafiki be able to optionally configure how their models are tuned, e.g. which advisor to use and how that advisor is configured?
      • This allows model developers to select more appropriate / empirically better AutoML algorithms for their models, but puts more burden on them
      • Maybe via another static class method (sketched below)
    • Current abstraction & definition of budget in Rafiki is appropriate
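
To illustrate the difference in who configures what, here's a rough sketch. The client call and budget keys on the application-developer side, and the get_advisor_config() hook on the model-developer side, are illustrative only (the hook is the proposed optional static method, not an existing Rafiki API):

    # Application developer: submit a train job with just a task, datasets and a budget.
    # (client method and argument names below are illustrative, not the exact Rafiki API)
    from rafiki.client import Client

    client = Client()
    client.login(email='app_dev@rafiki', password='rafiki')
    client.create_train_job(
        app='fashion_mnist_app',
        task='IMAGE_CLASSIFICATION',
        train_dataset_uri='data/fashion_mnist_train.zip',
        test_dataset_uri='data/fashion_mnist_test.zip',
        budget={'TIME_HOURS': 1, 'GPU_COUNT': 1},
    )

    # Model developer (proposed): optionally declare how the model should be tuned,
    # via another static class method next to get_knob_config(). Hypothetical hook.
    class MyModel:  # in Rafiki this would subclass the model base class
        @staticmethod
        def get_advisor_config():
            return {'advisor': 'BAYESIAN_OPT', 'num_startup_trials': 10}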

How the model interfaces with the AutoML system

  • In NNI, the AutoML system calls upon model code by simply running the main Python file (i.e. it triggers the main method); a directory of Python files is supported
  • In Rafiki, the system calls upon model code by importing a given class from a single Python file, then appropriately running methods on instances of that class
  • In NNI, model code calls upon the AutoML system by importing the nni module and calling e.g. nni.get_next_parameter() to get the hyperparameters for a trial, and nni.report_final_result(metrics) to pass a trial's final metrics back to be interpreted by the tuner
  • In Rafiki, model code imports the utils module and calls e.g. utils.dataset..., utils.logger... for helper/logging methods. The return value of e.g. evaluate(dataset) passes the final score back to the system
  • My opinion:
    • NNI's interface maximises portability of existing model code - there's no need to rewrite it into a class definition as in Rafiki
    • NNI's interface couples model code & the AutoML system more loosely
    • But Rafiki's well-defined model class gives more flexibility/power to tuning algorithms (e.g. better control flow), and is more appropriate for our design
      • Unlike NNI, Rafiki needs to support predictions, and loading & saving of model parameters
    • Consider documenting how to port existing model code to Rafiki, or brainstorming tweaks to the API to improve portability? (both interface styles are sketched below)
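
The two interface styles side by side. The nni.get_next_parameter() / nni.report_final_result() calls are NNI's documented API; the Rafiki class skeleton uses method names inferred from the description above (plus hypothetical helper functions), so treat it as a sketch rather than the exact interface:

    # --- NNI style: a plain script; the system just runs this file ---
    import nni

    def main():
        params = nni.get_next_parameter()      # hyperparameters for this trial
        model = build_and_train(params)        # hypothetical existing training code, unchanged
        acc = evaluate_model(model)            # hypothetical evaluation code
        nni.report_final_result(acc)           # final metric, interpreted by the tuner

    # --- Rafiki style: the system imports this class and drives its methods ---
    class MyModel:  # in Rafiki this subclasses a model base class; method names are a sketch
        @staticmethod
        def get_knob_config(): ...
        def train(self, dataset_uri): ...      # system passes the dataset in
        def evaluate(self, dataset_uri): ...   # return value = final score for the advisor
        def predict(self, queries): ...        # unlike NNI, predictions are supported
        def dump_parameters(self): ...         # saving & loading of parameters are supported
        def load_parameters(self, params): ...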

How the AutoML system configures the model's training behaviour

  • NNI configures a model's training behaviour only through hyperparameters and its early-stopping framework, built around the concept of an "Assessor"
    • Model code can optionally call nni.report_intermediate_result(metrics); the assessor interprets these intermediate results and kills the trial when they are poor
    • There's no explicit support or extension point for other ways to configure the model's training behaviour besides early stopping, e.g. loading shared parameters or using a downscaled model
  • In Rafiki, we're thinking of configuring a model's training behaviour with PolicyKnob(policy_name) as part of the model's knobs, so that model code can switch between different "modes" (e.g. early stop vs. don't early stop)
    • My opinion: this can support more advanced tuning strategies, e.g. we can introduce more policies in the future without changing Rafiki's code (a sketch is below)
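
A minimal sketch of how PolicyKnob(policy_name) could be used in model code, assuming the advisor resolves the policy knob to a boolean per trial and that the knob classes are importable from Rafiki's model module; knob names and the exact semantics are illustrative:

    # Assumes FloatKnob and PolicyKnob come from Rafiki's model module;
    # knob names and semantics below are illustrative.
    class MyModel:  # in Rafiki this would subclass the model base class
        @staticmethod
        def get_knob_config():
            return {
                'lr': FloatKnob(1e-4, 1e-1),
                'early_stop': PolicyKnob('EARLY_STOP'),  # training-behaviour "mode", set by the advisor
            }

        def __init__(self, **knobs):
            self._knobs = knobs

        def train(self, dataset_uri):
            max_epochs = 5 if self._knobs['early_stop'] else 100
            for epoch in range(max_epochs):
                ...  # usual training loop; could also stop on a validation-loss plateau here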

How the AutoML system supports architecture tuning

  • NNI currently only supports GA-based architecture tuning, where the model code & tuner depend on a custom graph abstraction & architecture space definition as the "Search Space"
    • This may not be general/flexible enough

  • As in the previous section, NNI won't be able to formally support an implementation of ENAS, as the tuner needs to tell the model code to load shared parameters and to switch between "train for 1 epoch" & "just evaluate on a subset of the validation dataset"

  • For Rafiki, we're thinking of representing architecture as an array of categorical values

    • More general (it's up to the model developer to define the encoding), but low-level and less "informative" for the architecture tuning algorithm
    • E.g.:
    l0 = KnobValue(0)   # Input layer as input connection
    l1 = KnobValue(1)   # Layer 1 as input connection
    l2 = KnobValue(2)   # Layer 2 as input connection
    ops = [KnobValue('conv3x3'), KnobValue('conv5x5'), KnobValue('avg_pool'), KnobValue('max_pool')]
    arch_knob = ArchKnob([
        [l0], ops, [l0], ops,                  # To form layer 1: choose input 1, op on input 1, input 2, op on input 2, then combine the post-op inputs as preferred
        [l0, l1], ops, [l0, l1], ops,          # To form layer 2, ...
        [l0, l1, l2], ops, [l0, l1, l2], ops,  # To form layer 3, ...
    ])
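
For reference, a sketch of how model code might decode one sampled architecture under this encoding, assuming the advisor returns the chosen value for each slot of arch_knob as a flat list (four slots per layer, as in the comments above):

    def decode_arch(arch_values):
        """Group the flat list into (input 1, op 1, input 2, op 2) per layer."""
        layers = []
        for i in range(0, len(arch_values), 4):
            in1, op1, in2, op2 = arch_values[i:i + 4]
            layers.append({'inputs': (in1, in2), 'ops': (op1, op2)})
        return layers

    # e.g. layer 2 applies conv5x5 to layer 1's output and avg_pool to the input layer
    print(decode_arch([0, 'conv3x3', 0, 'max_pool',
                       1, 'conv5x5', 0, 'avg_pool',
                       2, 'conv3x3', 1, 'max_pool']))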
nginyc added the question label on Jun 21, 2019
nudles (Collaborator) commented Jul 11, 2019

Thanks for the comparison. I'll list some comments (not in order):

  1. Search space: NNI has another way of defining the search space, which uses annotations. This moves the hyper-parameter definition closer to where it is used, and the Python code can run both with and without hyper-parameter tuning (a sketch of the annotation style follows this list).
  2. By making NNI a library, local development and debugging become easier, and the running flow is controlled by the model developer. Rafiki provides a platform for hyper-parameter search, so Rafiki controls the flow. Like MapReduce, the system controls the flow and the developers fill in the code of map and reduce.
  3. It would be good to decouple the system into modular components: resource management, filesystem or datastore, hyper-parameter tuning, inference queueing, etc.
  4. We may not be able to unify the architecture tuning algorithms and the hyper-parameter tuning algorithms. E.g., it is difficult even to unify the ENAS and DARTS algorithms.
  5. MLflow and Kubeflow are two other projects with a hyper-parameter tuning feature.
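
For point 1, a sketch of NNI's annotation style, from memory of NNI's docs (the exact annotation syntax may differ). The annotations are plain string literals, so without NNI the script runs unchanged with the default values; under NNI, they are rewritten so that the values come from the tuner:

    def train_and_evaluate(lr, batch_size):  # stand-in for real training code
        return 0.9

    '''@nni.variable(nni.choice(0.01, 0.1, 1.0), name=learning_rate)'''
    learning_rate = 0.1

    '''@nni.variable(nni.choice(32, 64, 128), name=batch_size)'''
    batch_size = 64

    acc = train_and_evaluate(learning_rate, batch_size)

    '''@nni.report_final_result(acc)'''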
