Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go Templates are hard to work with programmatically #857

Closed
jlewi opened this issue Oct 4, 2019 · 10 comments
Closed

Go Templates are hard to work with programmatically #857

jlewi opened this issue Oct 4, 2019 · 10 comments

Comments

@jlewi
Copy link
Contributor

jlewi commented Oct 4, 2019

/kind feature

Describe the solution you'd like
Katib currently uses go templates to define the training jobs. In the random example

The training job template is set to

          apiVersion: batch/v1
          kind: Job
          metadata:
            name: {{.Trial}}
            namespace: {{.NameSpace}}
          spec:
            template:
              spec:
                containers:
                - name: {{.Trial}}
                  image: katib/mxnet-mnist-example
                  command:
                  - "python"
                  - "/mxnet/example/image-classification/train_mnist.py"
                  - "--batch-size=64"
                  {{- with .HyperParameters}}
                  {{- range .}}
                  - "{{.Name}}={{.Value}}"
                  {{- end}}
                  {{- end}}
                restartPolicy: Never

The problem with using go templates is that the manifest is no longer valid yaml. So you can't parse the YAML spec in python and then manipulate that object programmatically.

Consider the following example

  1. We are using a notebook (e.g this one) to launch katib jobs
  2. We are using fairing to build the docker images
  3. We would like to programmatically set the docker image in our training jobs for katib to the newly built image with fairing

Typically we would just define the K8s object a python dictionary and then we can easily set the image.

However, because the training job in the case of katib is a go template we can no longer parse it into a dictionary, manipulate it and then serialize it to YAML.

It might be worth looking at Tekton to see how it does parameter substitution since Tekton objects are valid YAML.

/cc @johnugeorge @richardsliu

@johnugeorge
Copy link
Member

@jlewi I think, Parameter substitutions can be worked out by exposing params like image outside the template. However, GO template necessity comes from the need of supporting multiple job types - batch job, Tfjob, PyTorchJob etc

@jlewi
Copy link
Contributor Author

jlewi commented Oct 5, 2019

@johnugeorge why does supporting other job types necessitate the use of GO Templates? Could we come up with a substitution style that down't neccesarily break with yaml parsing; maybe this is already doable and its just a matter of document it?

For example suppose the spec was

apiVersion: batch/v1
kind: Job
metadata:
  name: "{{.Trial}}"
  namespace: "{{.NameSpace}}"
spec:
  template:
    spec:
      containers:
      - name: "{{.Trial}}"
        image: katib/mxnet-mnist-example
        command:
        - "python"
        - "/mxnet/example/image-classification/train_mnist.py"
        - "--{{.hp1.name}}={{.hp1.value}}"
        - "--{{.hp2.name}}={{.hp2.value}}"
      restartPolicy: Never

This string is valid YAML; e.g. you can do

job = yaml.load(raw_spec)
job["spec"]["image"] = ....

So its easy to programmatically modify the yaml and then dump the yaml before setting the template.

Maybe that's already doable today? I guess the question is can a user refer by name to the value of a hyperparameter as opposed to doing a loop over the hyperparameters ?

@johnugeorge
Copy link
Member

I get what you are trying to do. I was trying to say that I couldn't find any alternative other than keeping trial spec as string(though not the best user experience) .

With respect to your suggestion to make spec as yaml, I feel that it is possible to implement it. Currently, it is bit tricky because hyper parameters is variable number of key-value pairs. We will need to have our own parameter substitution mechanism(like Tekton) to support the loop over them.

We can figure out a solution in the beta api release

@jlewi
Copy link
Contributor Author

jlewi commented Oct 8, 2019

@johnugeorge Do go templates allow referring to hyperparameters by name e.g. can I do something like this

        - "--{{.hp1.name}}={{.hp1.value}}"

So in my go template I don't need to iterate over the hyperameters.

As the author of the Experiment I already have to know the names of my hyper parameters because I have to list them in my Experiment. So If I can explicitly to the hyparameters by name in the go template then I think its possible to still use go templates and produce valid YAML.

@hougangliu
Copy link
Member

@gaocegege @johnugeorge
for now, we can use concrete hyperparameter and value pairs to replace go template counterpart in trial by argument or environment var.
we can enhance it by introducing a hyperparamater-inject-kind field (now its value can be argument and "env-var") and fill hyperparameter and value pairs to worker container when generating worker job automatically based on it.
In future, we can support other ways to hyperparamater-inject-kind value to inject hyperparamater pairs.

@gaocegege
Copy link
Member

SGTM. Actually, I have researched if we could inject hyperparameters by rewriting Python AST.

https://github.com/caicloud/katib/blob/056cc36515052156fa2ada681d7ff487441eedc0/py/hypertransformer/README.md

NNI follows this way. I think it is better to be extensible. We could support multiples approaches.

@hougangliu
Copy link
Member

yes, AST can be one another hyperparamater-inject-kind, the hyperparamater-inject-kind must keep consistent with user source code.

@andreyvelich
Copy link
Member

We can close this, we are using new Trial template now: #1208

@issue-label-bot
Copy link

Issue-Label Bot is automatically applying the labels:

Label Probability
area/katib 0.51

Please mark this comment with 👍 or 👎 to give our bot feedback!
Links: app homepage, dashboard and code for this bot.

1 similar comment
@issue-label-bot
Copy link

Issue-Label Bot is automatically applying the labels:

Label Probability
area/katib 0.51

Please mark this comment with 👍 or 👎 to give our bot feedback!
Links: app homepage, dashboard and code for this bot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants