additional metric names should not include objective metric name #1873

Closed
henrysecond1 opened this issue May 25, 2022 · 0 comments · Fixed by #1874

/kind bug

What steps did you take and what happened:

Deploy an experiment whose spec.objective.additionalMetricNames contains spec.objective.objectiveMetricName.

The experiment is created, but nothing is displayed for it in the experiment UI.

  • The Katib UI pod logs GetLastConditionType failed: Experiment doesn't have any condition.
  • The Katib controller logs "admission webhook \"validator.experiment.katib.kubeflow.org\" denied the request: only spec.parallelTrialCount, spec.maxTrialCount and spec.maxFailedTrialCount are editable".

After some investigation, we found that the experiment is created, but updating it fails when additionalMetricNames contains objectiveMetricName.

Detailed investigation below

This is caused by the interaction between the mutating and validating webhooks.

If additionalMetricNames contains objectiveMetricName, the mutating webhook adds a second entry for objectiveMetricName to metricStrategies.

After the mutation, metricStrategies looks like this:

    metricStrategies:
    - name: Validation-accuracy
      value: max
    - name: Train-accuracy
      value: max
    - name: Validation-accuracy
      value: max
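
The duplication can be reproduced with a small model of the defaulting step. This is only a sketch, not Katib's actual Go code: the function name `add_default_strategies` is hypothetical, and the logic assumes the webhook tracks the objective metric separately from the other metrics when deciding which strategies are still missing.

```python
def add_default_strategies(objective):
    """Hypothetical model of the defaulting step; not Katib's real code."""
    strategies = list(objective.get("metricStrategies", []))
    objective_name = objective["objectiveMetricName"]
    default = "max" if objective["type"] == "maximize" else "min"

    has_objective = any(s["name"] == objective_name for s in strategies)
    # Assumption: only non-objective metrics are recorded as already covered.
    covered = {s["name"] for s in strategies if s["name"] != objective_name}

    if not has_objective:
        strategies.append({"name": objective_name, "value": default})
    for name in objective.get("additionalMetricNames", []):
        if name not in covered:
            # The objective metric is never in `covered`, so it is added again.
            strategies.append({"name": name, "value": default})
    return strategies


objective = {
    "type": "maximize",
    "objectiveMetricName": "Validation-accuracy",
    "additionalMetricNames": ["Train-accuracy", "Validation-accuracy"],
}
strategies = add_default_strategies(objective)
print([s["name"] for s in strategies])
# ['Validation-accuracy', 'Train-accuracy', 'Validation-accuracy']
```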

Another mutation happens when the controller updates the finalizer.

However, updating metricStrategies is not allowed when oldInst is not nil (that is, on an update rather than a create).

So the update of the experiment fails, and the status of the experiment is never updated.
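
The rejection can be pictured with a simplified model of the update check. This is a sketch under assumptions: `validate_update` and the flat dictionary layout are hypothetical, and the only behavior taken from the issue is that just three spec fields are editable. The `new_spec` below assumes the second mutation changed metricStrategies.

```python
EDITABLE_FIELDS = {"parallelTrialCount", "maxTrialCount", "maxFailedTrialCount"}
DENIED = ("only spec.parallelTrialCount, spec.maxTrialCount and "
          "spec.maxFailedTrialCount are editable")


def validate_update(old_spec, new_spec):
    """Return an error message if a non-editable field changed, else None."""
    keys = (set(old_spec) | set(new_spec)) - EDITABLE_FIELDS
    for key in sorted(keys):
        if old_spec.get(key) != new_spec.get(key):
            return DENIED
    return None


old_spec = {
    "maxTrialCount": 12,
    "metricStrategies": ["Validation-accuracy", "Train-accuracy",
                         "Validation-accuracy"],
}
# Hypothetical result of the second mutation: one more strategy entry.
new_spec = {
    "maxTrialCount": 12,
    "metricStrategies": old_spec["metricStrategies"] + ["Validation-accuracy"],
}

print(validate_update(old_spec, new_spec))                          # denied
print(validate_update(old_spec, {**old_spec, "maxTrialCount": 20}))  # None
```

In this model, any update that carries a changed metricStrategies is refused, including one that only meant to set the finalizer, which matches the denied request seen in the controller log.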

What did you expect to happen:

The experiment should not be created when additionalMetricNames contains objectiveMetricName.
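
One way such a check could look is sketched here, for example as a pre-submission check on the client side. The function name `validate_objective` and the error wording are assumptions for illustration, not the validation that Katib implements.

```python
def validate_objective(objective):
    """Return a list of error strings; an empty list means the objective is OK."""
    errors = []
    name = objective.get("objectiveMetricName")
    if name in objective.get("additionalMetricNames", []):
        errors.append(
            "spec.objective.additionalMetricNames must not contain "
            f"spec.objective.objectiveMetricName ({name!r})"
        )
    return errors


bad = {
    "objectiveMetricName": "Validation-accuracy",
    "additionalMetricNames": ["Train-accuracy", "Validation-accuracy"],
}
good = {
    "objectiveMetricName": "Validation-accuracy",
    "additionalMetricNames": ["Train-accuracy"],
}

print(validate_objective(bad))   # one error
print(validate_objective(good))  # []
```

Rejecting the experiment at creation time surfaces the problem immediately, instead of leaving an experiment that exists but can never have its status updated.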

Environment:

  • Katib version (check the Katib controller image version): v0.13

Impacted by this bug? Give it a 👍 We prioritize the issues with the most 👍
