-
Notifications
You must be signed in to change notification settings - Fork 448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add HyperBand #787
feat: Add HyperBand #787
Conversation
/hold |
/hold cancel |
/retest |
2 similar comments
/retest |
/retest |
@gaocegege this PR also needs a rebase |
Gotcha |
@@ -19,7 +19,7 @@ spec: | |||
- name: "eta" | |||
value: "3" | |||
- name: "r_l" | |||
value: "9" | |||
value: "2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, or the validation will fail because the hyperband algorithm will validate parallelTrialCount according to the r_l
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How was it working in v1alpha2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In e2e test, we will set the parallel trial count to 2. Thus we will get the error parallel trial count is less than 9
bacause of the r_l. In v1alpha2 we do not have e2e test for hyperband. Thus we do not have the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thanks for explanation.
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
// Algorithm settings in suggestion will overwrite the settings in experiment. | ||
filledE := e.DeepCopy() | ||
appendAlgorithmSettingsFromSuggestion(filledE, instance.Spec.AlgorithmSpec) | ||
experiment := g.ConvertExperiment(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/g.ConvertExperiment(e)/g.ConvertExperiment(filledE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good find. I will update it when CI is finished. Want to see the result of CI first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about not introducing filledE, and update g.ConvertExperiment(e) to g.ConvertExperiment(e, instance.Spec.AlgorithmSpec)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think code here is more readable. Or we will update the convertAlgortihmSettings. It is hard to read, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok for me
Signed-off-by: Ce Gao <gaoce@caicloud.io>
/retest |
@@ -83,10 +91,11 @@ func (g *General) SyncAssignments( | |||
}) | |||
} | |||
|
|||
// TODO(gaocegege): Set algorithm settings | |||
updateAlgorithmSettings(instance, response.Algorithm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spec is not updated in Reconcile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, gotcha
pkg/controller.v1alpha3/suggestion/suggestionclient/algorithm_settings.go
Outdated
Show resolved
Hide resolved
pkg/controller.v1alpha3/suggestion/suggestionclient/algorithm_settings.go
Show resolved
Hide resolved
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
/hold |
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
pkg/controller.v1alpha3/suggestion/suggestion_controller_status.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Ce Gao <gaoce@caicloud.io>
Signed-off-by: Ce Gao <gaoce@caicloud.io>
/hold cancel |
CI errored out because of quote limit. We have to wait till previous one completes. |
Yeah, I saw it. |
/retest |
var paralleltrials int32 = 2 | ||
exp.Spec.MaxTrialCount = &maxtrials | ||
exp.Spec.ParallelTrialCount = ¶lleltrials | ||
if exp.Spec.Algorithm.AlgorithmName != "hyperband" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to fix this check later.
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnugeorge The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -41,7 +42,7 @@ spec: | |||
- ftrl | |||
- name: --num-epochs | |||
parametertype: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gaocegege, is the yaml parser case sensitive? Here, parametertype
doesn't have the t
capitalized like elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the find. I will have a look. Actually, @johnugeorge found that the e2e hyperband test is finished quickly. Not sure if it is caused by this problem.
Signed-off-by: Ce Gao gaoce@caicloud.io
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Release note:
This change is