Reduce number of initialization evaluations if external data is supplied #137
Currently, it is not possible to skip the initialization step of the Ax generators. This is inconvenient, because the first trials suggested by the generator will always be quasi-random, even if the user has supplied external data to initialize the model. This PR fixes the issue by reducing the number of Sobol trials when the user provides external data.
There were several options to implement this:
- `n_init=0`: In this case, we would simply not add a Sobol step to the AxClient. However, this is not safe in all cases, because the BO step requires some initial data to be able to start. Also, if you want to resume a previous exploration, you would need to change the input script to set `n_init=0` to avoid rerunning the Sobol step.
- `n_init>0`, and only reduce the number of Sobol trials once external data is given: This solves the issue with option 1. If a user resumes an exploration by running the same original script, the optimization will continue from where it had stopped, without generating Sobol trials again. It also maintains compatibility with the workflow mentioned in 2 (the user can get a reference to the AxClient in the input script as soon as the generator is initialized). This approach is also more flexible, as it does not require the user to decide in advance whether external data will be given. This is useful for interactive optimization in a notebook (e.g., with `n_init=6`, the user can first run 2 Sobol trials, then add an external one, then continue with the remaining 3 Sobol trials). This was the implemented option.
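The bookkeeping behind the implemented option can be sketched as follows. This is only an illustration, not the code in this PR: `remaining_sobol_trials` and its parameters are hypothetical names, and the sketch assumes that each external trial offsets exactly one Sobol trial, which is what the `n_init=6` example above suggests.

```python
def remaining_sobol_trials(n_init: int, n_sobol_done: int, n_external: int) -> int:
    """Return how many Sobol trials are still needed before the BO step.

    Hypothetical helper, not part of Ax or of this PR.
    Assumption: every external trial counts as one initialization trial.
    """
    if min(n_init, n_sobol_done, n_external) < 0:
        raise ValueError("All counts must be non-negative.")
    # Never return a negative number: once enough data is available,
    # no further Sobol trials are generated.
    return max(n_init - n_sobol_done - n_external, 0)


# Notebook scenario from the description: n_init=6, 2 Sobol trials run,
# then 1 external trial added.
print(remaining_sobol_trials(n_init=6, n_sobol_done=2, n_external=1))  # 3

# Resuming an exploration whose Sobol step already finished.
print(remaining_sobol_trials(n_init=6, n_sobol_done=6, n_external=0))  # 0
```

Clamping at zero reflects the resume case: rerunning the same script with `n_init>0` does not generate Sobol trials again once the initialization budget is covered.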