
Initial attempt to incorporate MultiTask GPs #353

Merged
merged 18 commits on Apr 12, 2024

Conversation

jpfolch
Contributor

@jpfolch jpfolch commented Feb 21, 2024

Initial attempt to incorporate MultiTask GPs, with the long-term plan of incorporating the multi-fidelity algorithm described here.

Current issues:

  • When predicting the posterior, BoFire by default returns the posterior mean and a covariance that includes the observation noise (see this line where observation_noise = True). However, BoTorch's MultiTaskGP does not support adding observation noise yet and raises NotImplementedError: Specifying observation noise is not yet supported by MultiTaskGP. (a) A work-around would be to redefine the MultiTaskGPSurrogate to set it to False; however, this would lead to inconsistencies in model usage within BoFire. (b) Maybe a new flag can be incorporated into BoFire to make predicting with observation noise optional. (c) For MultiTaskGPs I could implement the new posterior myself, adding the observation noise to the posterior covariances manually, and then simplify the code once BoTorch starts supporting observation noise.

  • I am unable to use the dump functionality: when saving the BoTorch model using torch.save(self.model, buffer) I get the error AttributeError: Can't pickle local object 'IndexKernel.__init__.<locals>.<lambda>'. I am still not sure whether this is an issue with my local code, with GPyTorch's IndexKernel, or something else entirely.

  • We need some way of specifying which column holds the task_id for each observation. Currently this is done by naming the column 'fid' (for fidelity, but easily changed). We need to decide whether to identify the task_id column by a specific name or by some other method.

  • The LKJ prior (a matrix-valued prior on the inter-task correlations) depends on the number of tasks, which is unknown until the model is initialized. Currently I set a dummy n_tasks = 1 in the prior and then change the attribute value when initializing the surrogate; it works, but perhaps there is a better way of going about it.
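The manual work-around sketched in point one, option (c), amounts to adding the noise variance to the diagonal of the posterior covariance. A minimal stdlib sketch of that idea (the function name and the homoskedastic-noise assumption are mine, not BoFire's; in BoFire this would act on the BoTorch posterior's covariance tensor instead of nested lists):

```python
def add_observation_noise(cov, noise_var):
    """Add homoskedastic observation noise to a posterior covariance matrix.

    Sketch of work-around (c): since MultiTaskGP.posterior(...,
    observation_noise=True) raises NotImplementedError, the noise variance
    can be added to the covariance diagonal by hand. `cov` is a square
    matrix given as a list of lists.
    """
    n = len(cov)
    return [
        [cov[i][j] + (noise_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

posterior_cov = [[1.0, 0.2], [0.2, 0.5]]
noisy_cov = add_observation_noise(posterior_cov, 0.1)
print(noisy_cov)  # diagonal entries grow by 0.1, off-diagonals unchanged
```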

Future work:

  • Need to incorporate a likelihood with heteroskedastic noise that allows for different noise levels in each task.

  • Once MultiTaskGPs are working, implement a multi-fidelity strategy for querying.

@jduerholt jduerholt self-requested a review February 22, 2024 08:38
Contributor

@jduerholt jduerholt left a comment

Hi @jpfolch,

thank you very much for your PR. MultiTask and MultiFidelity GPs are something I have wanted in BoFire for a long time, but never found the time for!

To start, I am focusing on your third point (how to encode the task info), as I already have an idea for it ;)

My proposal is to introduce a new input feature called TaskInput. This TaskInput should inherit from the CategoricalInput feature and should have an additional attribute called fidelities of type List[int] in which one can assign different fidelity levels to the individual tasks/categories. I would add a validator that checks that one always has a step size of 1 between different fidelities and that it starts with zero.

For example, having a TaskInput with three different tasks/categories named process_1, process_2 and process_3 the following fidelities would be valid:

  • [0,0,0]: all have the same fidelity level
  • [0,0,1]: process_3 has the highest fidelity

Via the allowed attribute of the TaskInput one can then define which tasks can be proposed by the optimizer. As the TaskInput would be a subclass of CategoricalInput, one has to set up the encoding for the GP properly. Currently, CategoricalInputs are always one-hot encoded; either one adds the option of ordinal encoding for TaskInputs, or one keeps the one-hot encoding and uses the OneHotToNumeric input transform for the TaskInput as here: https://github.com/experimental-design/bofire/blob/main/bofire/surrogates/mixed_single_task_gp.py which would be, from my perspective, the most elegant way, but let me know what you think ;)

In addition, we should add a validator for the Inputs object to forbid more than one TaskInput.
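The proposed fidelities validator can be sketched in plain Python (a stand-in for an actual pydantic field_validator; the helper name is hypothetical): the sorted distinct levels must be exactly 0, 1, ..., k.

```python
from typing import List

def validate_fidelities(fidelities: List[int]) -> List[int]:
    """Check that fidelity levels start at zero and increase in steps of one.

    Mirrors the proposed TaskInput validator: the sorted distinct levels
    must be exactly 0, 1, ..., k (duplicates are allowed, gaps are not).
    """
    levels = sorted(set(fidelities))
    if levels != list(range(len(levels))):
        raise ValueError(
            f"fidelities must start at 0 with step size 1, got {fidelities}"
        )
    return fidelities

validate_fidelities([0, 0, 0])  # all tasks share one fidelity level
validate_fidelities([0, 0, 1])  # process_3 has the highest fidelity
```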

What do you think? I will answer your other points over the weekend (hopefully).

Best,

Johannes

@@ -25,3 +29,6 @@
MBO_LENGTHCALE_PRIOR = partial(GammaPrior, concentration=2.0, rate=0.2)
MBO_NOISE_PRIOR = partial(GammaPrior, concentration=2.0, rate=4.0)
MBO_OUTPUTSCALE_PRIOR = partial(GammaPrior, concentration=2.0, rate=4.0)
MBO_LKJ_PRIOR = partial(
Contributor

No need for two types here; if they are the same, the two different types are due to historic reasons. You can just create a single LKJ_PRIOR method.

Contributor

And perhaps we should make n_tasks a non-default argument there ...
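The suggestion can be sketched with functools.partial, binding every hyperparameter except n_tasks so the surrogate must supply the real task count at initialization (LKJCovariancePrior below is a stand-in class for illustration, not the real gpytorch prior, and the eta value is arbitrary):

```python
from functools import partial

class LKJCovariancePrior:
    """Stand-in for gpytorch.priors.LKJCovariancePrior, for illustration only."""
    def __init__(self, n_tasks: int, eta: float = 2.0):
        self.n_tasks = n_tasks
        self.eta = eta

# Bind everything except n_tasks; the surrogate then has to pass the real
# task count at init time instead of patching a dummy n_tasks = 1 afterwards.
LKJ_PRIOR = partial(LKJCovariancePrior, eta=2.0)

prior = LKJ_PRIOR(n_tasks=3)
print(prior.n_tasks)  # 3
```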

@@ -0,0 +1,51 @@
import json

from pydantic import parse_obj_as
Contributor

This should all go into the test suite under botorch.tests

Contributor Author

Yes, the plan is to remove it completely. It is just there for now since it reproduces the errors I wrote about.

@jpfolch
Contributor Author

jpfolch commented Feb 23, 2024

And I really like the TaskInput solution. I'll get started on it and push the changes.

@jduerholt
Contributor

And I really like the TaskInput solution. I'll get started on it and push the changes.

Great, I will think in the meantime about the other open questions.

from bofire.data_models.features.api import DiscreteInput


class TaskInput(DiscreteInput):
Contributor

Short question: why are you basing it on the DiscreteInput and not on the CategoricalInput? I would prefer the CategoricalInput, as it is more expressive and the task is more of a categorical quantity than a discrete one; the DiscreteInput always assumes an ordinal ordering of its values.

Contributor Author

I thought about it and here is my logic:

(a) The task themselves do not have any ordering

(b) But the GP expects an ordering of the tasks corresponding to the location of each specific task in the IndexKernel, i.e. inputs to the kernel are k((x', i), (x, j)), so the task input values should always correspond to indices i, j in [0, ..., n_tasks - 1].

This means that all we need to do to initialize the TaskInput object is to select the number of tasks. I am happy to change this though, and we can just encode the categorical variables before feeding them to the GP.

Contributor Author

It is now using CategoricalInput as a base

n_tasks: int
fidelities: List[int]

@validator("fidelities")
Contributor

This is pydantic 1 style; we are using pydantic 2, so you should use field_validator.


class TaskInput(DiscreteInput):
type: Literal["TaskInput"] = "TaskInput"
n_tasks: int
Contributor

n_tasks is not necessary if you base it on the CategoricalInput as it would be just the number of categories.

@jpfolch
Contributor Author

jpfolch commented Feb 27, 2024

I've now got an implementation of TaskInput inheriting from CategoricalInput, and it seems to work fine when it comes to training and fitting the model. A few issues remain:

  1. Using OneHotToNumeric has to be done outside BoTorch's model, since the first line of the posterior call on MultiTaskGPyTorchModel is includes_task_feature = X.shape[-1] == self.num_non_task_features + 1, which checks whether the task feature is included in X (i.e. it fully expects the task_id to be a single value in one column). The input transform is not applied until after this check, so under one-hot encoding the check fails and the model returns predictions for all tasks for all inputs. Three ways of fixing this: (a) try to fix it within BoTorch and see if they accept the changes, (b) change the transform within BoFire, or (c) apply the transform before calling posterior and before setting the training data (the current code does this).

  2. I am running into issues with pydantic and serialization. When running surrogate_data.model_dump_json() I get the following warning: Expected Union[definition-ref, definition-ref, definition-ref, definition-ref, definition-ref, definition-ref, definition-ref, definition-ref] but got TaskInput - serialized value may not be as expected. I think this is causing the subsequent problem where parse_obj_as(MultiTaskGPSurrogate, json.loads(jspec)) fails to run, as it fails validation by giving the wrong arguments to the wrong classes. Any help on how to fix this would be appreciated.
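The failing shape check in point 1 can be illustrated with one line of arithmetic (d and t below are illustrative sizes, not values from the PR):

```python
# With d non-task features and t tasks, a one-hot-encoded row has width
# d + t, but MultiTaskGPyTorchModel.posterior evaluates
#   includes_task_feature = X.shape[-1] == self.num_non_task_features + 1
# *before* applying input transforms, so the check is False whenever t > 1
# and the model falls back to predicting all tasks for every input.
d, t = 4, 3  # illustrative sizes
includes_task_feature = (d + t) == (d + 1)
print(includes_task_feature)  # False for any t > 1
```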

@jduerholt
Contributor

jduerholt commented Feb 28, 2024

Hi @jpfolch,

as already discussed, you can find a proper implementation for the TaskInput feature here #360 which should be merged soon. Just use this one.

Regarding your point about the one-hot input transform: I did not know this, and it is a pity. Getting a fix into botorch will take some time, and your solution will not work with an optimizer, as the optimizer will just call posterior. For this reason, we have to use an ordinal encoding for the categorical TaskInput.

This is implemented here:

def to_ordinal_encoding(self, values: pd.Series) -> pd.Series:

The actual encoding taken into account by the model is based on the attribute input_preprocessing_specs which is validated/generated here:

@field_validator("input_preprocessing_specs")

For the MultiTaskGP we then have to overwrite it in a way that assigns CategoricalEncodingEnum.ORDINAL to the task feature. If you do this, you should already be able to use the model. For using it in optimization, we will then also have to make some adjustments. But this will not be a problem.

So, I would recommend merging main in and changing the model to ORDINAL encoding for the task feature. Note that standard models should still use ONEHOT. You will also need to adjust the input scaler to ignore the task feature, so that it stays an integer.
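The effect of ordinal encoding for the task feature can be shown with a small stand-alone helper (a stand-in for the pandas-based CategoricalInput.to_ordinal_encoding referenced above): each category maps to its integer index, so the GP sees a single integer task column rather than one-hot columns.

```python
def to_ordinal_encoding(categories, values):
    """Map category labels to integer task ids.

    Stand-in for CategoricalInput.to_ordinal_encoding (which works on pandas
    Series): a MultiTaskGP expects the task column to hold one integer id per
    row, and that column is kept out of input scaling so it stays an integer.
    """
    mapping = {category: i for i, category in enumerate(categories)}
    return [mapping[v] for v in values]

tasks = ["process_1", "process_2", "process_3"]
print(to_ordinal_encoding(tasks, ["process_3", "process_1", "process_1"]))
# [2, 0, 0]
```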

Best,

Johannes

@jpfolch
Contributor Author

jpfolch commented Feb 28, 2024

Perfect, thanks! I will get started on fixing the encoding for TaskInputs.

Regarding not being able to pickle Index kernels, @TobyBoyne found the bug within GPyTorch and submitted a PR which has been approved, so it should be fixed eventually, but we will need a temporary fix for the next few months at least.

@TobyBoyne
Collaborator

Hey, while you're waiting for the patch to hit GPyTorch, you can fix this in BoFire by manually registering the prior. I've demonstrated this in TobyBoyne@f9df6b0; just copy this change over to your code and you will be able to pickle the kernel :)
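The pickling failure itself is easy to reproduce: a lambda defined inside __init__ is a local object and cannot be pickled, while a module-level function can. A minimal sketch (the class names are mine; GPyTorch's IndexKernel stores such a lambda internally, which is why registering the prior with a named function avoids the error):

```python
import pickle

def _identity(x):
    return x

class KernelWithLambda:
    def __init__(self):
        # Mirrors IndexKernel's internal lambda: pickling an instance fails
        # with "Can't pickle local object '...__init__.<locals>.<lambda>'".
        self.param_transform = lambda x: x

class KernelWithFunction:
    def __init__(self):
        # A module-level function pickles fine.
        self.param_transform = _identity

pickle.dumps(KernelWithFunction())  # works
try:
    pickle.dumps(KernelWithLambda())
except (AttributeError, pickle.PicklingError) as err:
    print(type(err).__name__)
```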

@jduerholt
Contributor

Hey @jpfolch,

I was curious to know what the botorch guys think about changing the order of operations regarding the input transforms in the posterior call so I created this issue: pytorch/botorch#2232

Let's see ;)

Best,

Johannes

@jpfolch
Contributor Author

jpfolch commented Mar 5, 2024

The latest commit added the missing validations for the model, and added testing. Overall the multi-task model seems to work now. The only problem I am running into is that the test test_MultiTaskGPModel fails randomly (independent of the test parameters). In particular, the problem happens when calling fit_gpytorch_mll and the error thrown is RuntimeError: Must provide inverse transform to be able to sample from prior. I will try to investigate further in the next few days.

@jduerholt
Contributor

jduerholt commented Mar 5, 2024

The only problem I am running into is that the test test_MultiTaskGPModel fails randomly (independent of the test parameters). In particular, the problem happens when calling fit_gpytorch_mll and the error thrown is RuntimeError: Must provide inverse transform to be able to sample from prior. I will try to investigate further in the next few days.

I have not yet looked at your additions, but do you think this is a bofire or a botorch problem? I will have a look at your changes tomorrow.

@jduerholt
Contributor

Hi @jpfolch,

this looks good overall. I will do a more thorough review as soon as you request it and you know more about the fitting issue. To test serialization and deserialization for the added data models (the new GP and the new prior), please add example configs in tests/bofire/data_models/specs/surrogates.py and tests/bofire/data_models/specs/priors.py. Everything that is registered there is automatically tested for serialization. With add_invalid, you can add invalid specs to test your custom validators.

Best,

Johannes

@jpfolch
Contributor Author

jpfolch commented Mar 11, 2024

Sorry, I haven't been able to work on this due to other commitments, but I should have more time now. Regarding the fit_gpytorch_mll problem: I am still unsure what is causing it, but I think it could be a gpytorch problem, based on the following two issues: pytorch/botorch#1323 and pytorch/botorch#1860.

The best solution I have come up with is to drop the LKJ prior altogether until the issue is fixed, and use MultiTaskGPs without a prior on the task covariances.

@jpfolch
Contributor Author

jpfolch commented Mar 12, 2024

The latest commit adds the example configs to tests/bofire/data_models/specs/surrogates.py and tests/bofire/data_models/specs/priors.py. I have left the LKJ prior in the code, but currently it cannot be used: when used, it defaults to None and throws a warning. Let me know if this behaviour is okay or should be changed.

@jduerholt
Contributor

Thx, will have a look over the course of the week.

@jduerholt jduerholt mentioned this pull request Mar 14, 2024
Contributor

@jduerholt jduerholt left a comment

Hi,

thank you very much, overall this looks really good. Only some minor things regarding tests and validators. If you have problems with the validators, just tell me and I will try to resolve them. I know they can be tricky ...

As soon as this is landed, we have to incorporate it into the optimizer ;)

Best,

Johannes

bofire/data_models/features/tasks.py
bofire/data_models/surrogates/multi_task_gp.py
bofire/benchmarks/single.py
bofire/data_models/features/tasks.py
bofire/priors/mapper.py
tests/bofire/surrogates/test_gps.py
tests/bofire/surrogates/test_gps.py
input_preprocessing_specs={"task_id": CategoricalEncodingEnum.ONE_HOT},
)

# test that if there is no task input, there is an error
Contributor

this can also go into add_invalid

tests/bofire/surrogates/test_gps.py
tests/bofire/surrogates/test_gps.py
@jpfolch
Contributor Author

jpfolch commented Apr 3, 2024

I've made all the requested changes. All tests pass locally :).

@jduerholt
Contributor

Hi @jpfolch,

thank you very much, can you also resolve the merge conflicts by merging the main branch into this one?

Best,

Johannes

@jpfolch
Contributor Author

jpfolch commented Apr 8, 2024

All done, conflicts should be resolved now. I tested, and all tests passed except the EntingStrategy spec, but I don't have entmoot installed, which explains it.

@jduerholt jduerholt marked this pull request as ready for review April 9, 2024 19:23
@jduerholt
Contributor

jduerholt commented Apr 9, 2024

Tests are running through ;), can you fix the linting errors too?

@jpfolch
Contributor Author

jpfolch commented Apr 10, 2024

I think the linting errors should be fixed now

Contributor

@jduerholt jduerholt left a comment

Hi @jpfolch,

thank you very much! And sorry for the long process; as I am currently on parental leave, I am not looking at GitHub every day ;)

Would the next step be the multi-fidelity part, or the incorporation into the strategy?

Best,

Johannes

@jduerholt jduerholt merged commit 612db94 into experimental-design:main Apr 12, 2024
10 checks passed
@jpfolch
Contributor Author

jpfolch commented Apr 25, 2024

Hi @jduerholt,

Thank you very much! I've also been slow as I have been away and juggling other things meanwhile. Happy to see it has been merged.

The MultiTaskGP can be used directly to model the fidelities, so I think the next step would be incorporation into the strategy? That is, assuming the strategies are the part of the code where the BO loop chooses which experiment to do next?

Best,
Jose

@jduerholt
Contributor

Hi @jpfolch,

now I am back from parental leave and have more time for BoFire again.

Yes, incorporation into the strategies would be the next step. The adaptations have to be made here: https://github.com/experimental-design/bofire/blob/main/bofire/strategies/predictives/botorch.py

In particular, we have to adjust how we deal with fixed features, as we currently assume that categoricals are always one-hot encoded. Honestly, this whole part of the code needs to be cleaned up. Are you interested in doing it, or should I make a first draft?

Best,

Johannes

@jduerholt
Contributor

jduerholt commented May 17, 2024

It would be especially important to know how the optimization over the TaskInput should work: do we optimize the ACQF for all allowed tasks and return the candidate with the highest acqf value? Or is the evaluation cost also considered somehow, or is only the highest fidelity queried? Do we need specific ACQFs for this case?

@jpfolch
Contributor Author

jpfolch commented Jun 7, 2024

Hi @jduerholt,

Sorry for the delay in replying, I have been busy finishing up work on my PhD. I am currently writing up, but I should have some time to work on BoFire. Based on my previous work, the way I recommend optimizing over TaskInput is:

  1. We first choose the next point of interest by returning the candidate with the highest acqf value at the target fidelity.

  2. We then choose which fidelity to evaluate based on one of two criteria:
    (a) If the fidelities are strictly ordered, we can simply choose the lowest fidelity until we have reduced the uncertainty enough, then the second lowest, and so on ... the idea being that we want to use the lowest possible fidelity that we believe might still be informative about our target. This criterion is cheap to compute, generally stable, and I found it to work well. However, it does not take cost into account.
    (b) A second criterion, which is more general and does take cost into account, is to consider the information gain of each fidelity about the target fidelity per unit cost. However, it can be expensive to compute accurately and can be numerically unstable. There is a BoTorch implementation of the method, which is probably well optimized (the single-fidelity method is described in this tutorial, but there is a multi-fidelity version in the source code).

Approach (b) would be equivalent to first optimizing any acquisition function at the target fidelity, and then optimizing the task input only with the MF-MES acquisition function. The benefit of this is that it gives you more flexibility in the choice of acquisition functions, and it is computationally cheaper than optimizing MF-MES directly (which is another possible route).
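Criterion (a) can be sketched as a simple loop over fidelities, lowest first, driven by the posterior standard deviation at the chosen candidate (the function name and the threshold rule are my assumptions, not an existing BoFire API):

```python
from typing import List

def select_fidelity(posterior_stds: List[float], threshold: float) -> int:
    """Pick the lowest fidelity that is still expected to be informative.

    posterior_stds holds the model's predictive std at the chosen candidate
    for each fidelity, ordered lowest to highest. If a cheap fidelity's
    uncertainty is still above `threshold`, query it; once every lower level
    is resolved, fall through to the target (highest) fidelity.
    """
    for fidelity, std in enumerate(posterior_stds[:-1]):
        if std > threshold:
            return fidelity
    return len(posterior_stds) - 1  # all lower levels resolved: query target

print(select_fidelity([0.8, 0.5, 0.3], threshold=0.4))  # 0: cheapest is still uncertain
print(select_fidelity([0.1, 0.5, 0.3], threshold=0.4))  # 1: lowest level resolved
print(select_fidelity([0.1, 0.2, 0.3], threshold=0.4))  # 2: query the target fidelity
```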

Let me know which of the two options you would prefer, and I can get started on it.

Best,
Jose

@jpfolch
Contributor Author

jpfolch commented Jun 11, 2024

@bertiqwerty

@jduerholt
Contributor

Hi @jpfolch,

totally forgot this. I will have a detailed look tomorrow. Sorry. Too much to do :(

Best,

Johannes

@jduerholt
Contributor

Hi @jpfolch,

sorry for the delay, the last weeks were crazy. I am not so much into multi-fidelity optimization, so I have a few questions:

In general, I am always for starting with the easiest option ;)

Best,

Johannes
