
Feature: Adds probability of improvement as an acquisition function #458

Conversation

miguelgondu
Contributor

@miguelgondu miguelgondu commented Jul 1, 2024

Type of changes

  • New feature
  • Documentation / docstrings
  • Tests

Checklist

  • I've formatted the new code by running poetry run pre-commit run --all-files --show-diff-on-failure before committing.
  • I've added tests for new code.
  • I've added docstrings for the new code.

Checklist for this PR in particular

  • Adding tests particular to the PI acquisition function.
  • Decide on whether we want to call these UtilityFunctions or AcquisitionFunctions. (We decided to keep them as utility_functions)
  • Decide on whether we want to write a whole new tutorial for these, or just expand the tutorial on BO. (We decided to add to an existing tutorial)
  • Add docstrings.

Description

This PR expands the decision_making module by adding Probability of Improvement as a potential acquisition function. It refactors the tests that are shared across utility functions, and provides a tutorial on how to use Probability of Improvement in practice (following the pattern of the current tutorial on Thompson Sampling).
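
For intuition, here is a rough sketch of the quantity being added (the names below are placeholders rather than the exact GPJax API): for a minimisation problem, PI at a test point is the probability, under the Gaussian predictive distribution, that the objective falls below the incumbent value.

```python
import jax.numpy as jnp
from jax.scipy.stats import norm


def probability_of_improvement_sketch(
    mean: jnp.ndarray, stddev: jnp.ndarray, best_y: float
) -> jnp.ndarray:
    """Illustrative sketch only: PI(x) = Prob[f(x) < best_y] for a Gaussian predictive.

    `mean` and `stddev` are the predictive mean and standard deviation at the
    test points, and `best_y` is the incumbent value to improve upon.
    """
    return norm.cdf((best_y - mean) / stddev)
```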

Issue Number: N/A

@miguelgondu miguelgondu marked this pull request as draft July 1, 2024 12:39

@github-actions github-actions bot left a comment


Thank you for opening your first PR into GPJax!

If you have not heard from us in a while, please feel free to ping @gpjax/developers or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on Slack for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

@miguelgondu
Contributor Author

miguelgondu commented Jul 1, 2024

Hi @gpjax/developers,

I'm nearly done adding Probability of Improvement (PI) as an acquisition function. Before I finish this PR, I would like to ask two questions:

  1. Should I create an entirely new tutorial for BayesOpt beyond Thompson Sampling, or should I just update the current tutorial with a brief example of how to use PI? I vote for the latter. Currently I'm writing a new tutorial, but there's plenty of overlap with the current one on BO. It's worth noting that the current tutorial doesn't use the built-in functionality for optimizing the acquisition function.
  2. I'm currently adding PI under the same folder as Thompson Sampling (i.e. utility_functions). I feel a better name for them might be acquisition_functions, but I'm happy either way. Let me know if I should be implementing it under a different abstraction than the abstract utility function.

@miguelgondu miguelgondu changed the title [Work in progress] Feature: Adds probability of improvement as an acquisition function [WIP] Feature: Adds probability of improvement as an acquisition function Jul 1, 2024
@miguelgondu miguelgondu changed the title [WIP] Feature: Adds probability of improvement as an acquisition function Feature: Adds probability of improvement as an acquisition function Jul 1, 2024
@miguelgondu miguelgondu marked this pull request as ready for review July 1, 2024 19:15
Contributor

@Thomas-Christie Thomas-Christie left a comment


Nice job - have just left a few minor comments.

"Objective posterior must be a ConjugatePosterior to draw an approximate sample."
)

objective_dataset = datasets[OBJECTIVE]
Contributor


Probably best to have something along the lines of

if objective_dataset.X is None or objective_dataset.n == 0:
    raise ValueError("Objective dataset must contain at least one item")

given that we use the objective dataset to find best_y.

Contributor Author


Addressed!

predictive_dist = objective_posterior.predict(x_test, objective_dataset)

# Assuming that the goal is to minimize the objective function
best_y = objective_dataset.y.min()
Contributor


It could be useful to define best_y as the minimum posterior mean value at any of the observed points, which would also handle the case where observations could be noisy.
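
A minimal sketch of this suggestion, reusing the names from the snippet above and assuming the same predict signature shown there:

```python
# Illustrative: take the smallest posterior mean over the observed inputs as the
# incumbent, rather than the smallest (possibly noisy) observed y value.
predictive_dist_at_observed = objective_posterior.predict(
    objective_dataset.X, objective_dataset
)
best_y = predictive_dist_at_observed.mean().min()
```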

Contributor Author


Sounds good. I've updated how we compute best_y. This is the first time I've seen it computed like that, though; if I'm not mistaken, other frameworks use the minimum of the dataset.

Contributor


Yeah - for instance this approach is mentioned in "A benchmark of kriging-based infill criteria for noisy optimization", section 3.2.

# Assuming that the goal is to minimize the objective function
best_y = objective_dataset.y.min()

return gaussian_cdf(
Contributor


Instead of implementing our own gaussian_cdf function, IMO it makes sense to use the existing normal distribution provided by TensorFlow Probability, i.e. normal = tfp.distributions.Normal(predictive_dist.mean(), predictive_dist.stddev()), and use the cdf defined on this.

Contributor Author


Another alternative would be to update GaussianDistribution, right? For now, I'll create a normal as you state.
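
For reference, a sketch of what that might look like, reusing the names from the snippets above (illustrative rather than the exact code merged in this PR):

```python
import tensorflow_probability.substrates.jax as tfp

# Gaussian marginals of the predictive distribution at the test points.
normal = tfp.distributions.Normal(
    loc=predictive_dist.mean(), scale=predictive_dist.stddev()
)

# Probability that the objective at each test point falls below best_y.
utility_values = normal.cdf(best_y)
```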

)


@pytest.mark.filterwarnings(
Contributor


Nice, this will help make testing additional utility functions a lot easier.

Contributor Author


I'm glad!

posterior = generate_dummy_conjugate_posterior(dataset)
posteriors = {OBJECTIVE: posterior}
datasets = {OBJECTIVE: dataset}
ts_utility_builder = utility_function_builder(**utility_function_kwargs)
Contributor


nit: Some of the names of variables in this file e.g. ts_utility_builder could do with updating now that these tests are more "utility function agnostic".

Contributor Author


Solved!

@miguelgondu
Contributor Author

Hi @Thomas-Christie,

Thanks for the review! I've addressed all the comments. Could I ask you to take a second look? Let me know if anything else comes to mind.

Cheers.

More precisely, given a predictive posterior distribution of the objective
function $`f`$, the probability of improvement at a test point $`x`$ is defined as:
$$`\text{PI}(x) = \text{Prob}[f(x) < f(x_{\text{best}})]`$$
where $`x_{\text{best}}`$ is the minimizer of $`f`$ in the dataset.
Contributor


Could update to something like "where $x_{\text{best}}$ is the minimiser of the posterior mean at previously observed values", to handle noisy observations.
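
In the docstring's notation, and writing $`\mu`$ for the posterior mean and $`\mathcal{D}`$ for the previously observed inputs (symbols not used in the original docstring), the suggestion amounts to something like:
$$`x_{\text{best}} = \arg\min_{x_i \in \mathcal{D}} \mu(x_i)`$$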

Contributor Author


Updated! Forgot to change this.

assert jnp.isclose(utility_values, expected_utility_values).all()


if __name__ == "__main__":
Contributor


nit: I think you forgot to remove this after debugging.

Contributor Author


Absolutely. My bad.

@miguelgondu
Contributor Author

miguelgondu commented Jul 5, 2024

All green! 🎉

@thomaspinder
Collaborator

Super work @miguelgondu! Thanks for reviewing, @Thomas-Christie

@thomaspinder thomaspinder merged commit fc45b9f into JaxGaussianProcesses:main Jul 6, 2024
11 checks passed
@miguelgondu miguelgondu deleted the feature-adds-probability-of-improvement branch July 6, 2024 06:44