Domains docu #369

Open
wants to merge 19 commits into
base: main

@Osburg (Collaborator) commented Mar 11, 2024

No description provided.

@Osburg (Collaborator, Author) commented Mar 11, 2024

NonlinearInequalityConstraint(expression="x1**2 + x2**2 - x3", features=["x1","x2","x3"])

@R-M-Lee (Contributor) commented Mar 13, 2024

Hi @KappatC and @Osburg, is this ready for review? I am happy to be the reviewer when needed. Otherwise we should convert it to a draft.

@KappatC (Collaborator) commented Mar 13, 2024

> Hi @KappatC and @Osburg, is this ready for review? I am happy to be the reviewer when needed. Otherwise we should convert to draft

Imo there are a few things to ask @jduerholt, but apart from that it should be OK. @Osburg, what do you say?

@R-M-Lee (Contributor) commented Mar 13, 2024

we can ask questions here... Johannes will have been pinged when you mentioned him just now anyway

@jduerholt (Contributor)

Just shoot your question ;)

@Osburg (Collaborator, Author) commented Mar 14, 2024

Hi @jduerholt :) Yes, we still had a few questions:

  • I think neither of us has ever used descriptor inputs (as in `ContinuousDescriptorInput` and `CategoricalDescriptorInput`). What are these for? Or can you point us to an explanation?
  • What is the purpose of `TaskInput`s?
  • We've seen that `CloseToTargetObjective`s seem to be suitable for multiobjective strategies, while `TargetObjective`s are not. What is the difference between them (apart from their different implementations of `__call__()`)?

@KappatC Did I forget anything?
@jduerholt if it is easier for you to just complete the missing parts yourself, that is fine with me as well. But an explanation would be appreciated so that we know better in the future.

Cheers
Aaron

@KappatC (Collaborator) commented Mar 14, 2024

Thanks @Osburg for summarizing. Yes, that should be it; they are also marked as todos in the file (they are the only ones, apart from adding some links to the rest of the documentation once we have everything). One more general thing, @jduerholt: could you double-check that the list of inputs/objectives is complete and that they are all ready to be used? :)

@jduerholt (Contributor)

Hi @KappatC and @Osburg,

regarding your questions:

  • `CategoricalDescriptorInput`: Imagine a categorical input with, for example, 10 different categories, where every category corresponds to a specific material. With `CategoricalDescriptorInput` one can provide continuous encodings for the categories via so-called descriptors. In our example with the ten materials, the descriptors could be, for example, `density` and `hardness`. Every material/category is assigned a number for `density` and `hardness`, in the hope that these two properties describe the material properly. When fitting a GP, one can then use this two-dimensional vector to describe the material instead of a ten-dimensional one-hot encoding, which results in a dimensionality reduction. Of course, this only makes sense if the descriptors actually correlate with the desired quantities.
  • `ContinuousDescriptorInput`: Ignore it; it is used nowhere and I am still not sure if we will ever use it. Maybe we should just remove it and add it again when it is really used, so as not to confuse people. What do you think?
  • `TaskInput`: Should be used for `MultiTaskGP`s and `MultiFidelityGP`s. Currently under implementation here: [Initial attempt to incorporate MultiTask GPs #353](https://github.com/experimental-design/bofire/pull/353). That PR also explains why it is implemented the way it is.
  • `CloseToTargetObjective` and `TargetObjective`: `CloseToTargetObjective` measures the difference to the target value, which is something that makes sense to minimize in a true multiobjective optimization and to include in the Pareto front. `TargetObjective`, in contrast, is of type `ConstrainedObjective`, like `MaximizeSigmoid` or `MinimizeSigmoid`, so it yields 1 if the value is in the target region and falls asymptotically towards zero outside it.
  • Note that objectives of type `ConstrainedObjective` can also be used in multiobjective optimization, but you need at least two objectives of type `Minimize`, `Maximize` or `CloseToTarget`.
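A minimal sketch of the dimensionality-reduction idea behind `CategoricalDescriptorInput`, written in plain Python rather than against BoFire's API. The material names, the descriptor values, and the helper functions `one_hot` and `descriptor_encoding` are all invented for illustration.

```python
# Hypothetical materials with made-up (density, hardness) descriptor values.
DESCRIPTORS = {
    f"material_{i}": (1.0 + 0.5 * i, 20.0 + 10.0 * i) for i in range(10)
}
CATEGORIES = list(DESCRIPTORS)


def one_hot(category):
    """Encode a category as a vector with one entry per category."""
    return [1.0 if c == category else 0.0 for c in CATEGORIES]


def descriptor_encoding(category):
    """Encode a category by its continuous descriptors instead."""
    density, hardness = DESCRIPTORS[category]
    return [density, hardness]


print(len(one_hot("material_3")))              # 10 model inputs
print(len(descriptor_encoding("material_3")))  # 2 model inputs
```

The surrogate model then sees two continuous inputs instead of ten binary ones, which only helps if density and hardness actually carry information about the output.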

I hope this helps! If you need more, just ask again. If that is OK with you, I would prefer that you finish it, and I review/add/modify at the end.

Best,

Johannes

@KappatC (Collaborator) commented Mar 25, 2024

Hey @jduerholt, thanks for the explanations. I tried to adjust the text accordingly, but feel free to make any further changes. A few remarks/todos left:

  1. A link to the strategy documentation is missing. I kept it as a todo in the text because I am unsure of the status there.
  2. I read the thread with Jose's implementation/comments for the `TaskInput`s. I am not sure whether this is something that is used at the moment. There is another todo in the text at that point. I do not feel confident explaining this; if you could do it, that would be great, otherwise I would simply leave it out for now.
  3. Could you please double-check that the example with `CategoricalDescriptorInput` is correct?
  4. I think I now understand the difference between the implementations of `CloseToTargetObjective` and `TargetObjective`, so thanks for explaining. To be honest, conceptually I am still sceptical about whether we need to differentiate between the two, but that is probably another topic. I tried to keep the text close to your explanation, but feel free to change things wherever you think it is appropriate.
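For reference, the difference between the two objective types described in this thread can be sketched in plain Python. This is a simplified illustration, not BoFire's implementation: the function names, the sigmoid-product form, and the parameters `tolerance` and `steepness` are assumptions chosen to reproduce the described behaviour (a distance to the target versus a value near 1 inside the target region that decays towards 0 outside).

```python
import math


def close_to_target(y, target, exponent=1.0):
    """Distance-like measure: 0 at the target, growing with the deviation."""
    return abs(y - target) ** exponent


def target_sigmoid(y, target, tolerance, steepness):
    """Roughly 1 inside [target - tolerance, target + tolerance], near 0 outside."""
    lower = 1.0 / (1.0 + math.exp(-steepness * (y - (target - tolerance))))
    upper = 1.0 - 1.0 / (1.0 + math.exp(-steepness * (y - (target + tolerance))))
    return lower * upper


# Compare both measures for a few outputs around a target of 5.
for y in (5.0, 5.5, 8.0):
    distance = close_to_target(y, target=5.0)
    desirability = target_sigmoid(y, target=5.0, tolerance=1.0, steepness=10.0)
    print(y, distance, round(desirability, 4))
```

The first function keeps growing as the output moves away from the target, so it can be traded off against other objectives on a Pareto front. The second saturates, so it behaves more like a soft constraint.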

I hope I did not miss anything. Apart from this, the rest should in my opinion be good to go :)

Best,
Chryssa

@jduerholt (Contributor)

Hi @KappatC,

  1. Just leave the strategies doc as a todo; I will take care of it at some point.
  2. Just leave it out for now; we will add it when we really start using it.
  3. Looks good to me.
  4. Also looks good to me.

If you wonder why the tests are failing: we test the code snippets in the documentation, and it seems that some snippets are ill-formatted or buggy. If you have substantial problems there, just tell me and I will have a look.
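To check snippets locally before pushing, one generic approach is to extract the fenced Python blocks from a markdown file and execute each one. The sketch below does that with the standard library only. Whether the project's test suite works this way is an assumption, and `extract_snippets` and `run_snippets` are hypothetical helper names.

```python
import re

FENCE = "`" * 3  # built at runtime so this example does not contain a literal fence
SNIPPET_PATTERN = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_snippets(markdown):
    """Return the source of every fenced Python block in the markdown text."""
    return SNIPPET_PATTERN.findall(markdown)


def run_snippets(markdown):
    """Execute each snippet in a fresh namespace and collect failures."""
    failures = []
    for index, code in enumerate(extract_snippets(markdown)):
        try:
            exec(code, {})
        except Exception as err:  # report the problem instead of stopping
            failures.append((index, repr(err)))
    return failures


sample_doc = (
    "Some text\n"
    + FENCE + "python\nx = 1 + 1\n" + FENCE + "\n"
    + "More text\n"
    + FENCE + "python\nraise ValueError('bad snippet')\n" + FENCE + "\n"
)
print(run_snippets(sample_doc))
```

Each snippet runs in its own namespace here, so a snippet that relies on names defined in an earlier block would be reported as failing.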

Best,

Johannes

4 participants