
Interpoint Constraints #313

Merged 10 commits on Nov 30, 2023
Conversation

jduerholt
Contributor

As discussed here, this adds the first interpoint constraint data model to BoFire.

From my perspective, a call method on interpoint constraints is somewhat ill-defined and not needed. What do you think about the jacobian method?

The integration into the Constraints data model is not yet complete, as the return value of the is_fulfilled method is currently a single bool and not a series of bools as in the intrapoint case. I will fix this later.

@Osburg @dlinzner-bcs @KappatC

@Osburg
Collaborator

Osburg commented Nov 26, 2023

I think without a call method we also do not need a jacobian, right? :D But a call method for BatchConstraints / interpoint constraints would be no less well-defined than for the NChooseK constraint (and there we do have a call method). In fact, I think there is a relatively natural way to define call and jacobian methods for batch constraints (I have not thought about other types of interpoint constraints): for each batch we evaluate the batch_size - 1 linear constraints assigned to that batch and compute some norm (or another suitable function) of these evaluations. The call method then returns this number for each batch. The jacobian could be defined as the derivative of this function with respect to each variable of each experiment contained in the batch.
An implementation (jacobian still missing) could look like this (it does not yet respect the changes from this PR; I wrote this piece of code down before you opened it, sorry):

import numpy as np
import pandas as pd
from typing import Literal

from pydantic import root_validator

# Assumed import path for BoFire's constraint base class.
from bofire.data_models.constraints.constraint import Constraint


class BatchConstraint(Constraint):
    """Batch constraint which is fulfilled iff batch_size subsequent
    experiments have the same value for the feature specified in feature.

    Attributes:
        feature (str): feature key the constraint operates on.
        batch_size (int): size of the batch.
    """

    type: Literal["BatchConstraint"] = "BatchConstraint"

    feature: str
    batch_size: int

    @root_validator(pre=False, skip_on_failure=True)
    def validate_batch_size(cls, values):
        """Validate that batch_size is larger than 1."""
        if values["batch_size"] < 2:
            raise ValueError("batch_size must be larger than 1")
        return values

    def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:
        values = self(experiments)
        return pd.Series(np.isclose(values, 0, atol=tol), index=values.index)

    def __call__(self, experiments: pd.DataFrame) -> pd.Series:
        """Numerically evaluates the constraint. Returns the distance to
        constraint fulfillment for each batch of size batch_size.

        Args:
            experiments (pd.DataFrame): Dataframe to evaluate the constraint on.

        Returns:
            pd.Series: Distance to reach constraint fulfillment, one entry per batch.
        """
        n_batches = (experiments.shape[0] + self.batch_size - 1) // self.batch_size
        feature_values = np.zeros(n_batches * self.batch_size)
        feature_values[: experiments.shape[0]] = experiments[self.feature].values
        # Pad a trailing incomplete batch with the first value of that batch,
        # so the padded entries do not contribute to the violation.
        feature_values[experiments.shape[0] :] = feature_values[-self.batch_size]
        feature_values = feature_values.reshape(n_batches, self.batch_size).T

        # Row k encodes one equality: first entry of the batch minus entry k + 1.
        batchwise_constraint_matrix = np.zeros(shape=(self.batch_size - 1, self.batch_size))
        batchwise_constraint_matrix[:, 0] = 1.0
        batchwise_constraint_matrix[:, 1:] = -np.eye(self.batch_size - 1)

        return pd.Series(
            np.linalg.norm(batchwise_constraint_matrix @ feature_values, axis=0, ord=2) ** 2,
            index=[f"batch_{i}" for i in range(n_batches)],
        )
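Since the per-batch value is the squared 2-norm ||A v||^2 of the linear constraint evaluations, the jacobian has the closed form 2 AᵀA v per batch. Here is a minimal sketch of the missing piece (the helper name batch_jacobian and the standalone-function form are my own, not part of the PR; the array layout mirrors the reshaped feature_values above):

```python
import numpy as np

def batch_jacobian(feature_values: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Gradient of ||A v||**2 for each batch.

    Args:
        feature_values: array of shape (batch_size, n_batches), one column
            per batch, as produced by the reshape in __call__ above.
        A: batchwise constraint matrix of shape (batch_size - 1, batch_size).

    Returns:
        Array of shape (n_batches, batch_size): derivative of the per-batch
        value with respect to each experiment in that batch.
    """
    # d/dv ||A v||^2 = 2 A^T A v, applied column-wise to all batches at once
    return (2.0 * A.T @ A @ feature_values).T
```

For a batch of two experiments with values (1, 3) and A = [[1, -1]], the batch value is (1 - 3)^2 = 4 and the gradient is (-4, 4), matching d/dv1 (v1 - v2)^2 = 2 (v1 - v2).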
 

On the other hand, if we don't really need it, we can also just drop it ... :D
What do you think @jduerholt @dlinzner-bcs @KappatC ?

@jduerholt
Contributor Author

@Osburg: thanks, I do not think that we actually need it, but as you have it here already, let us add it, or? I would finalize this PR without the implementation, just raising NotImplementedError for __call__ and jacobian, and then you add your implementation in a separate PR, ok for you?

@Osburg
Collaborator

Osburg commented Nov 27, 2023

Sure, this is fine! :)

@jduerholt
Contributor Author

@dlinzner-bcs: should be fine now from my side and ready for review.

@jduerholt
Contributor Author

@dlinzner-bcs @KappatC: The polytope sampler should now also work, I just need to fix some edge cases ;) I will let you know once it is finished ...

@jduerholt
Contributor Author

I will just close this PR and reopen it, as there are some GitHub problems ...

@jduerholt jduerholt closed this Nov 27, 2023
@jduerholt jduerholt reopened this Nov 27, 2023
Contributor

@dlinzner-bcs left a comment


Looks good otherwise. Excuse my naive questions :).

bofire/data_models/constraints/interpoint.py (resolved)
tests/bofire/utils/test_torch_tools.py (resolved)
Collaborator

@KappatC left a comment


Thanks for the code. Looks pretty good to me. Here are some minor understanding questions.

bofire/data_models/constraints/constraint.py (resolved)
bofire/utils/torch_tools.py (resolved)
tests/bofire/data_models/specs/constraints.py (outdated, resolved)
tests/bofire/utils/test_torch_tools.py (resolved)
@jduerholt jduerholt merged commit cb12051 into main Nov 30, 2023
10 checks passed
@jduerholt jduerholt deleted the feature/interpoint_constraint branch November 30, 2023 13:13
all_indices = torch.arange(
i * multiplicity, min((i + 1) * multiplicity, n_candidates)
)
for k in range(len(all_indices) - 1):
Collaborator


One more thing: could you please explain to me why we need a pair of indices each time, and not assign the feat_idx to each k entry, with k in range(len(all_indices))? Thanks in advance.

Contributor Author


Hmm, I am not sure that I understand you correctly. We always need a pair of indices because we have several equalities of the form feat_in_row_0 - feat_in_row_k == 0. This makes two indices per equality. And each index is itself a pair, because we have to index both the correct row and the correct column.

Does this help?
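To make the pairing concrete, here is a torch-free sketch of the enumeration in the snippet above (the helper name interpoint_equality_pairs is made up for illustration; the actual code uses torch.arange):

```python
def interpoint_equality_pairs(n_candidates: int, multiplicity: int, feat_idx: int):
    """Enumerate the index pairs for the equalities
    candidates[first_row, feat_idx] - candidates[row_k, feat_idx] == 0.

    Each equality needs a PAIR of indices (the batch's first row and row k),
    and each index is itself a (row, column) pair.
    """
    pairs = []
    n_batches = (n_candidates + multiplicity - 1) // multiplicity
    for i in range(n_batches):
        all_indices = list(
            range(i * multiplicity, min((i + 1) * multiplicity, n_candidates))
        )
        for k in range(len(all_indices) - 1):
            # one equality per k: the batch's first row vs. row k + 1
            pairs.append(((all_indices[0], feat_idx), (all_indices[k + 1], feat_idx)))
    return pairs
```

With five candidates, multiplicity 2, and feature column 3, this yields the pairs ((0, 3), (1, 3)) and ((2, 3), (3, 3)); the last, incomplete batch contributes no equality.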
