
added calibration metric #253

Merged: 10 commits into experimental-design:main, Aug 14, 2023

Conversation

swagataroy123 (Contributor)
@jduerholt (Contributor) left a comment:

Thanks, I left some comments.

"""Calculates the Spearman correlation coefficient between the models absolute error
and the uncertainty - non-linear correlation.

This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
jduerholt (Contributor):

From ax?

swagataroy123 (Contributor, Author):

Don't know. I copied it from the Fisher format. I will just remove it.

"""Calculates the Pearson correlation coefficient between the models absolute error
and the uncertainty - linear correlation.

This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
jduerholt (Contributor):

Ax?

"""Calculates the Kendall correlation coefficient between the models absolute error
and the uncertainty - linear correlation.

This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
jduerholt (Contributor):

ax?
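For context, a minimal sketch of what these three correlation-based UQ metrics typically compute. Only _pearson_UQ appears by name in the diffs of this PR; the other function names and the exact signatures are assumptions, with observed, predicted, and standard_deviation taken to be 1-D NumPy arrays:

import numpy as np
from scipy import stats

def _pearson_UQ(observed, predicted, standard_deviation):
    # linear correlation between the model's absolute error and its predicted uncertainty
    abs_error = np.abs(observed - predicted)
    return float(stats.pearsonr(abs_error, standard_deviation)[0])

def _spearman_UQ(observed, predicted, standard_deviation):
    # rank (monotonic, hence "non-linear") correlation between absolute error and uncertainty
    abs_error = np.abs(observed - predicted)
    return float(stats.spearmanr(abs_error, standard_deviation)[0])

def _kendall_UQ(observed, predicted, standard_deviation):
    # Kendall's tau, another rank correlation between absolute error and uncertainty
    abs_error = np.abs(observed - predicted)
    return float(stats.kendalltau(abs_error, standard_deviation)[0])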

standard_deviation: Optional[np.ndarray] = None,
) -> float:
"""Calculates the Negative log-likelihood of predicted data
This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
jduerholt (Contributor):

ax?

@@ -419,7 +431,12 @@ def test_CvResults2CrossValidationValues(cv_results):
else:
assert transformed["a"][i].standardDeviation is None
for m in metrics.columns:
assert metrics.loc[i, m] == transformed["a"][i].metrics[m]
if cv_results.results[i].standard_deviation is not None:
jduerholt (Contributor):

Why this?

swagataroy123 (Contributor, Author):

Otherwise I get an assert nan == nan error.
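For reference, this is the standard NaN pitfall: NaN never compares equal to itself, so a plain equality assert fails as soon as a metric is NaN and the comparison has to be guarded, e.g.:

import numpy as np

# NaN is not equal to itself, so a plain equality check is always False
assert not (np.nan == np.nan)
# a guarded comparison handles both the numeric and the NaN case
x, y = np.nan, np.nan
assert (x == y) or (np.isnan(x) and np.isnan(y))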

cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=False)
assert cv.n_samples == 10
assert np.isnan(cv.get_metric(RegressionMetricsEnum.CALIBRATION))

jduerholt (Contributor):

Can you please also add tests for all the newly added metrics starting with an _? Maybe the original implementation also had test cases for them.

jduerholt (Contributor):

Please also check that the warning is raised and nan is returned when no uncertainty is provided, for all the _ methods that you added.

swagataroy123 (Contributor, Author):

Should I make a separate dictionary for the UQ_metrics, like metrics?

np.ndarray: calibration score for each quantile.
"""
if standard_deviation is None:
warnings.warn("Calibration metric without standard deviation is not possible")
jduerholt (Contributor):

Raise a ValueError here if no standard deviation is provided, and in the derived metrics catch this error with a try/except, raise a warning, and return nan.
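A hedged sketch of the suggested pattern; the function names and the calibration computation itself are illustrative, not the PR's final code:

import warnings
import numpy as np
from scipy import stats

def _calibration(observed, predicted, standard_deviation=None, n_bins=10):
    # base metric: raise instead of warning when no uncertainty is available
    if standard_deviation is None:
        raise ValueError("Calibration metrics require a standard deviation.")
    quantiles = np.linspace(0.05, 0.95, n_bins)
    z = stats.norm.ppf(0.5 + quantiles / 2.0)  # half-widths of the central intervals
    inside = np.abs(observed - predicted)[:, None] <= standard_deviation[:, None] * z[None, :]
    return inside.mean(axis=0) - quantiles  # observed minus expected coverage per quantile

def _absolute_miscalibration_area(observed, predicted, standard_deviation=None):
    # derived metric: catch the error, warn, and return nan, as suggested above
    try:
        cal = _calibration(observed, predicted, standard_deviation)
    except ValueError:
        warnings.warn("Calibration metric without standard deviation is not possible")
        return np.nan
    return float(np.mean(np.abs(cal)))  # simple average as a stand-in for the area integral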

1.0 / (2.0 * observed.shape[0])
+ observed.shape[0] * np.log(2 * np.pi)
+ np.sum(np.log(predicted))
+ np.sum(np.square(predicted - observed) / standard_deviation) # type: ignore
jduerholt (Contributor):

Are you sure this is correct? It should be standard_deviation to the power of two. This looks a bit wrong to me.

jduerholt (Contributor):

But it could also be that I am overlooking something ;)

swagataroy123 (Contributor, Author):

I just copied it. Maybe they missed a bracket. I will double-check and correct it accordingly.

swagataroy123 (Contributor, Author):

I think you are right.
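For reference, a sketch of the averaged Gaussian negative log-likelihood this formula seems to aim for: the 1/(2N) prefactor multiplies the whole bracketed sum, and the variance (standard_deviation squared) appears both inside the log term and in the denominator. This is only an illustration of the textbook formula being discussed, not the PR's final code (the NLL metric is dropped later in the review):

import numpy as np

def _gaussian_nll(observed, predicted, standard_deviation):
    # mean negative log-likelihood under N(predicted, standard_deviation**2)
    n = observed.shape[0]
    variance = np.square(standard_deviation)
    return float(
        1.0
        / (2.0 * n)
        * (
            n * np.log(2.0 * np.pi)
            + np.sum(np.log(variance))
            + np.sum(np.square(predicted - observed) / variance)
        )
    )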

@jduerholt (Contributor) left a comment:

You are right, we should have separate dictionaries and enums. Thanks for the hint.

@@ -40,6 +40,13 @@ def permutation_importance(
RegressionMetricsEnum.MSD: -1.0,
RegressionMetricsEnum.PEARSON: 1.0,
RegressionMetricsEnum.SPEARMAN: 1.0,
RegressionMetricsEnum.PEARSON_UQ: 1.0,
jduerholt (Contributor):

Remove them here; it makes no sense for permutation importance. You are right, we need a second enum and a second dictionary.

jduerholt (Contributor):

We would then iterate here only over the standard regression metrics.

@@ -43,3 +43,10 @@ class RegressionMetricsEnum(Enum):
PEARSON = "PEARSON"
SPEARMAN = "SPEARMAN"
FISHER = "FISHER"
PEARSON_UQ = "PEARSON_UQ"
jduerholt (Contributor):

Create a new enum named UQRegressionMetricsEnum and move the UQ metrics into it.

MAXIMUMCALIBRATION = "MAXIMUMCALIBRATION"
MISCALIBRATIONAREA = "MISCALIBRATIONAREA"
ABSOLUTEMISCALIBRATIONAREA = "ABSOLUTEMISCALIBRATIONAREA"
NLL = "NLL"
jduerholt (Contributor):

Just remove the complete NLL metric.
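A sketch of what the requested enum could look like. Only PEARSON_UQ and the three calibration names appear in the diffs above; SPEARMAN_UQ and KENDALL_UQ are assumed members, and NLL is left out as requested:

from enum import Enum

class UQRegressionMetricsEnum(Enum):
    # uncertainty-quantification metrics, kept separate from RegressionMetricsEnum
    PEARSON_UQ = "PEARSON_UQ"
    SPEARMAN_UQ = "SPEARMAN_UQ"  # assumed name
    KENDALL_UQ = "KENDALL_UQ"    # assumed name
    MAXIMUMCALIBRATION = "MAXIMUMCALIBRATION"
    MISCALIBRATIONAREA = "MISCALIBRATIONAREA"
    ABSOLUTEMISCALIBRATIONAREA = "ABSOLUTEMISCALIBRATIONAREA"
    # NLL intentionally omitted, as requested above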

@@ -19,6 +19,13 @@
RegressionMetricsEnum.PEARSON: MaximizeObjective,
RegressionMetricsEnum.SPEARMAN: MaximizeObjective,
RegressionMetricsEnum.FISHER: MaximizeObjective,
RegressionMetricsEnum.PEARSON_UQ: MaximizeObjective,
jduerholt (Contributor):

This then has to be updated with the new type.
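A sketch of how that mapping could be extended for the new enum. The dictionary name is hypothetical, and the direction chosen for each calibration metric is an assumption (correlations maximized, as in the diff above; miscalibration areas minimized):

# assuming MaximizeObjective / MinimizeObjective are importable as in the surrounding module
objectives_categorized_UQ = {
    UQRegressionMetricsEnum.PEARSON_UQ: MaximizeObjective,
    UQRegressionMetricsEnum.SPEARMAN_UQ: MaximizeObjective,
    UQRegressionMetricsEnum.KENDALL_UQ: MaximizeObjective,
    UQRegressionMetricsEnum.MAXIMUMCALIBRATION: MinimizeObjective,
    UQRegressionMetricsEnum.MISCALIBRATIONAREA: MinimizeObjective,
    UQRegressionMetricsEnum.ABSOLUTEMISCALIBRATIONAREA: MinimizeObjective,
}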

@@ -181,6 +453,13 @@ def _fisher_exact_test_p(
RegressionMetricsEnum.PEARSON: _pearson,
RegressionMetricsEnum.SPEARMAN: _spearman,
RegressionMetricsEnum.FISHER: _fisher_exact_test_p,
RegressionMetricsEnum.PEARSON_UQ: _pearson_UQ,
jduerholt (Contributor):

We also need to update here then, with UQRegressionMetricsEnum.
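A sketch of the second lookup dictionary. The name UQ_metrics matches the test snippet below; the callables for the calibration metrics are assumed names:

# second lookup, mirroring the existing `metrics` dict but keyed by UQRegressionMetricsEnum
UQ_metrics = {
    UQRegressionMetricsEnum.PEARSON_UQ: _pearson_UQ,
    UQRegressionMetricsEnum.SPEARMAN_UQ: _spearman_UQ,    # assumed name
    UQRegressionMetricsEnum.KENDALL_UQ: _kendall_UQ,      # assumed name
    UQRegressionMetricsEnum.MAXIMUMCALIBRATION: _maximum_calibration,                    # assumed name
    UQRegressionMetricsEnum.MISCALIBRATIONAREA: _miscalibration_area,                    # assumed name
    UQRegressionMetricsEnum.ABSOLUTEMISCALIBRATIONAREA: _absolute_miscalibration_area,   # assumed name
}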

def test_cvresult_get_UQ_metric_invalid():
cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=False)
assert cv.n_samples == 10
for metric in metrics.keys():
jduerholt (Contributor):

You have to iterate here only over the UQ metrics, as the warning is only raised for those.
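A sketch of that test, assuming the generate_cvresult helper and get_UQ_metric method shown in the snippets, and that the metrics warn via warnings.warn (i.e. a UserWarning):

import numpy as np
import pytest

def test_cvresult_get_UQ_metric_invalid():
    # without a standard deviation, every UQ metric should warn and return nan
    cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=False)
    assert cv.n_samples == 10
    for metric in UQ_metrics.keys():
        with pytest.warns(UserWarning):
            value = cv.get_UQ_metric(metric=metric)
        assert np.isnan(value)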

cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=True)
assert cv.n_samples == 10
for metric in UQ_metrics.keys():
cv.get_UQ_metric(metric=metric)
jduerholt (Contributor):

Suggested change:
- cv.get_UQ_metric(metric=metric)
+ m = cv.get_UQ_metric(metric=metric)
+ assert type(m) == float
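With the suggestion applied, the valid-case test could look like the sketch below (using isinstance, the more idiomatic spelling of the same check; the test name is assumed):

def test_cvresult_get_UQ_metric_valid():
    cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=True)
    assert cv.n_samples == 10
    for metric in UQ_metrics.keys():
        m = cv.get_UQ_metric(metric=metric)
        assert isinstance(m, float)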

bofire/surrogates/diagnostics.py: two outdated review threads, resolved.
@jduerholt merged commit 8eb3fe4 into experimental-design:main on Aug 14, 2023
9 of 10 checks passed