added calibration metric #253
Conversation
Thanks, I left some comments.
bofire/surrogates/diagnostics.py
Outdated
"""Calculates the Spearman correlation coefficient between the models absolute error
and the uncertainty - non-linear correlation.

This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
From Ax?
Don't know. I copied it from the Fisher format; I'll just remove it.
bofire/surrogates/diagnostics.py
Outdated
"""Calculates the Pearson correlation coefficient between the models absolute error
and the uncertainty - linear correlation.

This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
Ax?
bofire/surrogates/diagnostics.py
Outdated
"""Calculates the Kendall correlation coefficient between the models absolute error
and the uncertainty - linear correlation.

This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
Ax?
bofire/surrogates/diagnostics.py
Outdated
standard_deviation: Optional[np.ndarray] = None,
) -> float:
"""Calculates the Negative log-likelihood of predicted data
This implementation is taken from Ax: https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py
Ax?
@@ -419,7 +431,12 @@ def test_CvResults2CrossValidationValues(cv_results):
else:
assert transformed["a"][i].standardDeviation is None
for m in metrics.columns:
assert metrics.loc[i, m] == transformed["a"][i].metrics[m]
if cv_results.results[i].standard_deviation is not None:
Why this?
Otherwise I get an `assert nan == nan` failure.
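Background on that failure: under IEEE 754 semantics, NaN never compares equal to itself, so an equality assertion between two NaN metric values always fails. A quick demonstration:

```python
import math

a = float("nan")
print(a == a)          # False: NaN != NaN by IEEE 754 rules
print(math.isnan(a))   # True: test for NaN with isnan, not ==
```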
cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=False)
assert cv.n_samples == 10
assert np.isnan(cv.get_metric(RegressionMetricsEnum.CALIBRATION))
Can you please also add tests for all the newly added metrics starting with an underscore? Maybe there were also test cases for them in the original implementation.
Please also check that the warning is raised and NaN is returned when no uncertainty is provided, for all the underscore methods you added.
Should I make a separate dictionary for UQ_metrics, like metrics?
bofire/surrogates/diagnostics.py
Outdated
np.ndarray: calibration score for each quantile.
"""
if standard_deviation is None:
warnings.warn("Calibration metric without standard deviation is not possible")
Raise a ValueError here if no standard deviation is provided; in the derived metrics, catch this error with a try/except, raise a warning, and return NaN.
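The suggested error-handling split could look roughly like this. This is a sketch with hypothetical names and a placeholder metric body, not the PR's code; only the raise-then-catch structure is the point:

```python
import warnings
import numpy as np

def _base_uq_metric(observed, predicted, standard_deviation=None):
    # Base implementation: fail loudly if uncertainty is missing.
    if standard_deviation is None:
        raise ValueError("UQ metric requires a standard deviation.")
    # Placeholder metric body; the real metric computation goes here.
    return float(np.mean(np.abs(observed - predicted) / standard_deviation))

def derived_uq_metric(observed, predicted, standard_deviation=None):
    # Derived metric: degrade gracefully to NaN with a warning.
    try:
        return _base_uq_metric(observed, predicted, standard_deviation)
    except ValueError:
        warnings.warn("No standard deviation provided, returning NaN.")
        return np.nan
```

This keeps the base functions strict (easy to test, no silent NaN propagation) while the user-facing layer stays forgiving.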
bofire/surrogates/diagnostics.py
Outdated
1.0 / (2.0 * observed.shape[0])
+ observed.shape[0] * np.log(2 * np.pi)
+ np.sum(np.log(predicted))
+ np.sum(np.square(predicted - observed) / standard_deviation)  # type: ignore
Are you sure this is correct? It should be standard_deviation to the power of two; this looks a bit wrong to me.
But it could also be that I'm overlooking something ;)
I just copied it; maybe they missed a bracket. I will double-check and correct accordingly.
I think you are right.
You are right, we should have separate dictionaries and enums. Thanks for the hint.
@@ -40,6 +40,13 @@ def permutation_importance(
RegressionMetricsEnum.MSD: -1.0,
RegressionMetricsEnum.PEARSON: 1.0,
RegressionMetricsEnum.SPEARMAN: 1.0,
RegressionMetricsEnum.PEARSON_UQ: 1.0,
Remove them here; they make no sense for permutation importance. You are right, we need a second enum and a second dictionary.
We would then iterate here only over the standard regression metrics.
@@ -43,3 +43,10 @@ class RegressionMetricsEnum(Enum):
PEARSON = "PEARSON"
SPEARMAN = "SPEARMAN"
FISHER = "FISHER"
PEARSON_UQ = "PEARSON_UQ"
Create a new enum named UQRegressionMetricsEnum and move the UQ metrics into it.
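The requested split could look roughly like this. It is only a sketch: the member names are taken from the diffs above, but the dispatch values shown are placeholders for the real `_..._UQ` callables in diagnostics.py:

```python
from enum import Enum

class UQRegressionMetricsEnum(Enum):
    PEARSON_UQ = "PEARSON_UQ"
    SPEARMAN_UQ = "SPEARMAN_UQ"
    KENDALL_UQ = "KENDALL_UQ"
    MAXIMUMCALIBRATION = "MAXIMUMCALIBRATION"
    MISCALIBRATIONAREA = "MISCALIBRATIONAREA"
    ABSOLUTEMISCALIBRATIONAREA = "ABSOLUTEMISCALIBRATIONAREA"

# A second dispatch table, parallel to the existing `metrics` dict, so that
# permutation importance and the standard CV loops touch only the plain
# regression metrics. The values here are name placeholders; in the real
# module they would be the metric functions themselves.
UQ_metrics = {
    member: member.value.lower() for member in UQRegressionMetricsEnum
}
```

Keeping two enums means type hints like `metric: RegressionMetricsEnum` stay precise, and no caller can accidentally request a UQ metric where no uncertainty is available.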
bofire/data_models/enum.py
Outdated
MAXIMUMCALIBRATION = "MAXIMUMCALIBRATION"
MISCALIBRATIONAREA = "MISCALIBRATIONAREA"
ABSOLUTEMISCALIBRATIONAREA = "ABSOLUTEMISCALIBRATIONAREA"
NLL = "NLL"
Just remove the complete NLL metric.
@@ -19,6 +19,13 @@
RegressionMetricsEnum.PEARSON: MaximizeObjective,
RegressionMetricsEnum.SPEARMAN: MaximizeObjective,
RegressionMetricsEnum.FISHER: MaximizeObjective,
RegressionMetricsEnum.PEARSON_UQ: MaximizeObjective,
This then has to be updated with the new type.
bofire/surrogates/diagnostics.py
Outdated
@@ -181,6 +453,13 @@ def _fisher_exact_test_p(
RegressionMetricsEnum.PEARSON: _pearson,
RegressionMetricsEnum.SPEARMAN: _spearman,
RegressionMetricsEnum.FISHER: _fisher_exact_test_p,
RegressionMetricsEnum.PEARSON_UQ: _pearson_UQ,
We also need to update here then, with UQRegressionMetricsEnum.
def test_cvresult_get_UQ_metric_invalid():
    cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=False)
    assert cv.n_samples == 10
    for metric in metrics.keys():
You have to iterate here only over the UQ metrics, as the warning is raised only for those.
cv = generate_cvresult(key="a", n_samples=10, include_standard_deviation=True)
assert cv.n_samples == 10
for metric in UQ_metrics.keys():
    cv.get_UQ_metric(metric=metric)
Suggested change:
-    cv.get_UQ_metric(metric=metric)
+    m = cv.get_UQ_metric(metric=metric)
+    assert type(m) == float
@jduerholt,
Added the calibration metrics from https://github.com/aspuru-guzik-group/dionysus/blob/main/dionysus/uncertainty_metrics.py