Commit
Merge pull request #407 from pybop-team/382b-minkowski-cost
Including exponent in Minkowski cost
BradyPlanden authored Jul 15, 2024
2 parents 64312ae + 73f5477 commit 5491702
Showing 5 changed files with 86 additions and 34 deletions.
58 changes: 42 additions & 16 deletions examples/notebooks/comparing_cost_functions.ipynb
@@ -6,7 +6,7 @@
"source": [
"## Investigating different cost functions\n",
"\n",
"In this notebook, we take a look at the different cost function offered in PyBOP. Cost functions conventionally construct a distance metric between two mathematics sets (vectors), which is then used within PyBOP's optimisation algorthims. \n",
"In this notebook, we take a look at the different fitting cost functions offered in PyBOP. Cost functions for fitting problems conventionally describe the distance between two points (the target and the prediction) which is to be minimised via PyBOP's optimisation algorithms. \n",
"\n",
"First, we install and import the required packages below."
]
@@ -62,7 +62,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For this notebook, we need to construct parameters, a model and a problem class before we can compare differing cost functions. We start with two parameters, but this is an arbituary selection and can be expanded given the model and data in question."
"For this notebook, we need to construct parameters, a model and a problem class before we can compare differing cost functions. We start with two parameters, but this is an arbitrary selection and can be expanded given the model and data in question."
]
},
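
The two-parameter setup described above might look like the following sketch; the parameter names, priors and bounds here are illustrative assumptions, not the notebook's exact values:

    import pybop

    parameters = pybop.Parameters(
        pybop.Parameter(
            "Negative electrode active material volume fraction",
            prior=pybop.Gaussian(0.68, 0.05),  # assumed prior
            bounds=[0.5, 0.8],
        ),
        pybop.Parameter(
            "Positive electrode active material volume fraction",
            prior=pybop.Gaussian(0.58, 0.05),  # assumed prior
            bounds=[0.4, 0.7],
        ),
    )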
{
@@ -89,7 +89,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we will construct the Single Particle Model (SPM) with the Chen2020 parameter set, but like the above, this is an arbitruary selection and can be replaced with any PyBOP model."
"Next, we will construct the Single Particle Model (SPM) with the Chen2020 parameter set, but like the above, this is an arbitrary selection and can be replaced with any PyBOP model."
]
},
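
A minimal sketch of the model construction, assuming PyBOP's model API; any other PyBOP model can be substituted:

    # Chen2020 parameter set loaded via PyBaMM, then passed to the SPM class
    parameter_set = pybop.ParameterSet.pybamm("Chen2020")
    model = pybop.lithium_ion.SPM(parameter_set=parameter_set)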
{
@@ -163,7 +163,7 @@
"source": [
"### Sum of Square Errors and Root Mean Squared Error\n",
"\n",
"First, let's start with the easiest cost functions, the sum of squared errors (SSE) and the root mean squared error (RMSE). Constructing these classes is very concise in PyBOP, and only requires the problem class."
"First, let's start with two commonly-used cost functions: the sum of squared errors (SSE) and the root mean squared error (RMSE). Constructing these classes is very concise in PyBOP, and only requires the problem class."
]
},
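
For reference, constructing the two costs is one line each, assuming a fitting `problem` assembled from the model, parameters and dataset above:

    cost_SSE = pybop.SumSquaredError(problem)
    cost_RMSE = pybop.RootMeanSquaredError(problem)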
{
@@ -180,7 +180,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can investigate how these functions differ when fitting the parameters. To acquire the distance metric for each of these, we can simply use the constructed class in a call method, such as:"
"Now, we can investigate how these functions differ when fitting the parameters. To acquire the cost value for each of these, we can simply use the call method of the constructed class, such as:"
]
},
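
In other words, calling the constructed cost with a candidate parameter vector returns a scalar cost value; the candidate below is illustrative:

    x = [7.56e-05, 0.0065]  # hypothetical values for the two parameters
    print(cost_SSE(x), cost_RMSE(x))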
{
@@ -316,7 +316,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In this situation, it's clear that gradient of the SSE cost is much higher than the RMSE. This can be helpful for certain optimisation algorithms, specifically towards improving convergence performance within a predefine number of iterations. However, with incorrect hyperparameter values this can also result in the algorithm not converging due to sampling locations outside of the \"cost valley\"."
"In this situation, it's clear that the curvature of the SSE cost is greater than that of the RMSE. This can improve the rate of convergence for certain optimisation algorithms. However, with incorrect hyperparameter values, larger gradients can also result in the algorithm not converging due to sampling locations outside of the \"cost valley\", e.g. infeasible parameter values."
]
},
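
The steeper SSE surface follows from the identity SSE = N · RMSE²; a minimal numpy sketch of this relationship:

    import numpy as np

    rng = np.random.default_rng(0)
    r = rng.normal(size=100)  # hypothetical residual vector of length N
    sse = np.sum(r**2)
    rmse = np.sqrt(np.mean(r**2))
    assert np.isclose(sse, len(r) * rmse**2)  # SSE = N * RMSE**2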
{
@@ -325,14 +325,19 @@
"source": [
"### Minkowski Cost\n",
"\n",
"Next, we will investigate using the Minkowski cost function. This cost function provides a general formation of the above two cost functions, allowing for hyper parameter calibration on the cost function itself. The Minkowski cost takes the form,\n",
"Next, we will investigate using the Minkowski distance. The Minkowski cost takes a general form, which allows for hyperparameter calibration on the cost function itself, given by\n",
"\n",
"$\\mathcal{L} = \\displaystyle\\sum_{1}^N (\\hat{y}-y)^p$\n",
"$\\mathcal{L_p} = \\displaystyle \\Big(\\sum_i |\\hat{y_i}-y_i|^p\\Big)^{1/p}$\n",
"\n",
"For p = 1, this becomes L1Norm \n",
"For p = 2, this becomes L2Norm (SSE)\n",
"where p ≥ 0 is the order of the Minkowski distance.\n",
"\n",
"PyBOP offers a Minkowski class, which we will construct below. This class has an optional argument of `p` which designates the order in the above equation. This value can be a float, with the only requirement that it is not negative. First, let's reconstruct the SSE function with a `p` value of 2."
"For $p = 1$, it is the Manhattan distance.\n",
"For $p = 2$, it is the Euclidean distance.\n",
"For $p ≥ 1$, the Minkowski distance is a metric, but for $0<p<1$, note that the Minkowski distance is not a metric because the triangle inequality does not hold [[1]](https://en.wikipedia.org/wiki/Minkowski_distance), [[2]](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.minkowski.html#scipy.spatial.distance.minkowski).\n",
"\n",
"The RMSE can be computed by dividing the Euclidean distance by the square root of the length of the target vector, while the SSE is the square of the Euclidean distance.\n",
"\n",
"PyBOP offers a Minkowski class, which we will construct below. This class has an optional argument of `p` which designates the order in the above equation. This value can be a float, with the only requirement that it is not negative. First, let's confirm the relationship between the SSE, RMSE and the Minkowski distance with a `p` value of 2."
]
},
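
These relationships can be checked numerically; a self-contained sketch using synthetic vectors (not the notebook's data):

    import numpy as np

    def minkowski(a, b, p):
        # Minkowski distance of order p between vectors a and b
        return np.sum(np.abs(a - b) ** p) ** (1 / p)

    rng = np.random.default_rng(1)
    y_hat, y = rng.normal(size=50), rng.normal(size=50)
    sse = np.sum((y_hat - y) ** 2)
    rmse = np.sqrt(np.mean((y_hat - y) ** 2))
    assert np.isclose(minkowski(y_hat, y, 2), np.sqrt(sse))  # L2 = sqrt(SSE)
    assert np.isclose(minkowski(y_hat, y, 2), rmse * np.sqrt(len(y)))  # L2 = RMSE*sqrt(N)
    assert np.isclose(minkowski(y_hat, y, 1), np.sum(np.abs(y_hat - y)))  # Manhattan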
{
@@ -365,16 +370,36 @@
" y_minkowski.append(cost_minkowski([7.56e-05, i]))\n",
"\n",
"fig = go.Figure()\n",
"fig.add_trace(go.Scatter(x=x_range, y=y_SSE, mode=\"lines\", name=\"SSE\"))\n",
"fig.add_trace(go.Scatter(x=x_range, y=y_minkowski, mode=\"lines\", name=\"Minkowski\"))\n",
"fig.add_trace(\n",
" go.Scatter(\n",
" x=x_range,\n",
" y=np.asarray(y_RMSE) * np.sqrt(len(t_eval)),\n",
" mode=\"lines\",\n",
" name=\"RMSE*N\",\n",
" )\n",
")\n",
"fig.add_trace(\n",
" go.Scatter(\n",
" x=x_range,\n",
" y=np.sqrt(y_SSE),\n",
" mode=\"lines\",\n",
" line=dict(dash=\"dash\"),\n",
" name=\"sqrt(SSE)\",\n",
" )\n",
")\n",
"fig.add_trace(\n",
" go.Scatter(\n",
" x=x_range, y=y_minkowski, mode=\"lines\", line=dict(dash=\"dot\"), name=\"Minkowski\"\n",
" )\n",
")\n",
"fig.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As expected, these cost functions are equivalent. Now, let take a look at how the Minkowski cost changes for different orders of `p`."
"As expected, these lines lie on top of one another. Now, let's take a look at how the Minkowski cost changes for different orders, `p`."
]
},
{
@@ -414,16 +439,17 @@
" fig.add_trace(\n",
" go.Scatter(x=x_range, y=y_minkowski[k], mode=\"lines\", name=f\"Minkowski {_}\")\n",
" )\n",
"fig.update_yaxes(range=[0, np.max(y_minkowski[2])])\n",
"fig.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As seen above, the Minkowski cost function allows for a variety of different distance metrics to be created. This provides users with another hyper parameter for to calibrate for optimisation algorithm convergence. This addition does expand the global search space, and should be carefully considered before deciding upon.\n",
"As seen above, the Minkowski cost allows for a range of different cost functions to be created. This provides users with another hyperparameter to calibrate for optimisation algorithm convergence. This addition does expand the global search space, and should be carefully considered before deciding upon.\n",
"\n",
"In this notebook, we've shown the different distance metrics (cost functions) offered in PyBOP. Selection between these functions can effect the identified parameters in the case that the optimiser hyperparameter values are not properly calibrated. "
"In this notebook, we've shown the different fitting cost functions offered in PyBOP. Selection between these functions can affect the optimisation result in the case that the optimiser hyperparameter values are not properly calibrated. "
]
}
],
42 changes: 31 additions & 11 deletions pybop/costs/fitting_costs.py
@@ -172,34 +172,43 @@ def _evaluateS1(self, inputs: Inputs):
class Minkowski(BaseCost):
"""
The Minkowski distance is a generalisation of several distance metrics,
- including Euclidean and Manhattan distances. It is defined as:
+ including the Euclidean and Manhattan distances. It is defined as:
.. math::
- L_p(x, y) = (\\sum_i |x_i - y_i|^p)
+ L_p(x, y) = ( \\sum_i |x_i - y_i|^p )^(1/p)
- where p ≥ 1 is the order of the Minkowski metric.
+ where p > 0 is the order of the Minkowski distance. For p ≥ 1, the
+ Minkowski distance is a metric. For 0 < p < 1, it is not a metric, as it
+ does not satisfy the triangle inequality, although a metric can be
+ obtained by removing the (1/p) exponent.
Special cases:
* p = 1: Manhattan distance
* p = 2: Euclidean distance
- * p → ∞: Chebyshev distance
+ * p → ∞: Chebyshev distance (not implemented as yet)
This class implements the Minkowski distance as a cost function for
optimisation problems, allowing for flexible distance-based optimisation
across various problem domains.
- Attributes:
- p (float): The order of the Minkowski metric.
+ Attributes
+ ----------
+ p : float, optional
+     The order of the Minkowski distance.
"""

def __init__(self, problem, p: float = 2.0):
super().__init__(problem)
if p <= 0:  # p = 0 would make the 1/p exponent below undefined
raise ValueError(
"The order of the Minkowski metric must be greater than 0."
"The order of the Minkowski distance must be greater than 0."
)
- self.p = p
+ elif not np.isfinite(p):
+ raise ValueError(
+ "For p = infinity, an implementation of the Chebyshev distance is required."
+ )
+ self.p = float(p)

def _evaluate(self, inputs: Inputs, grad=None):
"""
@@ -222,6 +231,7 @@ def _evaluate(self, inputs: Inputs, grad=None):
e = np.asarray(
[
np.sum(np.abs(prediction[signal] - self._target[signal]) ** self.p)
+ ** (1 / self.p)
for signal in self.signal
]
)
@@ -253,10 +263,20 @@ def _evaluateS1(self, inputs):
return np.inf, self._de * np.ones(self.n_parameters)

r = np.asarray([y[signal] - self._target[signal] for signal in self.signal])
- e = np.sum(np.sum(np.abs(r) ** self.p))
- de = self.p * np.sum(np.sum(r ** (self.p - 1) * dy.T, axis=2), axis=1)
+ e = np.asarray(
+ [
+ np.sum(np.abs(y[signal] - self._target[signal]) ** self.p)
+ ** (1 / self.p)
+ for signal in self.signal
+ ]
+ )
+ de = np.sum(
+ np.sum(r ** (self.p - 1) * dy.T, axis=2)
+ / (e ** (self.p - 1) + np.finfo(float).eps),
+ axis=1,
+ )

- return e, de
+ return np.sum(e), de
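
# Illustrative sketch (editor's addition, not from this commit): a
# finite-difference check of the gradient form used above. For
# e = (sum_i |r_i|^p)^(1/p), the chain rule gives
# de/dtheta = sum_i sign(r_i) |r_i|^(p-1) (dr_i/dtheta) / e^(p-1),
# which the `de` expression computes for positive residuals (eps guards e == 0).
import numpy as np

def mink(r, p):
    # Minkowski distance of order p for a residual vector r
    return np.sum(np.abs(r) ** p) ** (1 / p)

rng = np.random.default_rng(0)
r, p, h = rng.normal(size=20) ** 2, 2.5, 1e-7  # positive residuals; dr/dtheta = 1
analytic = np.sum(r ** (p - 1)) / mink(r, p) ** (p - 1)
numeric = (mink(r + h, p) - mink(r, p)) / h
assert np.isclose(analytic, numeric, rtol=1e-4)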


class SumofPower(BaseCost):
10 changes: 5 additions & 5 deletions pybop/parameters/parameter.py
@@ -51,7 +51,7 @@ def __init__(
self.set_bounds(bounds)
self.margin = 1e-4

- def rvs(self, n_samples, random_state=None):
+ def rvs(self, n_samples: int = 1, random_state=None):
"""
Draw random samples from the parameter's prior distribution.
@@ -61,7 +61,7 @@ def rvs(self, n_samples, random_state=None):
Parameters
----------
n_samples : int
- The number of samples to draw.
+ The number of samples to draw (default: 1).
Returns
-------
@@ -332,7 +332,7 @@ def rvs(self, n_samples: int = 1) -> list:
Parameters
----------
n_samples : int
- The number of samples to draw.
+ The number of samples to draw (default: 1).
Returns
-------
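
# Illustrative sketch (editor's addition, not from this commit): with the
# new default, a single sample can be drawn without arguments. The
# parameter name and prior are assumptions for the example.
import pybop

param = pybop.Parameter(
    "Negative electrode active material volume fraction",
    prior=pybop.Gaussian(0.68, 0.05),
)
one_draw = param.rvs()  # default n_samples=1
many_draws = param.rvs(n_samples=100)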
@@ -417,7 +417,7 @@ def get_bounds_for_plotly(self):
bounds : numpy.ndarray
An array of shape (n_parameters, 2) containing the bounds for each parameter.
"""
- bounds = np.empty((len(self), 2))
+ bounds = np.zeros((len(self), 2))

for i, param in enumerate(self.param.values()):
if param.applied_prior_bounds:
Expand All @@ -427,7 +427,7 @@ def get_bounds_for_plotly(self):
UserWarning,
stacklevel=2,
)
- elif param.bounds is not None:
+ if param.bounds is not None:
bounds[i] = param.bounds
else:
raise ValueError("All parameters require bounds for plotting.")
3 changes: 2 additions & 1 deletion pybop/plotting/plot2d.py
@@ -1,5 +1,6 @@
import sys
import warnings
+ from typing import Union

import numpy as np
from scipy.interpolate import griddata
@@ -10,7 +11,7 @@
def plot2d(
cost_or_optim,
gradient: bool = False,
- bounds: np.ndarray = None,
+ bounds: Union[np.ndarray, None] = None,
steps: int = 10,
show: bool = True,
use_optim_log: bool = False,
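
The loosened annotation reflects the two ways `plot2d` is typically called; a hedged sketch, assuming a `cost` object built as in the notebook above:

    import numpy as np

    pybop.plot2d(cost, bounds=np.asarray([[0.5, 0.8], [0.4, 0.7]]), steps=15)
    pybop.plot2d(cost)  # bounds=None falls back on the parameters' bounds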
7 changes: 6 additions & 1 deletion tests/unit/test_cost.py
@@ -231,8 +231,13 @@ def test_costs(self, cost):
@pytest.mark.unit
def test_minkowski(self, problem):
# Incorrect order
- with pytest.raises(ValueError, match="The order of the Minkowski metric"):
+ with pytest.raises(ValueError, match="The order of the Minkowski distance"):
pybop.Minkowski(problem, p=-1)
+ with pytest.raises(
+ ValueError,
+ match="For p = infinity, an implementation of the Chebyshev distance is required.",
+ ):
+ pybop.Minkowski(problem, p=np.inf)

@pytest.mark.unit
def test_sumofpower(self, problem):
