Unify and clean code style and docstring #68

Merged
merged 35 commits into from
Aug 1, 2023
Changes from 2 commits
51eceb8
Add reference of modules indices
dachengx Jul 30, 2023
ada2cdc
Update docstring of StatisticalModel
dachengx Jul 30, 2023
274be84
Update docstring of BlueiceExtendedModel
dachengx Jul 30, 2023
8e69318
Update docstring of GaussianModel
dachengx Jul 30, 2023
fe72dc9
Update docstring of Parameters
dachengx Jul 30, 2023
3c6f281
Update docstring of BlueiceExtendedModel
dachengx Jul 30, 2023
ee0da2f
More code style change
dachengx Jul 31, 2023
de83227
Try simplified name
dachengx Jul 31, 2023
2e4c06c
Change it back
dachengx Jul 31, 2023
0237c1f
Add codes to placeholder.ipynb
dachengx Jul 31, 2023
617fa8b
Recover symbolic link
dachengx Jul 31, 2023
e9515c4
Minor change
dachengx Jul 31, 2023
6772eca
Try plot the logpdf
dachengx Jul 31, 2023
d4e0f20
Update odcstring of utils
dachengx Jul 31, 2023
7de6852
Use napoleon and todo as sphinx extension
dachengx Jul 31, 2023
908b49d
Update model.py docstring to Google style
dachengx Jul 31, 2023
7bc1389
Update docstring of Parameters and GaussianModel
dachengx Jul 31, 2023
6c303f0
Update docstring of BlueiceExtendedModel
dachengx Jul 31, 2023
81069f6
Update utils and BlueiceDataGenerator docstring
dachengx Jul 31, 2023
02273ea
Happier code style
dachengx Jul 31, 2023
ca54caa
Happier code style
dachengx Jul 31, 2023
c4fdf7b
Minor change
dachengx Jul 31, 2023
714fb7e
Fix some docstring as suggested
dachengx Jul 31, 2023
f1b33a5
Minor change
dachengx Jul 31, 2023
d834e81
Update docstrings
dachengx Jul 31, 2023
0311397
Minor change
dachengx Jul 31, 2023
90b86c2
Minor change, drop DAR105
dachengx Jul 31, 2023
0588fe9
Minor change
dachengx Jul 31, 2023
7f5657b
Only check code style when PR opened
dachengx Jul 31, 2023
10b8d97
Minor fix
dachengx Aug 1, 2023
4529a7f
Update docstring to tell the relation between public and private method
dachengx Aug 1, 2023
01bb5ed
Remove kwargs from make_objective
dachengx Aug 1, 2023
702353b
Update docstrings of GaussianModel, remove minus argument of Statisti…
dachengx Aug 1, 2023
42e4d7f
Debug
dachengx Aug 1, 2023
2682d9f
Minor change
dachengx Aug 1, 2023
3 changes: 2 additions & 1 deletion .gitignore
@@ -23,5 +23,6 @@ __pycache__
build
docs/build
docs/source/_build
docs/source/build
debug.py
docs/source/reference/*
docs/source/reference/release_notes.rst
214 changes: 136 additions & 78 deletions alea/model.py
@@ -16,41 +16,42 @@
class StatisticalModel:
"""
Class that defines a statistical model.
The statistical model contains two parts that you must define yourself:
- a likelihood function, ll(self, parameter_1, parameter_2... parameter_n):
a function of a set of named parameters
returns a float expressing the loglikelihood for observed data
given these parameters
- a data generation method generate_data(self, parameter_1, parameter_2... parameter_n):
a function of the same set of named parameters
returns a full data set:

Methods:
__init__
required to implement:

_ll
_generate_data

optional to implement:
get_expectation_values

Implemented here:
store_data
fit
get_parameter_list

Other members:
_data = None
_config = {}
_confidence_level = 0.9
_confidence_interval_kind = "upper, lower, central"
(if your threshold is the FC threshold, "central" gives you the unified interval)
_confidence_interval_threshold: function that defines the Neyman threshold for limit calculations
_fit_guess = {}
_fixed_parameters = []

- The statistical model contains two parts that you must define yourself:
- a likelihood function
ll(self, parameter_1, parameter_2... parameter_n):
A function of a set of named parameters which
returns a float expressing the loglikelihood for observed data given these parameters.
- a data generation function
generate_data(self, parameter_1, parameter_2... parameter_n):
A function of the same set of named parameters which returns a full data set.
- Methods that you must implement:
- _ll
- _generate_data
- Methods that you may implement:
- get_expectation_values
- Methods that already exist here:
- ll
- store_data
- fit
- get_parameter_list

:param data: pre-set data of the model
:param parameter_definition: definition of the parameters of the model
:type parameter_definition: dict or list
:param confidence_level: confidence level for confidence intervals
:type confidence_level: float
:param confidence_interval_kind: kind of confidence interval to compute
:type confidence_interval_kind: str
:param confidence_interval_threshold: threshold for confidence interval
:type confidence_interval_threshold: Callable[[float], float]
"""

_data = None
_config = dict()
_confidence_level = 0.9
_confidence_interval_kind = None

def __init__(
self,
data = None,
@@ -72,13 +73,16 @@ def __init__(
if data is not None:
self.data = data
self._confidence_level = confidence_level
if confidence_interval_kind not in {"central", "upper", "lower"}:
raise ValueError("confidence_interval_kind must be one of central, upper, lower")
self._confidence_interval_kind = confidence_interval_kind
self.confidence_interval_threshold = confidence_interval_threshold
self._define_parameters(parameter_definition)

self._check_ll_and_generate_data_signature()

def _define_parameters(self, parameter_definition):
"""Initialize the parameters of the model"""
if parameter_definition is None:
self.parameters = Parameters()
elif isinstance(parameter_definition, dict):
@@ -89,18 +93,21 @@ def _define_parameters(self, parameter_definition):
raise RuntimeError("parameter_definition must be dict or list")

def _check_ll_and_generate_data_signature(self):
"""Check that the likelihood and generate_data functions have the same signature"""
ll_params = set(inspect.signature(self._ll).parameters)
generate_data_params = set(inspect.signature(self._generate_data).parameters)
if ll_params != generate_data_params:
raise AssertionError(
"ll and generate_data must have the same signature (parameters)")

def _ll(self, **kwargs) -> float:
"""Likelihood function, returns the loglikelihood for the given parameters."""
raise NotImplementedError(
"You must write a likelihood function (_ll) for your statistical model"
" or use a subclass where it is written for you")

def _generate_data(self, **kwargs):
"""Generate data for the given parameters."""
raise NotImplementedError(
"You must write a data-generation method (_generate_data) for your statistical model"
" or use a subclass where it is written for you")
@@ -112,8 +119,9 @@ def ll(self, **kwargs) -> float:
The parameters are passed as keyword arguments, positional arguments are not possible.
If a parameter is not given, the default value is used.

Returns:
float: Likelihood value
:param kwargs: keyword arguments for the parameters
:return: likelihood value
:rtype: float
"""
parameters = self.parameters(**kwargs)
return self._ll(**parameters)
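
The relation between the public `ll` and the private `_ll`, and the signature check on `_ll` and `_generate_data`, can be sketched in isolation. This is a minimal illustration under stated assumptions, not the alea implementation: `SketchModel`, `MismatchedModel`, and the `defaults` dict are hypothetical stand-ins (the real class resolves defaults through its `Parameters` object), and only `inspect.signature` from the standard library is used.

```python
import inspect


class SketchModel:
    """Hypothetical stand-in for a model with private _ll and _generate_data."""

    # Assumed per-parameter defaults; the real class keeps these in Parameters.
    defaults = {"mu": 0.0, "sigma": 1.0}

    def __init__(self):
        # Both private methods must accept the same set of parameter names.
        ll_params = set(inspect.signature(self._ll).parameters)
        gen_params = set(inspect.signature(self._generate_data).parameters)
        if ll_params != gen_params:
            raise AssertionError(
                "ll and generate_data must have the same signature (parameters)")

    def _ll(self, mu, sigma):
        return -0.5 * (mu / sigma) ** 2  # toy value, not a real likelihood

    def _generate_data(self, mu, sigma):
        return [mu, sigma]  # placeholder "data set"

    def ll(self, **kwargs):
        # Public wrapper: each parameter that is not passed takes its default.
        return self._ll(**{**self.defaults, **kwargs})


class MismatchedModel(SketchModel):
    def _generate_data(self, mu):  # sigma is missing, so the check fails
        return [mu]


nominal = SketchModel().ll()        # all parameters at their defaults
shifted = SketchModel().ll(mu=2.0)  # mu overridden, sigma at its default
```

Because the comparison is made on sets of names, the order of the parameters in the two signatures does not matter, only which names appear.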
@@ -122,16 +130,11 @@ def generate_data(self, **kwargs):
"""
Generate data for the given parameters.
The parameters are passed as keyword arguments, positional arguments are not possible.
If a parameter is not given, the default value is used.
Collaborator

I think I would keep singular to emphasize that it is the default value singular for that parameter (i.e. not for all parameters)

If a parameter is not given, the default values are used.

Raises:
ValueError: If the parameters are not within the fit limits

Returns:
Data
:raises ValueError: If the parameters are not within the fit limits
:return: generated data
"""
# CAUTION:
# This implementation won't allow you to call generate_data by positional arguments.
if not self.parameters.values_in_fit_limits(**kwargs):
raise ValueError("Values are not within fit limits")
generate_values = self.parameters(**kwargs)
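
The fit-limit guard at the top of `generate_data` can be illustrated with a small stand-alone check. The real `Parameters.values_in_fit_limits` is not shown in this diff, so the function below is only an assumed equivalent: the `limits` mapping of `(low, high)` tuples, with `None` meaning unbounded, and the name `checked_values` are hypothetical.

```python
def values_in_fit_limits(limits, **values):
    """Return True if every given value lies inside its (low, high) limits.

    A limit of None means unbounded on that side; parameters without an
    entry in `limits` are treated as unbounded (an assumption of this sketch).
    """
    for name, value in values.items():
        low, high = limits.get(name, (None, None))
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


def checked_values(limits, **kwargs):
    """Raise ValueError before any data generation if a value is out of range."""
    if not values_in_fit_limits(limits, **kwargs):
        raise ValueError("Values are not within fit limits")
    return kwargs


limits = {"mu": (0.0, None), "sigma": (0.1, 10.0)}
accepted = checked_values(limits, mu=3.0, sigma=1.0)
```

Calling `checked_values(limits, mu=-1.0)` raises `ValueError`, mirroring the `:raises ValueError:` entry in the docstring.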
@@ -150,23 +153,29 @@ def data(self):

@data.setter
def data(self, data):
"""
Simple setter for a data-set-- mainly here so it can be over-ridden for special needs.
Data-sets are expected to be in the form of a list of one or more structured arrays,
representing the data-sets of one or more likelihood terms.
"""
"""data setter"""
self._data = data
self.is_data_set = True

def store_data(
self, file_name, data_list, data_name_list=None, metadata = None):
self,
file_name, data_list, data_name_list=None, metadata=None):
"""
Store a list of datasets (each in the form of a list of one or more structured arrays)
Using inference_interface, but included here to allow over-writing.
structure would be: [[datasets1], [datasets2], ..., [datasetsn]]
where each of datasets is a list of structured arrays
if you specify, it is set, if not it will read from self.get_likelihood_term_names
if not defined, it will be ["0", "1", ..., "n-1"]
The structure would be: [[datasets1], [datasets2], ..., [datasetsn]],
where each of datasets is a list of structured arrays.
If data_name_list is specified, it is used; if not, the names are read from self.likelihood_names.
If that is not defined either, the names will be ["0", "1", ..., "n-1"]. The metadata is optional.

:param file_name: name of the file to store the data in
:type file_name: str
:param data_list: list of datasets
:type data_list: list
:param data_name_list: list of names of the datasets
:type data_name_list: list
:param metadata: metadata to store with the data
:type metadata: dict
"""
if data_name_list is None:
if hasattr(self, "likelihood_names"):
@@ -178,6 +187,11 @@ def store_data(
toydata_to_file(file_name, data_list, data_name_list, **kw)

def get_expectation_values(self, **parameter_values):
"""
Get the expectation values of the measurement.

:param parameter_values: values of the parameters
"""
raise NotImplementedError("get_expectation_values is optional to implement")

@property
@@ -187,25 +201,37 @@ def nominal_expectation_values(self):

For this to work, you must implement `get_expectation_values`.
"""
return self.get_expectation_values() # no kwargs for nominal
# no kwargs for nominal
return self.get_expectation_values()

def get_likelihood_term_from_name(self, likelihood_name):
def get_likelihood_term_from_name(self, likelihood_name: str):
"""
Returns the index of a likelihood term if the likelihood has several names

:param likelihood_name: name of the likelihood term
:type likelihood_name: str
:return: index of the likelihood term
:rtype: int
"""
if hasattr(self, "likelihood_names"):
likelihood_names = self.likelihood_names
return {n:i for i,n in enumerate(likelihood_names)}[likelihood_name]
return {n: i for i, n in enumerate(likelihood_names)}[likelihood_name]
else:
raise NotImplementedError("The attribute likelihood_names is not defined.")

def get_parameter_list(self):
"""
Returns a set of all parameters that the generate_data and likelihood accepts
"""
"""Returns a set of all parameters that generate_data and the likelihood accept"""
return self.parameters.names

def make_objective(self, minus=True, **kwargs):
Collaborator

Just seeing it now. The kwargs are not used I think (the line is commented out). Should we remove them and the commented line?

Collaborator Author

Why commented call_kwargs.update(kwargs) out?

Collaborator

I think it's not needed for iminuit since we tell it which parameters are fixed it won't call other values.

Collaborator Author

Added.

Collaborator

I think to make things clear we should remove the kwargs now from make_objective and when it is called, right?

Collaborator Author

Removed.

"""
Make a function that can be passed to Minuit

:param minus: if True, the function is multiplied by -1
:type minus: bool
:return: function that can be passed to Minuit
:rtype: Callable
"""
sign = -1 if minus else 1

def cost(args):
@@ -222,11 +248,17 @@ def cost(args):
@_needs_data
def fit(self, verbose=False, **kwargs) -> Tuple[dict, float]:
"""
Fit the model to the data by maximizing the likelihood
returns a dict containing best-fit values of each parameter,
Fit the model to the data by maximizing the likelihood.
Return a dict containing best-fit values of each parameter,
and the value of the likelihood evaluated there.
While the optimization is a minimization,
the likelihood returned is the _maximum_ of the likelihood.
the likelihood returned is the __maximum__ of the likelihood.

:param verbose: if True, print the Minuit object
:type verbose: bool
:return:
best-fit values of each parameter,
and the value of the likelihood evaluated there
"""
fixed_parameters = list(kwargs.keys())
guesses = self.parameters.fit_guesses
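
The docstring's remark that the optimization is a minimization while the returned value is the maximum of the likelihood comes down to a sign flip, as in `make_objective` with `minus=True`. The sketch below shows only that idea: the unit-variance Gaussian `log_likelihood`, the toy `data`, and the grid search standing in for Minuit are all assumptions made for illustration.

```python
import math


def log_likelihood(mu, data):
    """Gaussian log-likelihood with unit variance (toy model for this sketch)."""
    return sum(-0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi) for x in data)


def make_objective(data, minus=True):
    """Return a cost function; with minus=True it is the negative log-likelihood."""
    sign = -1 if minus else 1

    def cost(mu):
        return sign * log_likelihood(mu, data)

    return cost


data = [0.8, 1.0, 1.2]
cost = make_objective(data)

# A coarse grid search stands in for the real minimizer.
grid = [i / 100 for i in range(0, 201)]  # candidate mu values from 0.00 to 2.00
best_mu = min(grid, key=cost)            # minimize the cost ...
max_ll = -cost(best_mu)                  # ... then flip the sign back
```

Here `best_mu` lands on the sample mean, and `max_ll` is the log-likelihood evaluated there rather than the minimized cost.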
@@ -261,18 +293,32 @@ def _confidence_interval_checks(
confidence_interval_kind: str,
**kwargs):
"""
helper function for confidence_interval that does the input checks and returns bounds +
Helper function for confidence_interval that does the input checks and returns bounds

:param poi_name: name of the parameter of interest
:type poi_name: str
:param parameter_interval_bounds:
range in which to search for the confidence interval edges
:type parameter_interval_bounds: Tuple[float, float]
:param confidence_level: confidence level for confidence intervals
:type confidence_level: float
:param confidence_interval_kind: kind of confidence interval to compute
:type confidence_interval_kind: str
:return: confidence interval kind, confidence interval threshold, parameter interval bounds
"""
if confidence_level is None:
confidence_level = self._confidence_level
if confidence_interval_kind is None:
confidence_interval_kind = self._confidence_interval_kind

mask = (confidence_level > 0) and (confidence_level < 1)
assert mask, "the confidence level must lie between 0 and 1"
if (confidence_level <= 0) or (confidence_level >= 1):
raise ValueError("confidence_level must be between 0 and 1")

parameter_of_interest = self.parameters[poi_name]
assert parameter_of_interest.fittable, "The parameter of interest must be fittable"
assert poi_name not in kwargs, "you cannot set the parameter you're constraining"
if not parameter_of_interest.fittable:
raise ValueError("The parameter of interest must be fittable")
if poi_name in kwargs:
raise ValueError("You cannot set the parameter you're constraining")

if parameter_interval_bounds is None:
parameter_interval_bounds = parameter_of_interest.parameter_interval_bounds
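
The hunk above swaps `assert` statements for explicit `ValueError`s. One practical difference, offered here as a likely motivation rather than one stated in the PR, is that `assert` statements are skipped entirely when Python runs with `-O`, while a raised exception always fires. A minimal sketch with a hypothetical helper name, using the strict bounds of the replaced assertion:

```python
def check_confidence_level(confidence_level: float) -> float:
    """Validate a confidence level with an explicit exception instead of assert."""
    # Strict bounds, as in the assertion that was replaced: 0 and 1 are rejected.
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    return confidence_level


level = check_confidence_level(0.9)
```

Callers can then catch `ValueError` specifically, which is harder to do cleanly with the generic `AssertionError`.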
@@ -318,17 +364,24 @@ def confidence_interval(
confidence_interval_kind: str = None,
**kwargs) -> Tuple[float, float]:
"""
Uses self.fit to compute confidence intervals for a certain named parameter

poi_name: string, name of fittable parameter of the model
parameter_interval_bounds: range in which to search for the confidence interval edges.
May be specified as:
- setting the property "parameter_interval_bounds" for the parameter
- passing a list here
Uses self.fit to compute confidence intervals for a certain named parameter.
If the parameter is a rate parameter, and the model has expectation values implemented,
the bounds will be interpreted as bounds on the expectation value
(so that the range in the fit is parameter_interval_bounds/mus)
otherwise the bound is taken as-is.
the bounds will be interpreted as bounds on the expectation value,
so that the range in the fit is parameter_interval_bounds/mus.
Otherwise the bound is taken as-is.

:param poi_name: name of the parameter of interest
:type poi_name: str
:param parameter_interval_bounds:
range in which to search for the confidence interval edges
May be specified as:
- setting the property "parameter_interval_bounds" for the parameter
- passing a list here
:type parameter_interval_bounds: Tuple[float, float]
:param confidence_level: confidence level for confidence intervals
:type confidence_level: float
:param confidence_interval_kind: kind of confidence interval to compute
:type confidence_interval_kind: str
"""

ci_objects = self._confidence_interval_checks(
@@ -381,9 +434,14 @@ class MinuitWrap:
"""
Wrapper for functions to be called by Minuit.
Initialized with a function f and a Parameters instance.

:param f: function to be wrapped
:type f: Callable
:param parameters: parameters of the model
:type parameters: Parameters
"""

def __init__(self, f, parameters: Parameters):
def __init__(self, f: Callable, parameters: Parameters):
self.func = f
self.s_args = parameters.names
self._parameters = {p.name: p.fit_limits for p in parameters}
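
The rest of `MinuitWrap` is not shown in this diff, so how it forwards arguments is an assumption. A wrapper of this kind typically lets a minimizer call a keyword-only function with positional arguments; the sketch below shows that pattern with hypothetical names (`PositionalWrap`, `neg_ll`) and does not use the iminuit API.

```python
from typing import Callable, Sequence


class PositionalWrap:
    """Hypothetical stand-in: call a keyword-only function with positional arguments."""

    def __init__(self, f: Callable[..., float], names: Sequence[str]):
        self.func = f
        self.names = list(names)  # fixes the order of the positional arguments

    def __call__(self, *args: float) -> float:
        if len(args) != len(self.names):
            raise TypeError(f"expected {len(self.names)} arguments, got {len(args)}")
        # Pair each positional value with its parameter name.
        return self.func(**dict(zip(self.names, args)))


def neg_ll(*, mu: float, sigma: float) -> float:
    # Toy cost function with its minimum at mu=1, sigma=2.
    return (mu - 1.0) ** 2 + (sigma - 2.0) ** 2


wrapped = PositionalWrap(neg_ll, ["mu", "sigma"])
value = wrapped(1.0, 2.0)
```

Keeping the name list on the wrapper means the optimizer and the model agree on a single parameter order.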
2 changes: 1 addition & 1 deletion docs/make_docs.sh
@@ -2,4 +2,4 @@
make clean
rm -r source/reference
sphinx-apidoc -o source/reference ../alea
make html
make html #SPHINXOPTS="-W --keep-going -n"
4 changes: 0 additions & 4 deletions docs/source/reference/.gitignore

This file was deleted.

21 changes: 21 additions & 0 deletions docs/source/reference/alea.examples.rst
@@ -0,0 +1,21 @@
alea.examples package
=====================

Submodules
----------

alea.examples.gaussian\_model module
------------------------------------

.. automodule:: alea.examples.gaussian_model
:members:
:undoc-members:
:show-inheritance:

Module contents
---------------

.. automodule:: alea.examples
:members:
:undoc-members:
:show-inheritance:
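
The `automodule` directives above rely on Sphinx's autodoc extension, and the commit list mentions enabling napoleon and todo. A `conf.py` fragment along these lines would cover all three; the file path and the `todo_include_todos` setting are assumptions, since the project's actual `conf.py` is not part of this diff.

```python
# docs/source/conf.py (fragment; path and surrounding settings are assumed)
extensions = [
    "sphinx.ext.autodoc",   # provides the automodule directive
    "sphinx.ext.napoleon",  # parses Google- and NumPy-style docstrings
    "sphinx.ext.todo",      # supports the todo directive
]

# Render todo entries in the built documentation (disabled by default).
todo_include_todos = True
```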