Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse parameters metadata and expose them in Python #681

Merged
merged 20 commits into from
Jul 16, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .circleci/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ jobs:
command: |
pip install --upgrade pip twine wheel
pip install --editable .[test] --upgrade
# pip install --editable git+https://github.com/openfisca/country-template.git@BRANCH_NAME#egg=OpenFisca-Country-Template # use a specific branch of OpenFisca-Country-Template
# pip install --editable git+https://github.com/openfisca/country-template.git@BRANCH_NAME#egg=OpenFisca-Country-Template # use a specific branch of OpenFisca-Country-Template

- save_cache:
key: v1-py2-{{ checksum "setup.py" }}
Expand Down Expand Up @@ -89,7 +89,7 @@ jobs:
command: |
pip install --upgrade pip twine wheel
pip install --editable .[test] --upgrade
# pip install --editable git+https://github.com/openfisca/country-template.git@BRANCH_NAME#egg=OpenFisca-Country-Template # use a specific branch of OpenFisca-Country-Template
# pip install --editable git+https://github.com/openfisca/country-template.git@BRANCH_NAME#egg=OpenFisca-Country-Template # use a specific branch of OpenFisca-Country-Template

- save_cache:
key: v1-py3-{{ checksum "setup.py" }}
Expand Down
34 changes: 34 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,39 @@
# Changelog

### 23.3.0 [#681](https://github.com/openfisca/openfisca-core/pull/681)

* Change the way metadata are declared for Parameter.

Before:
```YAML
description: Age of retirement
reference: https://wikipedia.org/wiki/retirement
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be interesting to give an example with several references ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would such an example illustrate?

Right now the metadata are just a pass-trough, and core has no opinion about the content of this reference, so I'm not sure about the value of making the changelog more complex by adding an other slightly different example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was asking because it was the first thing that came into my mind when I saw the singular reference and that there was only one reference :)
It might just be a worry I have :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we should keep that in mind.

unit: year
values: (...)
```

After:
```YAML
description: Age of retirement
metadata:
reference: https://wikipedia.org/wiki/retirement
unit: year
values: (...)
```

_Setting `unit` and `reference` out of `metadata` is considered deprecated, but still works for backward compatibility._

* Allow legislation coders to define their own medatada

* Expose in the Python API
- Parameters metadata:
- e.g. `parameters.taxes.rate.metadata['unit']`
- Parameter value metadata:
- e.g. `parameters.taxes.rate.values_list[0].metadata['unit']`
- Parameter node description and metadata:
- e.g. `parameters.taxes.metadata['reference']`, `parameters.taxes.description`
- Note: Parameter descriptions (e.g. `parameters.taxes.rate.description`) were already exposed

## 23.2.0 [#689](https://github.com/openfisca/openfisca-core/pull/689)

* Introduce `TaxBenefitSystem.replace_variable`
Expand Down
124 changes: 84 additions & 40 deletions openfisca_core/parameters.py
Original file line number Diff line number Diff line change
Expand Up @@ -101,10 +101,10 @@ class Parameter(object):
Instantiate a parameter with metadata:

>>> Parameter('rate', data = {
'description': 'Income tax rate applied on salaries'
'description': 'Income tax rate applied on salaries',
'values': {
"2015-01-01": {'value': 550, reference = 'http://taxes.gov/income_tax/2015'},
"2016-01-01": {'value': 600, reference = 'http://taxes.gov/income_tax/2016'}
"2015-01-01": {'value': 550, 'metadata': {'reference': 'http://taxes.gov/income_tax/2015'}},
"2016-01-01": {'value': 600, 'metadata': {'reference': 'http://taxes.gov/income_tax/2016'}}
}
})

Expand All @@ -117,15 +117,25 @@ def __init__(self, name, data, file_path = None):
self.name = name
self.file_path = file_path
_validate_parameter(self, data, data_type = dict)
self.values_history = self # Only for retro-compatibility
self.description = None
self.metadata = {}
self.values_history = self # Only for backward compatibility

# Normal parameter declaration: the values are declared under the 'values' key: parse the description and metadata.
if data.get('values'):
_validate_parameter(self, data, allowed_keys = set(['values', 'description', 'unit', 'reference']))
# 'unit' and 'reference' are only listed here for backward compatibility
_validate_parameter(self, data, allowed_keys = set(['values', 'description', 'metadata', 'unit', 'reference']))
self.description = data.get('description')
data = data['values']
_validate_parameter(self, data, data_type = dict)
_set_backward_compatibility_metadata(self, data)
self.metadata.update(data.get('metadata', {}))

instants = sorted(data.keys(), reverse = True) # sort in reverse chronological order
_validate_parameter(self, data['values'], data_type = dict)
values = data['values']

else: # Simplified parameter declaration: only values are provided
values = data

instants = sorted(values.keys(), reverse = True) # sort in reverse chronological order

values_list = []
for instant_str in instants:
Expand All @@ -135,14 +145,14 @@ def __init__(self, name, data, file_path = None):
.format(instant_str, self.name),
file_path)

instant_info = data[instant_str]
instant_info = values[instant_str]

# Ignore expected values, as they are just metadata
if instant_info == "expected" or isinstance(instant_info, dict) and instant_info.get("expected"):
continue

value_name = _compose_name(name, instant_str)
value_at_instant = ParameterAtInstant(value_name, instant_str, data = instant_info, file_path = file_path)
value_at_instant = ParameterAtInstant(value_name, instant_str, data = instant_info, file_path = self.file_path, metadata = self.metadata)
values_list.append(value_at_instant)

self.values_list = values_list
Expand Down Expand Up @@ -237,9 +247,10 @@ class ParameterAtInstant(object):
"""

_allowed_value_data_types = [int, float, bool, type(None)]
_allowed_keys = set(['value', 'unit', 'reference'])
# 'unit' and 'reference' are only listed here for backward compatibility
_allowed_keys = set(['value', 'metadata', 'unit', 'reference'])

def __init__(self, name, instant_str, data = None, file_path = None):
def __init__(self, name, instant_str, data = None, file_path = None, metadata = None):
"""
:param name: name of the parameter, e.g. "taxes.some_tax.some_param"
:param instant_str: Date of the value in the format `YYYY-MM-DD`.
Expand All @@ -248,6 +259,7 @@ def __init__(self, name, instant_str, data = None, file_path = None):
self.name = name
self.instant_str = instant_str
self.file_path = file_path
self.metadata = {}

# Accept { 2015-01-01: 4000 }
if not isinstance(data, dict) and type(data) in self._allowed_value_data_types:
Expand All @@ -257,6 +269,11 @@ def __init__(self, name, instant_str, data = None, file_path = None):
self.validate(data)
self.value = data['value']

if metadata is not None:
self.metadata.update(metadata) # Inherit metadata from Parameter
_set_backward_compatibility_metadata(self, data)
self.metadata.update(data.get('metadata', {}))

def validate(self, data):
_validate_parameter(self, data, data_type = dict, allowed_keys = self._allowed_keys)
try:
Expand Down Expand Up @@ -322,21 +339,33 @@ def __init__(self, name = "", directory_path = None, data = None, file_path = No
>>> node = ParameterNode('benefits', directory_path = '/path/to/country_package/parameters/benefits')
"""
self.name = name
self.children = {}
self.description = None
self.file_path = None
self.metadata = {}

if directory_path:
self.children = {}
self.file_path = directory_path
for child_name in os.listdir(directory_path):
child_path = os.path.join(directory_path, child_name)
if os.path.isfile(child_path):
child_name, ext = os.path.splitext(child_name)
# We ignore non-YAML files, and index.yaml files, curently used to store metadatas
if ext not in PARAM_FILE_EXTENSIONS or child_name == 'index':

# We ignore non-YAML files
if ext not in PARAM_FILE_EXTENSIONS:
continue

child_name_expanded = _compose_name(name, child_name)
child = load_parameter_file(child_path, child_name_expanded)
self.children[child_name] = child
setattr(self, child_name, child)
if child_name == 'index':
data = _load_yaml_file(child_path)
_validate_parameter(self, data, allowed_keys = ['metadata', 'description', 'reference'])
self.description = data.get('description', None)
_set_backward_compatibility_metadata(self, data)
self.metadata.update(data.get('metadata', {}))
else:
child_name_expanded = _compose_name(name, child_name)
child = load_parameter_file(child_path, child_name_expanded)
self.children[child_name] = child
setattr(self, child_name, child)

elif os.path.isdir(child_path):
child_name = os.path.basename(child_path)
Expand All @@ -348,11 +377,14 @@ def __init__(self, name = "", directory_path = None, data = None, file_path = No
else:
self.file_path = file_path
_validate_parameter(self, data, data_type = dict, allowed_keys = self._allowed_keys)
self.children = {}
# We allow to set a reference and a description for a node. It is however not recommanded, as it's only metadata and is not exposed in the legislation explorer.
data.pop('reference', None)
data.pop('description', None)
# We allow to set a reference and a description for a node.
self.description = data.get('description', None)
_set_backward_compatibility_metadata(self, data)
self.metadata.update(data.get('metadata', {}))
for child_name, child in data.items():
if child_name in {'unit', 'description', 'metadata', 'reference'}:
continue # do not treat metadata as subparameters. 'unit' and 'reference' are only listed here for backward compatibility

child_name = str(child_name)
child_name_expanded = _compose_name(name, child_name)
child = _parse_child(child_name_expanded, child, file_path)
Expand Down Expand Up @@ -396,7 +428,7 @@ def __repr__(self):
result = os.linesep.join(
[os.linesep.join(
[u"{}:", u"{}"]).format(name, indent(repr(value)))
for name, value in self.children.items()]
for name, value in sorted(self.children.items())]
)
# repr output must be encoded in Python 2, but not in Python 3
if sys.version_info < (3, 0):
Expand Down Expand Up @@ -617,7 +649,8 @@ class Scale(object):
"""
A parameter scale (for instance a marginal scale).
"""
_allowed_keys = set(['brackets', 'description', 'unit', 'reference'])
# 'unit' and 'reference' are only listed here for backward compatibility
_allowed_keys = set(['brackets', 'description', 'metadata', 'unit', 'reference'])

def __init__(self, name, data, file_path):
"""
Expand All @@ -629,6 +662,9 @@ def __init__(self, name, data, file_path):
self.file_path = file_path
_validate_parameter(self, data, data_type = dict, allowed_keys = self._allowed_keys)
self.description = data.get('description')
self.metadata = {}
_set_backward_compatibility_metadata(self, data)
self.metadata.update(data.get('metadata', {}))

if not isinstance(data['brackets'], list):
raise ParameterParsingError(
Expand Down Expand Up @@ -709,21 +745,10 @@ class Bracket(ParameterNode):
_allowed_keys = set(['amount', 'threshold', 'rate', 'average_rate', 'base'])


def load_parameter_file(file_path, name = ''):
"""
Load parameters from a YAML file (or a directory containing YAML files).

:returns: An instance of :any:`ParameterNode` or :any:`Scale` or :any:`Parameter`.
"""

if not os.path.exists(file_path):
raise ValueError("{} doest not exist".format(file_path).encode('utf-8'))
if os.path.isdir(file_path):
return ParameterNode(name, directory_path = file_path)

def _load_yaml_file(file_path):
with open(file_path, 'r') as f:
try:
data = yaml.load(f, Loader = Loader)
return yaml.load(f, Loader = Loader)
except (yaml.scanner.ScannerError, yaml.parser.ParserError):
stack_trace = traceback.format_exc()
raise ParameterParsingError(
Expand All @@ -732,6 +757,18 @@ def load_parameter_file(file_path, name = ''):
stack_trace
)


def load_parameter_file(file_path, name = ''):
"""
Load parameters from a YAML file (or a directory containing YAML files).

:returns: An instance of :any:`ParameterNode` or :any:`Scale` or :any:`Parameter`.
"""
if not os.path.exists(file_path):
raise ValueError("{} doest not exist".format(file_path).encode('utf-8'))
if os.path.isdir(file_path):
return ParameterNode(name, directory_path = file_path)
data = _load_yaml_file(file_path)
return _parse_child(name, data, file_path)


Expand Down Expand Up @@ -778,13 +815,20 @@ def _validate_parameter(parameter, data, data_type = None, allowed_keys = None):
)


def _set_backward_compatibility_metadata(parameter, data):
if data.get('unit') is not None:
parameter.metadata['unit'] = data['unit']
if data.get('reference') is not None:
parameter.metadata['reference'] = data['reference']


def contains_nan(vector):
if np.issubdtype(vector.dtype, np.record):
return any([contains_nan(vector[name]) for name in vector.dtype.names])
else:
return np.isnan(np.min(vector))
return np.isnan(vector).any()


# Only for retro-compatibility
# Only for backward compatibility
class ValuesHistory(Parameter):
pass
4 changes: 2 additions & 2 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@

setup(
name = 'OpenFisca-Core',
version = '23.2.0',
version = '23.3.0',
author = 'OpenFisca Team',
author_email = 'contact@openfisca.fr',
classifiers = [
Expand All @@ -32,7 +32,7 @@
'test': [
'nose',
'flake8 >= 3.4.0, < 3.5.0',
'openfisca-country-template >= 3.0.0, < 4.0.0',
'openfisca-country-template >= 3.2.2rc0, < 4.0.0',
'openfisca-extension-template >= 1.1.3, < 2.0.0',
],
'tracker': [
Expand Down
19 changes: 19 additions & 0 deletions tests/core/test_parameters.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,3 +74,22 @@ def test_parameter_repr():
tf.close()
tf_parameters = load_parameter_file(file_path = tf.name)
assert_equal(repr(parameters), repr(tf_parameters))


def test_parameters_metadata():
parameter = tax_benefit_system.parameters.benefits.basic_income
assert_equal(parameter.metadata['reference'], 'https://law.gov.example/basic-income/amount')
assert_equal(parameter.metadata['unit'], 'currency-EUR')
assert_equal(parameter.values_list[0].metadata['reference'], 'https://law.gov.example/basic-income/amount/2015-12')
assert_equal(parameter.values_list[0].metadata['unit'], 'currency-EUR')
scale = tax_benefit_system.parameters.taxes.social_security_contribution
assert_equal(scale.metadata['threshold_unit'], 'currency-EUR')
assert_equal(scale.metadata['rate_unit'], '/1')


def test_parameter_node_metadata():
parameter = tax_benefit_system.parameters.benefits
assert_equal(parameter.description, 'Social benefits')

parameter_2 = tax_benefit_system.parameters.taxes.housing_tax
assert_equal(parameter_2.description, 'Housing tax')