Initial commit #1

knikolla · 2024-05-14T14:56:14Z

Implemented library for loading rates.yaml
Wrote simple tests for library
Updated README with description
Made the package installable
Updated current rates

- Implemented library for loading rates.yaml - Wrote simple tests for library - Updated README with description - Made the package installable - Updated current rates

rates.yaml

larsks

I am a little bothered by the structure of rates.yaml. We're storing different data types in the same attribute, and we're storing different types of "objects" in the same list. Here, value is a float:

- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06

But here it's a boolean (although this should technically be spelled false):

- name: Charge for Stopped Instances
  history:
    - value: False
      from: 2023-06
      until: 2024-02

And later on it's an integer.

Is that ever going to bite us? It certainly makes the file difficult to validate, and difficult to parse in languages with strict typing. We're cheating a bit in the code by using yaml.BaseLoader, which returns everything as strings.

I wonder if we should split the file into sections:

rates:
- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06


feature_flags:
- name: Charge for Stopped Instances
  history:
    - value: false
      from: 2023-06
      until: 2024-02
    - value: true
      from: 2024-03

definitions:
- name: VCPUs in CPU SU
  history:
    - value: 1
      from: 2023-06

That would allow us to iterate over a thing (like rates) and know what to expect in the value field.

Maybe the content of rates.yaml is entirely internal to this code, and rather than YAML format it's actually "nerc-rates-YAML" format, and we don't expect it to be parsed by anything else, in which case none of the above concerns are relevant.

setup.cfg

setup.py

src/nerc_rates/rates.py

knikolla · 2024-05-17T16:28:29Z

I am a little bothered by the structure of rates.yaml. We're storing different data types in the same attribute, and we're storing different types of "objects" in the same list. Here, value is a float:
- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06
But here it's a boolean (although this should technically be spelled false):
- name: Charge for Stopped Instances
  history:
    - value: False
      from: 2023-06
      until: 2024-02
And later on it's an integer.

Is that ever going to bite us? It certainly makes the file difficult to validate, and difficult to parse in languages with strict typing. We're cheating a bit in the code by using yaml.BaseLoader, which returns everything as strings.

I wonder if we should split the file into sections:
rates:
- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06


feature_flags:
- name: Charge for Stopped Instances
  history:
    - value: false
      from: 2023-06
      until: 2024-02
    - value: true
      from: 2024-03

definitions:
- name: VCPUs in CPU SU
  history:
    - value: 1
      from: 2023-06
That would allow us to iterate over a thing (like rates) and know what to expect in the value field.

Maybe the content of rates.yaml is entirely internal to this code, and rather than YAML format it's actually "nerc-rates-YAML" format, and we don't expect it to be parsed by anything else, in which case none of the above concerns are relevant.

@larsks Yeah, I was bothered by the inconsistent value types too, hence in the code I used the BaseLoader to keep everything as strings and let the downstream code accessing the specific values handle the conversion since it would know what it is trying to access.

Reading your comment, I see the ambiguity that that may lead to tools that are attempting to parse the YAML directly without using the python library, since those values appear to be floats, whereas the nerc-rates-YAML format treats everything as a string.

I tried an approach that divided things into sections initially, but I found that a better approach was to have a type field in each of the objects that defined the type stored in values (Decimal, Integer, Boolean), since the type of value is a clearer categorization of the value that is returned as opposed to the semantic meaning of the value.

I can reintroduce the type attribute and handle the conversion in the library, or I can just add " to all values and be explicit about their being strings.

Do you have a preference?

larsks · 2024-05-17T18:13:19Z

I can reintroduce the type attribute and handle the conversion in the library, or I can just add " to all values and be explicit about their being strings.

Do you have a preference?

I don't think a type attribute is the right way to go, because again that complicates validation (now we need a discriminator of some sort to identify expected value types).

I guess my preference in the short term would be to quote everything if it's expected to be a string, but I would combine that with validation in the code.

larsks · 2024-05-17T18:33:31Z

...but I would combine that with validation in the code.

E.g., like: https://github.com/larsks/nerc-rates/tree/feature/pydantic (which also drops setup.*).

The interesting bit is probably models.py (and the corresponding changes in rates.py).

larsks · 2024-05-17T19:22:35Z

@knikolla I've pushed changes that add support for validation using pydantic, along with specific validations for the ordering between from and until and for overlap between values in a history list.

rates.yaml

larsks · 2024-05-17T21:07:15Z

It looks like typing.Self was introduced in python 3.11, and we're running ci with 3.10. Fortunately, we can use typing_extensions.Self in either case.

Drop setup.cfg and setup.py which are no longer required with modern Python. I've moved all the metadata from setup.cfg into pyproject.toml.

Ensure that all values in rates.yaml are string values by (a) editing the file to quote all values and (b) implementing pydantic models for the rate data so that we can validate the values on input.

Ensure that the `from` field for a given rate entry is always earlier than the `until` field.

Ensure that date ranges in a `history` list do not overlap.

Raise an error if two rate items have the same name. E.g, this would trigger an error: ``` - name: Test Rate history: - values: "1"" from: 2023-06 until: 2023-11 - name: Test Rate history: - values: "2"" from: 2023-12 ```

Initial commit

234b41e

- Implemented library for loading rates.yaml - Wrote simple tests for library - Updated README with description - Made the package installable - Updated current rates

knikolla requested review from larsks, naved001 and joachimweyl May 14, 2024 14:57

naved001 approved these changes May 14, 2024

View reviewed changes

rates.yaml Show resolved Hide resolved

larsks reviewed May 17, 2024

View reviewed changes

setup.cfg Outdated Show resolved Hide resolved

setup.py Outdated Show resolved Hide resolved

src/nerc_rates/rates.py Outdated Show resolved Hide resolved

larsks force-pushed the master branch from 15d90db to 234b41e Compare May 17, 2024 18:30

naved001 self-requested a review May 17, 2024 19:22

larsks force-pushed the master branch from 7319223 to 978c0b6 Compare May 17, 2024 19:26

knikolla commented May 17, 2024

View reviewed changes

rates.yaml Outdated Show resolved Hide resolved

larsks force-pushed the master branch 5 times, most recently from 988dd62 to 4b9c96e Compare May 17, 2024 21:05

larsks added 2 commits May 17, 2024 17:08

Use pyproject.toml exclusively for package metadata

d33ae5d

Drop setup.cfg and setup.py which are no longer required with modern Python. I've moved all the metadata from setup.cfg into pyproject.toml.

Use pydantic to validate input data

0c03cab

Ensure that all values in rates.yaml are string values by (a) editing the file to quote all values and (b) implementing pydantic models for the rate data so that we can validate the values on input.

larsks force-pushed the master branch from 4b9c96e to 9182aec Compare May 17, 2024 21:13

larsks added 3 commits May 17, 2024 18:08

Add validation for ordering of from and until

7b7c500

Ensure that the `from` field for a given rate entry is always earlier than the `until` field.

Add validation for overlapping date ranges

edbd5dc

Ensure that date ranges in a `history` list do not overlap.

Check for duplicate keys in input data

69ab371

Raise an error if two rate items have the same name. E.g, this would trigger an error: ``` - name: Test Rate history: - values: "1"" from: 2023-06 until: 2023-11 - name: Test Rate history: - values: "2"" from: 2023-12 ```

larsks force-pushed the master branch from 9182aec to 69ab371 Compare May 17, 2024 22:10

naved001 approved these changes May 23, 2024

View reviewed changes

larsks approved these changes May 24, 2024

View reviewed changes

larsks merged commit fd2eb3b into CCI-MOC:main May 24, 2024
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial commit #1

Initial commit #1

knikolla commented May 14, 2024

larsks left a comment

knikolla commented May 17, 2024

larsks commented May 17, 2024

larsks commented May 17, 2024 •

edited

Loading

larsks commented May 17, 2024

larsks commented May 17, 2024 •

edited

Loading

Initial commit #1

Initial commit #1

Conversation

knikolla commented May 14, 2024

larsks left a comment

Choose a reason for hiding this comment

knikolla commented May 17, 2024

larsks commented May 17, 2024

larsks commented May 17, 2024 • edited Loading

larsks commented May 17, 2024

larsks commented May 17, 2024 • edited Loading

larsks commented May 17, 2024 •

edited

Loading

larsks commented May 17, 2024 •

edited

Loading