Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit #1

Merged
merged 6 commits into from
May 24, 2024
Merged

Initial commit #1

merged 6 commits into from
May 24, 2024

Conversation

knikolla
Copy link
Contributor

  • Implemented library for loading rates.yaml
  • Wrote simple tests for library
  • Updated README with description
  • Made the package installable
  • Updated current rates

- Implemented library for loading rates.yaml
- Wrote simple tests for library
- Updated README with description
- Made the package installable
- Updated current rates
rates.yaml Show resolved Hide resolved
Copy link
Member

@larsks larsks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little bothered by the structure of rates.yaml. We're storing different data types in the same attribute, and we're storing different types of "objects" in the same list. Here, value is a float:

- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06

But here it's a boolean (although this should technically be spelled false):

- name: Charge for Stopped Instances
  history:
    - value: False
      from: 2023-06
      until: 2024-02

And later on it's an integer.

Is that ever going to bite us? It certainly makes the file difficult to validate, and difficult to parse in languages with strict typing. We're cheating a bit in the code by using yaml.BaseLoader, which returns everything as strings.

I wonder if we should split the file into sections:

rates:
- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06


feature_flags:
- name: Charge for Stopped Instances
  history:
    - value: false
      from: 2023-06
      until: 2024-02
    - value: true
      from: 2024-03

definitions:
- name: VCPUs in CPU SU
  history:
    - value: 1
      from: 2023-06

That would allow us to iterate over a thing (like rates) and know what to expect in the value field.

Maybe the content of rates.yaml is entirely internal to this code, and rather than YAML format it's actually "nerc-rates-YAML" format, and we don't expect it to be parsed by anything else, in which case none of the above concerns are relevant.

setup.cfg Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
src/nerc_rates/rates.py Outdated Show resolved Hide resolved
@knikolla
Copy link
Contributor Author

I am a little bothered by the structure of rates.yaml. We're storing different data types in the same attribute, and we're storing different types of "objects" in the same list. Here, value is a float:

- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06

But here it's a boolean (although this should technically be spelled false):

- name: Charge for Stopped Instances
  history:
    - value: False
      from: 2023-06
      until: 2024-02

And later on it's an integer.

Is that ever going to bite us? It certainly makes the file difficult to validate, and difficult to parse in languages with strict typing. We're cheating a bit in the code by using yaml.BaseLoader, which returns everything as strings.

I wonder if we should split the file into sections:

rates:
- name: CPU SU Rate
  history:
    - value: 0.013
      from: 2023-06


feature_flags:
- name: Charge for Stopped Instances
  history:
    - value: false
      from: 2023-06
      until: 2024-02
    - value: true
      from: 2024-03

definitions:
- name: VCPUs in CPU SU
  history:
    - value: 1
      from: 2023-06

That would allow us to iterate over a thing (like rates) and know what to expect in the value field.

Maybe the content of rates.yaml is entirely internal to this code, and rather than YAML format it's actually "nerc-rates-YAML" format, and we don't expect it to be parsed by anything else, in which case none of the above concerns are relevant.

@larsks Yeah, I was bothered by the inconsistent value types too, hence in the code I used the BaseLoader to keep everything as strings and let the downstream code accessing the specific values handle the conversion since it would know what it is trying to access.

Reading your comment, I see the ambiguity that that may lead to tools that are attempting to parse the YAML directly without using the python library, since those values appear to be floats, whereas the nerc-rates-YAML format treats everything as a string.

I tried an approach that divided things into sections initially, but I found that a better approach was to have a type field in each of the objects that defined the type stored in values (Decimal, Integer, Boolean), since the type of value is a clearer categorization of the value that is returned as opposed to the semantic meaning of the value.

I can reintroduce the type attribute and handle the conversion in the library, or I can just add " to all values and be explicit about their being strings.

Do you have a preference?

@larsks
Copy link
Member

larsks commented May 17, 2024

I can reintroduce the type attribute and handle the conversion in the library, or I can just add " to all values and be explicit about their being strings.

Do you have a preference?

I don't think a type attribute is the right way to go, because again that complicates validation (now we need a discriminator of some sort to identify expected value types).

I guess my preference in the short term would be to quote everything if it's expected to be a string, but I would combine that with validation in the code.

@larsks
Copy link
Member

larsks commented May 17, 2024

...but I would combine that with validation in the code.

E.g., like: https://github.com/larsks/nerc-rates/tree/feature/pydantic (which also drops setup.*).

The interesting bit is probably models.py (and the corresponding changes in rates.py).

@larsks
Copy link
Member

larsks commented May 17, 2024

@knikolla I've pushed changes that add support for validation using pydantic, along with specific validations for the ordering between from and until and for overlap between values in a history list.

@naved001 naved001 self-requested a review May 17, 2024 19:22
rates.yaml Outdated Show resolved Hide resolved
@larsks larsks force-pushed the master branch 5 times, most recently from 988dd62 to 4b9c96e Compare May 17, 2024 21:05
@larsks
Copy link
Member

larsks commented May 17, 2024

It looks like typing.Self was introduced in python 3.11, and we're running ci with 3.10. Fortunately, we can use typing_extensions.Self in either case.

larsks added 2 commits May 17, 2024 17:08
Drop setup.cfg and setup.py which are no longer required with modern
Python. I've moved all the metadata from setup.cfg into pyproject.toml.
Ensure that all values in rates.yaml are string values by
(a) editing the file to quote all values and (b) implementing pydantic
models for the rate data so that we can validate the values on input.
larsks added 3 commits May 17, 2024 18:08
Ensure that the `from` field for a given rate entry is always earlier than
the `until` field.
Ensure that date ranges in a `history` list do not overlap.
Raise an error if two rate items have the same name. E.g, this would
trigger an error:

```
- name: Test Rate
  history:
    - values: "1""
      from: 2023-06
      until: 2023-11
- name: Test Rate
  history:
    - values: "2""
      from: 2023-12
```
@larsks larsks merged commit fd2eb3b into CCI-MOC:main May 24, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants