Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: parse JUJU_* env in one place #1313

Merged
merged 26 commits into from
Aug 23, 2024

Conversation

IronCore864
Copy link
Contributor

Parse all JUJU_* env vars in one place.

Closes #1075.


Note: This draft does the minimum changes by creating a dataclass, no other logic in main.py is changed. Variables are not grouped into smaller dataclasses, and all types are string just as they were in os.environ.get().

There will be another alternative draft with variable grouping and typing.


More Explanations

1 Grouping

I think it's fine without it but I'm not against it either. If there is a solid reason for grouping, I'm OK with it, but here are my thoughts:

  1. Value added: It seems only JUJU_NOTICE_* and JUJU_SECRET_* can be grouped into a smaller dataclass. Maybe we can argue that JUJU_RELATION and JUJU_RELATION_ID can be grouped too, but then the first needs to be renamed to JUJU_RELATION_NAME which deviates from the acutal env variable name. So, most env vars can't really be grouped, which means the _JujuContext dataclass can't be shortened or improved readability-wise by grouping. And when using it, it's not too much difference, comparing juju_context_obj.juju_notice_id and juju_context_obj.juju_notice.id. So I think the value added by grouping is insignificant.
  2. Logic: Without grouping, all variables are on the same level in parallel, similar to each other. If we group like _JujuContext.notice.id and _JujuContext.juju_charm_dir, it might indicate that _JujuContext owns notice and juju_charm_dir, they are on the same level, and there is another object that is notice who owns id, while in fact it's not the case, they are all env vars.
  3. Last but not least, the source comes from here where it's just a list of key/value pairs with no grouping, so I think maybe it's better to stay the same.

2 Type

I put all fields as str in this draft, because:

  1. I wanted to put some fields as int in the first place but after investigating, only very few fields are actually non-str, like JUJU_SECRET_REVISION; and some values are processed based on the env var, like juju relation id. So, maybe it's easier to keep them as they are and handle the type conversion whenever needed (mostly aren't). There are on extra plus side and that is we don't need to do extra processing if the key is JUJU_SECRET_REVISION, keeping the from_env method simple without key-name related logic.
  2. Since the whole _JujuContext dataclass comes from env vars, it's intuitive to think all fields are strings.
  3. The source is all strings.

Some explanation on Optional or not, and default values, becaues it may seem strange at first glace, and these logic is not from Juju but from ops.main, how we are using them currently:

  1. Optional: Some fields are not Optional because in main.py the current usage of them is treating them like they surely exist (os.environ[key]), so I kept them as type str instead of Optional[str].
  2. Some fields must be Optional with None value because in the current logic there are tests like if key in os.environ.
  3. Some fields are used as args to pass to other things and they can be None so I left these type all as Optional that can be None.

chore: poc parsing juju env in one place

chore: refactor _JuJuContext to a dataclass
@tonyandrewmeyer
Copy link
Contributor

For what it's worth, I agree that it's not worth grouping (flat is better than nested). However, I do think it's worth doing the type conversion in the class, as well as any split type logic.

I'd also suggest that the juju_ prefix for the attribute names isn't needed, since it's implied by the fact that it's in _JujuContext.

@dimaqq
Copy link
Contributor

dimaqq commented Aug 13, 2024

Consider

	// relationId identifies the relation for which a relation hook is
	// executing. If it is -1, the context is not running a relation hook;
	// otherwise, its value must be a valid key into the relations map.
	relationId int

If we're going to cast or interpret the values, we gotta handle these special cases too. In this case, I think -1 ==> None.

Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's finalise the decision of what approach to use in the daily sync today.

Note that there are also a bunch of uses of os.environ and os.getenv is other files:

framework.py
655:        debug_at = os.environ.get('JUJU_DEBUG_AT')

jujuversion.py
104:        v = os.environ.get('JUJU_VERSION')

model.py
3234:        if 'JUJU_RELATION_ID' in os.environ and 'JUJU_REMOTE_APP' in os.environ:
3235:            event_relation_id = int(os.environ['JUJU_RELATION_ID'].split(':')[-1])

model.py
3161:        unit_name_ = unit_name or os.getenv('JUJU_UNIT_NAME')
3167:        self.model_name: str = model_name or typing.cast(str, os.getenv('JUJU_MODEL_NAME'))
3168:        self.model_uuid: str = model_uuid or typing.cast(str, os.getenv('JUJU_MODEL_UUID'))
3238:                return os.getenv('JUJU_REMOTE_APP') or None

ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
@IronCore864
Copy link
Contributor Author

After discussion, we will use a flat structure without nested objects, and we will put all fields as Optional[str] or Optional[int].

@IronCore864 IronCore864 changed the title feat: poc parsing juju env in one place feat: parse JUJU_* env in one place Aug 14, 2024
@IronCore864
Copy link
Contributor Author

Notes:

  1. _JujuContext.from_environ(dict(os.environ)) looks weird, because strictly speaking, it's not from_environ but from_dict. The name from_environ is used to be consistent with a few other usages in the code base; a Dict is used instead of os._Environ per previous code review suggestions to make it easy to be mocked in tests.
  2. Where to put _JujuContext: I think it makes sense to put it in main.py because it's first used there. But no matter where I put it, I can't bypass circular imports in jujuversion.py, so I left it where it's most logical and used TYPE_CHECKING to fix the import for typing checks. However, this breaks sphinx doc build. How can we solve this?

Changes:

_JujuContext:

  • Every field is of type Optional[str] or Optional[int] and defaults to None, according to the code review and our daily discussion.
  • The only two int-type fields are relation_id and secret_revision, where I did not handle the case that the env var's value exists but is not a number. Should we handle this case? Or let it throw a ValueError is OK?
  • A few more env vars used in model.py and JujuVersion are added, except JUJU_DEBUG_AT (used in framework.py) and OPERATOR_DISPATCH (used in main.py), because they are not in the source here and here.
  • _JujuContext is instantiated only once in _Manager, then passed to _Dispatcher and _ModelBackend, the latter also passes it to _Container. These classes will pass it to JujuVersion too.

JujuVersion:

  • The original from_environ method is renamed to from_context because it's not appropriate any more. from_juju_context was also considered, but JujuVersion.from_juju_context(juju_context) seemed too verbose and duplicated, hence JujuVersion.from_context(juju_context).
  • The original if v is None: v = '0.0.0 logic is removed; default value is handled now in _JujuContext.

model.py:

  • There were a few os.environ / os.getenv in it, all refactored to use _JujuContext.

test/*:

  • Add tests for JujuVersion.from_context.
  • Refactor tests to make them pass.
  • Add tests for _JujuContext.from_environ.

Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start, thanks. A few initial comments.

ops/jujuversion.py Outdated Show resolved Hide resolved
ops/jujuversion.py Outdated Show resolved Hide resolved
ops/jujuversion.py Outdated Show resolved Hide resolved
ops/model.py Outdated Show resolved Hide resolved
ops/model.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
test/test_model.py Outdated Show resolved Hide resolved
test/test_testing.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/model.py Outdated Show resolved Hide resolved
@IronCore864
Copy link
Contributor Author

IronCore864 commented Aug 16, 2024

db-charm-tests failed because the charm imports JujuVersion from ops.model instead of ops.jujuversion.

Issue created here, and a fixing PR submitted.

@IronCore864 IronCore864 marked this pull request as ready for review August 19, 2024 06:20
Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! A few small comments.

I tried this out on Juju 3.6, LXD, but only with a few events, not exhaustively. Everything looked good for the ones I tried.

ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/main.py Outdated Show resolved Hide resolved
ops/model.py Show resolved Hide resolved
ops/model.py Show resolved Hide resolved
test/test_model.py Outdated Show resolved Hide resolved
ops/jujuversion.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. Just a couple more requests for file moving (sorry for the back and forth!).

ops/jujuversion.py Outdated Show resolved Hide resolved
ops/jujucontext.py Show resolved Hide resolved
ops/jujucontext.py Outdated Show resolved Hide resolved
ops/model.py Show resolved Hide resolved
test/test_main.py Outdated Show resolved Hide resolved
test/test_main.py Outdated Show resolved Hide resolved
@benhoyt
Copy link
Collaborator

benhoyt commented Aug 21, 2024

Decision was to put JUJU_DEBUG_AT in _JujuContext.debug_at.

@IronCore864
Copy link
Contributor Author

IronCore864 commented Aug 21, 2024

All comments resolved. Major changes:

  • JUJU_DEBUG_AT added into ops/jujucontext.py as _JujuContext.debug_at, also added a test for it, ops/framework.py and tests refactored accordingly. Now when instantiating a Framework object, a _JujuContext.debug_at shall be passed in as a keyword argument (Optional for backward compatibility because some charms' test does import ops.Framework then instantiate it).
  • ops/jujuversion.py restored.

Copy link
Contributor

@sed-i sed-i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So happy to see this PR!

"""

@classmethod
def from_dict(cls, env: Mapping[str, Any]) -> '_JujuContext':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design question:

The half-baked idea I imagined was that from_dict returns a Union of types, and the caller always uses a match expression on the result. It would imply py3.10 (ubuntu 22.04) which isn't great, but I wonder (a) if it's a feasible design goal in the first place, and (b) if we should plan the design today with match in mind.

I.e. it would be great if this new class brought significant new added value over poking into os.environ directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, there is already some similar logic in main.py, specifically, in _get_event_args, where it tries to match the type of the event then pull related environment variables.

I think it would be great if we could somehow integrate it with JujuContext, maybe in the future we can build on top of this PR and implement that.

Regarding the scope of this PR, we had a short discussion today in the daily and we think this PR serves two purposes: 1, parse all ENV vars from a single place (previously it was scattered across main.py and framework.py); and 2, provide some unified object so that others can build on top of it if they want to implement some experiments.

So, for now, I'm sorry that there is no implementation of the event type matching thingy and we are going to merge this _JujuContext. Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The half-baked idea I imagined was that from_dict returns a Union of types, and the caller always uses a match expression on the result.

If the intention is to find out which event Juju has emitted, then it's much simpler to just look at that directly rather than trying to match against which context variables have been set. If the intention is to get back a collection of types (e.g. which fields are set are the same with different events - start and stop would be setting the same) then I'm not sure I really see the use-case.

I don't think it's good to have a hierarchy of context objects when there's already a 1:1 mapping of what that would be in the hierarchy of event classes. A SecretChangedJujuContext would only be useful for SecretChangedEvent, for example.

As @IronCore864 mentioned, _get_event_args provides the "given a context, what are the arguments needed to create an event object". It would be simple enough for someone to subclass _JujuContext to add a to_event method (or whatever), if they wanted to build an "alt ops" package.

What would be most useful in that sort of situation, I believe, is being able to create SecretChangedEvent (and on on) objects without opting in to the whole framework/handle system (and registering, in particular). My feeling is that's what we could provide (not this cycle) to continue making it easier for people to explore alternative ways to do event handling without re-implementing everything.

I.e. it would be great if this new class brought significant new added value over poking into os.environ directly.

"significant" is subjective, of course, but I think there is value in not just collecting all the uses together as @IronCore864 mentioned, but also doing all the conversion so that you have a bunch of Python objects rather than a bunch of strings. With this, plus the similar hook tool wrapping in the model/model backend (the border of those could be a bit cleaner) that's the main interface with Juju (other than Pebble).

Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

I'm fine with this as-is, but left a couple of small suggestions - feel free to discard those if you prefer.

test/test_framework.py Outdated Show resolved Hide resolved
ops/framework.py Outdated Show resolved Hide resolved
test/test_helpers.py Show resolved Hide resolved
@IronCore864 IronCore864 merged commit 5d74b82 into canonical:main Aug 23, 2024
32 checks passed
@IronCore864 IronCore864 deleted the parse-juju-env-vars branch August 23, 2024 02:33
tonyandrewmeyer added a commit to canonical/ops-scenario that referenced this pull request Aug 26, 2024
The next release of ops has some backwards-incompatible changes to
private methods. This PR does the minimum to keep Scenario working with
the current version of ops and the next release.

I'll open a ticket for doing a nicer version of this where the new
`_JujuContext` class is used (which would presumably mean requiring the
new version of ops). But this will let people continue upgrading their
ops as long as they're using the latest 6.x of Scenario.

The relevant ops PR is: canonical/operator#1313
samuelallan72 added a commit to canonical/charm-simple-streams that referenced this pull request Sep 3, 2024
As of canonical/operator#1313 ,
the test harness inserts JUJU_VERSION=0.0.0 into os.environ,
so we must account for this in the unit tests.
Comment on lines +41 to +42
if 'JUJU_VERSION' not in os.environ:
os.environ['JUJU_VERSION'] = '0.0.0'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IronCore864 why does the test framework need to manipulate the environment now? The new context appears to already pick a default if JUJU_VERSION is not present.

cc @gabrielcocenza

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IronCore864 why does the test framework need to manipulate the environment now? The new context appears to already pick a default if JUJU_VERSION is not present.

Note that this is the internal tests for ops themselves, not ops.testing (Harness). It seems like no-one else should really care what the internal ops tests are doing. Does this actually impact you, or is it just something you noticed while running the ops tests or similar?

It's being set here because _JujuContext expects the environment variable to be set, because all supported Juju versions so do.

Copy link
Member

@gabrielcocenza gabrielcocenza Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually impact you, or is it just something you noticed while running the ops tests or similar?

Yes, it's affecting our unit tests that are checking the arguments that were called in subprocess. See an example at canonical/charm-simple-streams#22

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it's not the code linked above (your tests will never run that), it's this similar code in ops.testing.

We failed to consider this 😞, and we must not have run this change against the full set of charms we know about 😞. It would indeed be better to have Harness not change the environment. It's not immediately clear to me what we should do now: we could change the behaviour back so that it doesn't, but if everyone that this impacts has then already changed their code by the time that fix gets out, then it's just doubling up the impact (although it is cleaner in general).

I'll talk with the team about this later today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, we're going to do an ops 2.16.1 change that reverts this behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that now it's fixed 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hook context: Add a clear boundary between ops and juju
8 participants