Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of empty command lists in Autoinstall YAML #2072

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

penguin359
Copy link

@penguin359 penguin359 commented Sep 2, 2024

This fixes an issue with an Autoinstall user-data YAML such as the following:

#cloud-config
autoinstall:
  version: 1
  late-commands:
    # Only entry commented out
    #- echo "Hello" > /etc/motd

This is being tracked in the following bug:
https://bugs.launchpad.net/subiquity/+bug/2078658

A test case was also added that demostrates the crash and verifies the fix.

  • Disabled some prints cluttering up test output
  • Added test cases for empty command lists
  • Don't attempt to enumerate over an empty command list

@Chris-Peterson444
Copy link
Contributor

What's the motivation for allowing an empty command list section? I would prefer it if we instead improved the validation error messaging by improving the json schema for the CmdListController class to not allow an empty object. An empty section feels unintentional to me (you can always just comment out the whole block).

Unrelated to the code changes: Have you signed the Canonical CLA? The CLA bot says no but sometimes a manual nudge is necessary to get it happy.

Copy link
Contributor

@Chris-Peterson444 Chris-Peterson444 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: we typically use conventional commits. e.g. "ci: add test case for empty cmdlist" or similar

subiquity/tests/api/test_api.py Outdated Show resolved Hide resolved
subiquity/tests/api/test_api.py Outdated Show resolved Hide resolved
@penguin359
Copy link
Author

@Chris-Peterson444 The reason is that the current behavior causes the installer to simply crash with an ugly backtrace that doesn't indicate anywhere in the output what line of the configuration file it is choking on or even that it was the autoinstall configuration. Even the validator does not tell you more than "NoneType is not an iterable" with no indication of what YAML key or line in the file was bad. That is a bad user experience that even a seasoned Linux veteran, which I consider myself, took more than an hour trying to figure out what was wrong. It was a simple fix, but this two-line addition to the logic makes it merely ignore those empty keys. Even if this isn't considered the correct fix, I would prefer a fix that at least indicated the line and/or key that is causing the validation to fail.

I have tried to submit the CLA, but I don't know what to fill in for the field: "Please add the Canonical Project Manager or contact". It won't let me submit it without that value. What should I put in there?

@penguin359
Copy link
Author

nitpick: we typically use conventional commits. e.g. "ci: add test case for empty cmdlist" or similar

As a "new to Canonical" contributor, I am happy to accept nitpicks. I can update that commit message in my next rebase.

@penguin359
Copy link
Author

If you do want this to be changed to a validation error, I can work on that. It will just take a little longer as I need to dig a little deeper into the guts of Subiquity.

@Chris-Peterson444
Copy link
Contributor

Chris-Peterson444 commented Sep 3, 2024

Indeed, the error messaging isn't great yet and very non-obvious for some cases, such as this one. The validator needs some work and thank you for spending the time to track this down and make it better!

I would suggest something like this:

diff --git a/subiquity/server/controllers/cmdlist.py b/subiquity/server/controllers/cmdlist.py
index 2729da5c..4f77e7df 100644
--- a/subiquity/server/controllers/cmdlist.py
+++ b/subiquity/server/controllers/cmdlist.py
@@ -55,6 +55,7 @@ class CmdListController(NonInteractiveController):
             "type": ["string", "array"],
             "items": {"type": "string"},
         },
+        "not": {},
     }
     builtin_cmds: Sequence[Command] = ()
     cmds: Sequence[Command] = ()

which will at least result in providing the offending section. Using your CI autoinstall as an example:

$ ./scripts/validate-autoinstall-user-data.py --no-expect-cloudconfig autoinstall.yaml
Malformed autoinstall in 'error-commands' section
Failure: The provided autoinstall config failed validation

or

$ ./scripts/validate-autoinstall-user-data.py -vvv --no-expect-cloudconfig autoinstall.yaml
start: subiquity/load_autoinstall_config:
start: subiquity/load_autoinstall_config/read_config:
finish: subiquity/load_autoinstall_config/read_config: SUCCESS:
start: subiquity/Reporting/load_autoinstall_data:
finish: subiquity/Reporting/load_autoinstall_data: SUCCESS:
start: subiquity/Integrity/load_autoinstall_data:
finish: subiquity/Integrity/load_autoinstall_data: SUCCESS:
start: subiquity/Error/load_autoinstall_data:
finish: subiquity/Error/load_autoinstall_data: FAIL: Malformed autoinstall in 'error-commands' section
finish: subiquity/load_autoinstall_config: FAIL: Malformed autoinstall in 'error-commands' section
Malformed autoinstall in 'error-commands' section
Traceback (most recent call last):
  File "/home/cpete/software/subiquity/scripts/../subiquity/server/controller.py", line 62, in validate_autoinstall
    jsonschema.validate(ai_data, self.autoinstall_schema)
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 1080, in validate
    raise error
jsonschema.exceptions.ValidationError: [] should not be valid under {}

Failed validating 'not' in schema:
    {'items': {'items': {'type': 'string'}, 'type': ['string', 'array']},
     'not': {},
     'type': 'array'}

On instance:
    []

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/cpete/software/subiquity/./scripts/validate-autoinstall-user-data.py", line 170, in verify_autoinstall
    app.load_autoinstall_config(only_early=True, context=None)
  File "/home/cpete/software/subiquity/scripts/../subiquitycore/context.py", line 159, in decorated_sync
    return meth(self, **kw)
           ^^^^^^^^^^^^^^^^
  File "/home/cpete/software/subiquity/scripts/../subiquity/server/server.py", line 750, in load_autoinstall_config
    self.controllers.Error.setup_autoinstall()
  File "/home/cpete/software/subiquity/scripts/../subiquity/server/controller.py", line 91, in setup_autoinstall
    self.validate_autoinstall(ai_data)
  File "/home/cpete/software/subiquity/scripts/../subiquity/server/controller.py", line 71, in validate_autoinstall
    raise new_exception from original_exception
subiquity.server.autoinstall.AutoinstallValidationError: Malformed autoinstall in 'error-commands' section
Failure: The provided autoinstall config failed validation

There is perhaps a more elegant way to say "not empty" but this was my first approach.

I have tried to submit the CLA, but I don't know what to fill in for the field: "Please add the Canonical Project Manager or contact". It won't let me submit it without that value. What should I put in there?

I actually don't know to be honest. I suppose since you're contributing to subiquity you could put "subiquity" down, but I'm not sure this field matters for individual contributors.

@penguin359
Copy link
Author

I'll also add in the user experience that lead me to this issue.

While attempting to automate my install process, I had a small autoinstall file with a single entry in late-commands. Something like this:

autoinstall:
  version: 1
  ...
  late-commands:
    - echo "Hello"

The exact command was different. I then made quite a number of changes including commenting out that command with a pound sign, but I did not comment out the key in front of it. After that, I found my installer was crashing and my initial suspicions were related to other changes I had made. It took me some time to realize that the crash was due to this file:

autoinstall:
  version: 1
  ...
  late-commands:
    #- echo "Hello"

@Chris-Peterson444
Copy link
Contributor

Thank you for sharing! This is exactly the kind of scenarios I want to avoid.

It took me some time to realize that the crash was due to this file

Yeah autoinstall errors can be elusive. We've done some work to make the runtime errors more obvious but the easiest ones to catch are ones like this where we can screen for them at validation time.

Currently, an Autoinstall YAML with an empty command-list will cause the
installer to crash or validate-autoinstall-user-data.py to exit without
a clear error message indicating where the issue is or what the correct
resolution might be.
This avoids a crash in both the installer and the validation script when
a command list such as early-commands is provided in the Autoinstall
YAML, but no entries are included. This causes the value of that key to
be None instead of an empty list and triggers an exception in
load_autoinstall_data(). Ignoring that value causes it to behave as if
that command list was not provided at all.
@penguin359
Copy link
Author

I have tried to submit the CLA, but I don't know what to fill in for the field: "Please add the Canonical Project Manager or contact". It won't let me submit it without that value. What should I put in there?

I actually don't know to be honest. I suppose since you're contributing to subiquity you could put "subiquity" down, but I'm not sure this field matters for individual contributors.

For future note, I think that contact comes from this list: https://canonical.com/projects/directory. I have sent Matthieu an email.

@penguin359 penguin359 changed the title Fix handling of empty command lists in Autoinstall YAML Improve handling of empty command lists in Autoinstall YAML Sep 3, 2024
@penguin359
Copy link
Author

If you don't want to merge this until it's changed to a validation error, feel free to change it to a Draft PR.

@Chris-Peterson444
Copy link
Contributor

If you don't want to merge this until it's changed to a validation error, feel free to change it to a Draft PR.

Sure, let's do that then. I think something along the lines I suggested would be the right direction. A change to the jsonschema along with some unittests to verify the behavior would be good. You can use the exact change I suggested, or something different if you prefer.

Some tips: validation happens via a call to a controller's validate_autoinstall method, which is defined here. It's not a perfect example, but this test may be worth a read for an idea of how to setup the test. Although the real tests should land in test_cmdlist.py. If you'd like some more concrete pointers or have any questions, please feel free to ask!

@Chris-Peterson444 Chris-Peterson444 marked this pull request as draft September 3, 2024 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants