-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logs more VM state info in updater #523
Conversation
Drat, I neglected to update the mocked calls, which I had plenty of time to do while the updater was running. Will pick this back up tomorrow. |
Have not revisited the mocked tests yet, which need to be updated due to the |
I'll apply this locally once it's rebased. |
In an attempt to understand the specific failures during the various actions the GUI updater runs against VMs: * qvm-start * qvm-shutdown * qvm-kill Let's make sure to capture the stderr of the CLI action. We should switch to using the Python API for most of these actions, at which time we'll have to update the exception handling, but for now this'll give us more visibility into the specific failures in the logs.
In the graphical updater, let's log the state of all running VMs immediately prior to running `qvm-start`. That will help us evaluate we should alter the reboot logic to avoid problems. Really, this should be a "debug" line, not an "info" line, but since the default logging is "info" I've left it as-is. The original request was that we log the output of `xl list`, which has more informative output, but it's inscrutable in log lines so I'm taking a simpler approach that should prove as useful.
2aad118
to
6836eb5
Compare
Rebased. Ready for review. |
running_vms = subprocess.check_output( | ||
["qvm-ls", "--running", "--raw-list"], stderr=subprocess.PIPE | ||
) | ||
sdlog.info("VMs running before start of {}: {}".format(vm, running_vms)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will produce logs like:
2020-04-06 15:12:53,712 - sdw_updater_gui.Updater:463(_safely_start_vm) INFO: VMs running before start of sd-app: b'dom0\nsd-gpg\nsd-log\nsd-proxy\nsd-whonix\nsys-firewall\nsys-net\nsys-usb\nsys-whonix\nwork\n'
I would recommend converting the list of VMs to a comma-separated list for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. New format looks like:
2020-04-06 16:13:37,826 - sdw_updater_gui.Updater:463(_safely_start_vm) INFO: VMs running before start of sd-app: ['dom0', 'fpf-infra', 'fpf-vault', 'fpf-vpn', 'fpf-web', 'sd-dev', 'sd-devices', 'sd-gpg', 'sd-kernel-builder', 'sd-log', 'sd-proxy', 'sd-whonix', 'sys-firewall', 'sys-firewall-inner', 'sys-net', 'sys-usb', 'sys-whonix']
Running with this patch now; so far logs look good, but am unable to reproduce the error reliably at this point. |
The change here: 74ab8c4 inadvertently broke the updater: the removal of the
And now, if you'll allow me a moment to eat my hat, I propose we back out those formatting changes and live with the single-line bytes-formatted string for stability. |
Heh, I'll squint a little bit every time I see one of those lines |
74ab8c4
to
6836eb5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @conorsch and @kushaldas , these changes work as expected. Small suggestion inline, but also good to merge as-is to start collecting more information.
@@ -446,19 +447,27 @@ def shutdown_and_start_vms(): | |||
|
|||
def _safely_shutdown_vm(vm): | |||
try: | |||
subprocess.check_call(["qvm-shutdown", "--wait", vm]) | |||
subprocess.check_output(["qvm-shutdown", "--wait", vm], stderr=subprocess.PIPE) | |||
except subprocess.CalledProcessError as e: | |||
sdlog.error("Failed to shut down {}".format(vm)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would adding a log line about information about running VMs here make sense as well? This would help debug situations where a VM can't shutdown due to a VM depending on it being still powered on. In the current workstation case, no VM will be a netVM, but it can be a receiver of qubes-rpc calls, which will cause it to power on again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, would rather have the logs than not. Will add momentarily!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neglected to add more logging info as suggested. Merging now so devs get the more verbose logs pronto, will continue to look for opportunities to expand logging in follow-ups.
…dater Logs more VM state info in updater
Status
Ready for review
Description of Changes
Fixes #521. Refs #498
Changes proposed in this pull request:
qvm-start
commandsqvm-*
commands, although not quite as useful as we might hope, see Improve logging of updater failure cases #521 (comment)For comparison:
Before:
After:
Testing
Confirm you see the more verbose logs, particularly the list of all running VMs.
Checklist
If you have made code changes
make flake8
) passes in the development environment (this box maybe left unchecked, as
flake8
also runs in CI)If you have made changes to the provisioning logic
All tests (
make test
) pass indom0
of a Qubes installThis PR adds/removes files, and includes required updates to the packaging
logic in
MANIFEST.in
andrpm-build/SPECS/securedrop-workstation-dom0-config.spec