xapi failure to start after reboot should cause test failure #213

Open
ydirson opened this issue Mar 8, 2024 · 1 comment

ydirson commented Mar 8, 2024

A test can get stuck, apparently indefinitely, for example:

tests/storage/zfs_ng/test_zfs_sr.py::TestZFSSR::test_reboot[x.x.x.x-None-auto] 
-------------------------------------------------------------------------------------------------- live log call --------------------------------------------------------------------------------------------------
Mar 08 15:57:13 INFO Reboot host x.x.x.x
Mar 08 15:57:14 INFO Wait for host down
Mar 08 15:57:14 INFO Wait for host up

Observing the process tree during this time, we can see the ssh process for a while, as in:

  |   |   `-pytest,1683343 /usr/bin/pytest --hosts=x.x.x.x --sr-disk=auto tests/storage/zfs_ng/test_zfs_sr.py
  |   |       `-sh,1687561 -c ssh root@x.x.x.x -o "BatchMode yes" -o "StrictHostKeyChecking no" -o "LogLevel ERROR" -o "UserKnownHostsFile /dev/null" 'xe host-param-get uuid=71778983-5cc3-4f07-8cb0-ad10674e1e1f param-name=enabled'
  |   |           `-ssh,1687562 root@x.x.x.x -o BatchMode yes -o StrictHostKeyChecking no -o LogLevel ERROR -o UserKnownHostsFile /dev/null xe host-param-get uuid=71778983-5cc3-4f07-8cb0-ad10674e1e1f param-name=enabled

Then the ssh and sh processes disappear, but the test keeps waiting as if nothing had happened.
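
The behaviour above is consistent with a polling loop that has no deadline. As a minimal sketch of what a bounded wait could look like (this is not the actual xcp-ng-tests code; `wait_for`, `WaitTimeout` and all parameter names are hypothetical), a helper can keep the last error and raise once the deadline expires, so that the test fails instead of hanging:

```python
import time


class WaitTimeout(Exception):
    """Raised when a condition does not become true before the deadline."""


def wait_for(condition, description, timeout_secs=600, retry_delay_secs=5,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True, or raise WaitTimeout.

    Exceptions raised by `condition` (for instance a failing `xe` call while
    xapi is not running) count as "not ready yet". The last one is kept so
    it can be reported if the deadline expires.
    """
    deadline = clock() + timeout_secs
    last_error = None
    while True:
        try:
            if condition():
                return
        except Exception as exc:  # hypothetical: a real helper may be narrower
            last_error = exc
        if clock() >= deadline:
            raise WaitTimeout(
                f"{description}: not satisfied after {timeout_secs}s "
                f"(last error: {last_error!r})"
            )
        sleep(retry_delay_secs)
```

With such a helper, a "Wait for host up" step whose probe keeps failing would end in a `WaitTimeout` that carries the last probe error, which pytest would report as a test failure. The `clock` and `sleep` parameters are only there so the helper can be exercised without real delays.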

Then, when the test is interrupted, the Python traceback mentions both the ^C and the ssh error, which apparently was caught but not correctly handled:

-------------------------------------------------------------------------------------------------- live log call --------------------------------------------------------------------------------------------------
Mar 08 15:57:13 INFO Reboot host x.x.x.x
Mar 08 15:57:14 INFO Wait for host down
Mar 08 15:57:14 INFO Wait for host up

^C--------------------------------------------------------------------------------------------- live log sessionfinish ----------------------------------------------------------------------------------------------
Mar 08 16:37:56 INFO << Destroy VM
Mar 08 16:37:57 INFO Will attempt SR destroy on 25271565-e594-5f8c-d4fb-41be38a2429c...
Mar 08 16:37:57 INFO Scan SR 25271565-e594-5f8c-d4fb-41be38a2429c
Mar 08 16:37:57 INFO Restore yum state for host x.x.x.x

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/home/user/src/xcp-ng-tests/lib/commands.py:111: KeyboardInterrupt
(to show a full traceback on KeyboardInterrupt use --full-trace)
Traceback (most recent call last):
  File "/usr/bin/pytest", line 8, in <module>
    sys.exit(console_main())
             ^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/_pytest/config/__init__.py", line 190, in console_main
    code = main()
           ^^^^^^
  File "/usr/lib/python3/dist-packages/_pytest/config/__init__.py", line 167, in main
    ret: Union[ExitCode, int] = config.hook.pytest_cmdline_main(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pluggy/_hooks.py", line 265, in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pluggy/_manager.py", line 80, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 60, in _multicall
    return outcome.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pluggy/_result.py", line 60, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 39, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/_pytest/main.py", line 317, in pytest_cmdline_main
    return wrap_session(config, _main)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/_pytest/main.py", line 305, in wrap_session
    config.hook.pytest_sessionfinish(
  File "/usr/lib/python3/dist-packages/pluggy/_hooks.py", line 265, in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pluggy/_manager.py", line 80, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 55, in _multicall
    gen.send(outcome)
  File "/usr/lib/python3/dist-packages/_pytest/terminal.py", line 808, in pytest_sessionfinish
    outcome.get_result()
  File "/usr/lib/python3/dist-packages/pluggy/_result.py", line 60, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 39, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 106, in pytest_sessionfinish
    session._setupstate.teardown_exact(None)
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 530, in teardown_exact
    raise exc
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 523, in teardown_exact
    fin()
  File "/usr/lib/python3/dist-packages/_pytest/fixtures.py", line 685, in <lambda>
    subrequest.node.addfinalizer(lambda: fixturedef.finish(request=subrequest))
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/_pytest/fixtures.py", line 1037, in finish
    raise exc
  File "/usr/lib/python3/dist-packages/_pytest/fixtures.py", line 1030, in finish
    func()
  File "/usr/lib/python3/dist-packages/_pytest/fixtures.py", line 917, in _teardown_yield_fixture
    next(it)
  File "/home/user/src/xcp-ng-tests/tests/storage/zfs_ng/conftest.py", line 34, in vm_on_zfs_sr
    vm.destroy(verify=True)
  File "/home/user/src/xcp-ng-tests/lib/vm.py", line 168, in destroy
    if not self.is_halted():
           ^^^^^^^^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/vm.py", line 29, in is_halted
    return self.power_state() == 'halted'
           ^^^^^^^^^^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/vm.py", line 23, in power_state
    return self.param_get('power-state')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/basevm.py", line 18, in param_get
    return _param_get(self.host, BaseVM.xe_prefix, self.uuid, param_name, key, accept_unknown_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/common.py", line 149, in _param_get
    value = host.xe(f'{xe_prefix}-param-get', args)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/host.py", line 87, in xe
    result = self.ssh(
             ^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/host.py", line 57, in ssh
    return commands.ssh(self.hostname_or_ip, cmd, check=check, simple_output=simple_output,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/src/xcp-ng-tests/lib/commands.py", line 150, in ssh
    raise result_or_exc
lib.commands.SSHCommandFailed: SSH command (xe vm-param-get uuid=863580b2-adb6-a73b-b399-dbeee06d675a param-name=power-state) failed with return code 1: Error: Connection refused (calling connect )

ydirson commented Mar 11, 2024

I should have taken the time to analyze the backtrace: it is the test teardown that gets "Connection refused", as does any attempt to rerun the test afterwards, while at the same time I have no issue connecting to the target VM.

"Connection refused" indeed comes from xe, not from ssh.

ydirson changed the title from "ssh failures not properly handled" to "xapi failure to start after reboot should cause test failure" on Mar 11, 2024
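
The distinction made in the second comment (the error comes from xe, not from ssh) can also be read off the return code. According to the ssh(1) manual, ssh exits with the exit status of the remote command, or with 255 if an error occurred in ssh itself. The sketch below is only an illustration of that rule; `classify_ssh_result` and its labels are hypothetical names, not part of xcp-ng-tests:

```python
# ssh(1): "ssh exits with the exit status of the remote command
# or with 255 if an error occurred."
SSH_TRANSPORT_ERROR = 255


def classify_ssh_result(returncode):
    """Tell apart ssh transport failures from remote command failures.

    Caveat: a remote command that itself exits with 255 cannot be
    distinguished from an ssh failure by the return code alone.
    """
    if returncode == 0:
        return "ok"
    if returncode == SSH_TRANSPORT_ERROR:
        return "ssh-failed"
    return "remote-command-failed"
```

The trace above reports return code 1, which this rule classifies as a failure of the remote `xe` command rather than of the ssh connection.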