Automatic cluster state export upon test timeout #5437

Closed
fjetter opened this issue Oct 18, 2021 · 1 comment
Labels
enhancement Improve existing functionality or make things work better flaky test Intermittent failures on CI.

Comments

@fjetter
Member

fjetter commented Oct 18, 2021

We are frequently bothered by unit tests timing out. Most of the timeouts are not easily reproducible and are hard to debug. In the past, these timeouts could often be linked to a known issue that deadlocked the cluster. Such a deadlock can often be investigated by inspecting the cluster state.

For manual extraction of the cluster state, I once wrote #5068, which tries to create a clean, serializable representation of the cluster.

Upon test timeout, this cluster dump could be persisted and archived as an artifact of the GitHub Actions runner.

For instance, if the gen_cluster fixture is used with a client, the timeout exception could be handled and the state persisted; see

    try:
        future = func(*args, *outer_args, **kwargs)
        future = asyncio.wait_for(future, timeout)
        result = await future
        if s.validate:
            s.validate_state()
    finally:
        if client and c.status not in ("closing", "closed"):
            await c._close(fast=s.status == Status.closed)
        await end_cluster(s, workers)
        await asyncio.wait_for(cleanup_global_workers(), 1)
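
To make the idea concrete, here is a minimal sketch of how the timeout could be intercepted before the `finally` block tears the cluster down. It is not the gen_cluster implementation: `run_with_state_dump`, `get_state`, and `dump_path` are hypothetical names, and `get_state` stands in for whatever produces the serializable cluster representation (for example the helper from #5068).

```python
import asyncio
import json
from pathlib import Path


async def run_with_state_dump(coro, timeout, get_state, dump_path):
    """Await ``coro``; on timeout, write ``get_state()`` to ``dump_path`` and re-raise.

    ``get_state`` is a hypothetical callable returning a serializable
    dict; it is an assumption of this sketch, not a distributed API.
    """
    try:
        return await asyncio.wait_for(coro, timeout)
    except asyncio.TimeoutError:
        # Persist the state while the cluster is still alive, i.e. before
        # any cleanup in an enclosing ``finally`` block runs.
        path = Path(dump_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        # ``default=str`` is a crude fallback for non-JSON values.
        path.write_text(json.dumps(get_state(), indent=2, default=str))
        # Re-raise so the test is still reported as timed out.
        raise
```

The exception is re-raised on purpose, so the test still fails and only gains a dump file. A CI job could then upload the directory containing `dump_path` as a build artifact.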

@fjetter fjetter added enhancement Improve existing functionality or make things work better flaky test Intermittent failures on CI. labels Oct 18, 2021
@fjetter fjetter self-assigned this Oct 26, 2021
@jrbourbeau
Member

Closed via #5470
