Client method to dump cluster state #5470
Thanks for putting this together @fjetter. Overall this looks like a nice addition
distributed/core.py
Outdated
def to_dict(
    self, comm: Comm = None, *, exclude: Container[str] = None
) -> dict[str, str]:
Do we want users directly interacting with this (and other) to_dict methods? If not, I'd prefer to prepend a leading underscore to the method name. My sense is Client.dump_cluster_state is the main user-facing entrypoint for this feature.
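To make the naming question concrete, here is a minimal sketch of the convention being proposed: a private `_to_dict` helper with an `exclude` argument, and a single public method that delegates to it. `ToyWorker`, `ToyClient`, and `dump_state` are hypothetical names invented for illustration. They are not classes or methods from distributed, and the real signature of `Client.dump_cluster_state` is not reproduced here.

```python
from __future__ import annotations

from collections.abc import Container


class ToyWorker:
    """Hypothetical stand-in for a cluster component (not a distributed class)."""

    def __init__(self, address: str, nthreads: int) -> None:
        self.address = address
        self.nthreads = nthreads
        self.log = ["started"]

    def _to_dict(self, *, exclude: Container[str] = ()) -> dict:
        # Leading underscore: internal helper, not part of the public API.
        info = {
            "type": type(self).__name__,
            "address": self.address,
            "nthreads": self.nthreads,
            "log": list(self.log),
        }
        # Drop any top-level keys the caller asked to exclude.
        return {k: v for k, v in info.items() if k not in exclude}


class ToyClient:
    """Hypothetical client exposing one public entrypoint."""

    def __init__(self, workers: list[ToyWorker]) -> None:
        self._workers = workers

    def dump_state(self, *, exclude: Container[str] = ()) -> dict:
        # Public method: the only thing users are expected to call.
        return {
            "workers": {
                w.address: w._to_dict(exclude=exclude) for w in self._workers
            }
        }


client = ToyClient([ToyWorker("tcp://127.0.0.1:1234", nthreads=2)])
state = client.dump_state(exclude={"log"})
```

With this layout, renaming or reshaping `_to_dict` later does not break user code, since only `dump_state` is treated as public.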
Would including the output of
Now it works
Added the version, good point. I think the only topic left to address is whether or not we prefix Friendly ping @jrbourbeau if that's ok for you
Thanks for all the updates @fjetter. For the sake of trying to be more intentional about our public API, I'd prefer to use leading underscores. I pushed a small commit to make the
This adds a client method to dump the entire cluster state to a file for debugging purposes. This has been incredibly handy for the deadlock scenarios I've been debugging recently.
This method is called automatically when a test runs into a timeout, and the content is persisted as part of a GH artefact. This should help us debug spurious, flaky test failures.
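As a rough illustration of the dump-on-timeout idea, the sketch below wraps a coroutine with `asyncio.wait_for` and writes a state snapshot to disk before re-raising the timeout. It is not the test harness from this PR: `run_with_state_dump`, `hangs_forever`, and `fake_state` are hypothetical helpers, and the file is written as gzip-compressed JSON only to keep the example within the standard library.

```python
import asyncio
import gzip
import json
import os
import tempfile


async def run_with_state_dump(coro, get_state, path, timeout):
    """Await ``coro``; if it times out, write ``get_state()`` to ``path``."""
    try:
        return await asyncio.wait_for(coro, timeout)
    except asyncio.TimeoutError:
        # Persist the state first, then re-raise so the failure stays visible.
        with gzip.open(path, "wt", encoding="utf-8") as f:
            json.dump(get_state(), f)
        raise


async def hangs_forever():
    # The event is never set, which simulates a deadlocked test.
    await asyncio.Event().wait()


def fake_state():
    # Hypothetical snapshot; a real dump would hold scheduler and worker state.
    return {"scheduler": {"tasks": {"x": "waiting"}}, "workers": {}}


dump_path = os.path.join(tempfile.mkdtemp(), "cluster_dump.json.gz")
try:
    asyncio.run(run_with_state_dump(hangs_forever(), fake_state, dump_path, 0.1))
    timed_out = False
except asyncio.TimeoutError:
    timed_out = True

# Read the dump back, as one would when inspecting a CI artefact.
with gzip.open(dump_path, "rt", encoding="utf-8") as f:
    recovered = json.load(f)
```

Re-raising after the dump keeps the test failing as before; the only change is that a file is left behind for CI to upload.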
I implemented the test dump as YAML for readability, but for real-world examples YAML is not well suited. In my experience these dumps can grow to several MB or GB, and a feasible approach, so far, has been to use msgpack + gzip. None of that is set in stone.
The implementation is not very elegant, but I added a to_dict method, similar to but more verbose than identity, to most relevant classes. If somebody has an idea about a more elegant approach, I'm all ears.
Example output