Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify protocol for reporting execution errors #24

Closed
wants to merge 1 commit into from

Conversation

takluyver
Copy link
Member

There are currently two ways for kernels to report an error during execution. Many frontends and kernels have code to use both, to deal with the other end having only used one way.

This proposal is to simplify how errors are reported by deprecating some parts of the message protocol, and then removing them in protocol version 6.

Copy link
Contributor

@ellisonbg ellisonbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a sense of how many/which kernels are using the things that will be deprecated?

@ivanov
Copy link
Member

ivanov commented Oct 17, 2017

We need to proceed with caution here. It seems to me that without an error message type sent over IOPub, there is no reliable (machine readable) way for clients that did not send off code for execution to know that an error occurred (since only the client that started the execution will get the execute_reply). I agree that the situation could be improved here, but I'm not sure this is better. Switching on message types simplifies logic in clients, whereas now we'll need to parse display message on clients to get error status.

@takluyver
Copy link
Member Author

Do we have a sense of how many/which kernels are using the things that will be deprecated?

My guess is that most of them are - see the motivation section. At least IPython, IRkernel and bash_kernel send both error indications. With careful use of protocol versioning, this shouldn't be an obstacle.

without an error message type sent over IOPub, there is no reliable (machine readable) way for clients that did not send off code for execution to know that an error occurred

I thought about this, but I don't know of any actual use cases where a client that didn't send off the code needs to know that there was an error. I'm guessing you have some in mind?

Eliminating the duplication is what I'm after - I can design that the other way, by only reporting errors in IOPub messages, but I think that way would involve more disruptive changes to the protocol.

@blink1073
Copy link
Contributor

cc @dsblank. We are currently using the error message type for errors emitted from the %%python magic in metakernels.

@dsblank
Copy link
Member

dsblank commented Oct 17, 2017

We have a few kernels written in metakernel that do it one way, or the other. It would be good to clean this up.

@ellisonbg
Copy link
Contributor

@ivanov makes a good point about the IOPub message being useful for other clients.

I do think it makes sense to have a single visual representation of the error and that probably makes sense to be on the IOPub channel so other clients can represent it to the user. Are there reasons to not put it on that channel? Would that include the traceback info that is currently returned in the execute_reply message?

If the visual representation is on the IOPub message, I think it does make sense to remove the traceback info from the execute_reply message.

@takluyver
Copy link
Member Author

In this proposal, the visual representation would be a standard display_data message containing the traceback, separate from the machine-readable indication that an error occurred.

I'm happy to modify it so that the visual representation is also the machine-readable indication, but I'd like to see a concrete use case for other clients knowing that an error has occurred first. In all the cases I know of where a client is listening to execution it didn't request, it just displays input and output.

@ellisonbg
Copy link
Contributor

ellisonbg commented Oct 17, 2017 via email

@takluyver
Copy link
Member Author

If we're going to require metadata indicating that it's an error, I'd like to remove the status field from execute_reply; otherwise we're back to having two redundant signals of an error, which is exactly the problem I'm trying to solve.

I think doing it that way is a more disruptive change - that's why I'm trying to establish if there's a real use case that necessitates it.

@ivanov
Copy link
Member

ivanov commented Oct 18, 2017

I think hijacking display data messages for the purposes of error status reporting is the wrong direction. An error message type makes it simple to reason about errors. We have an implementation where we're machine reading the error messages to provide some UI functionality. In short, for a certain error type, we show a specific UI workflow to overcome that error.

I can see how the "status" field of execute reply can be seen as redundant - since it's possible to infer wether the execution was OK or produced an error from the other keys (ename, evalue, and traceback), - though the way the messaging spec is written now, those keys "should" be there on error, not that they "must" be there. The presence of the "status" field just makes it easy to reason about the end result.

I don't like having drift in the specifics of our implementation that doesn't buy us anything - I think the cost of changing the protocol for this redundancy is greater than the benefit it would result in, in this case.

@rgbkrk
Copy link
Member

rgbkrk commented Oct 18, 2017

I completely agree with Paul that the cost of changing the protocol for this redundancy is much greater than the benefit. Handling this with asynchronous frontends is pretty simple, since we can assume either will occur on the stream of messages. Adapting messages is quite a bit more work, as is getting all kernels to adapt to a major bump.

@ellisonbg
Copy link
Contributor

ellisonbg commented Oct 18, 2017 via email

@ivanov
Copy link
Member

ivanov commented Oct 18, 2017

@ellisonbg - no, we currently do not, but I would be in favor of extending the protocol along that dimension.

@ellisonbg
Copy link
Contributor

ellisonbg commented Oct 18, 2017 via email

@Zsailer
Copy link
Member

Zsailer commented Mar 4, 2024

Hi @takluyver 👋—Zach from the @jupyter/software-steering-council here.

We're working through old JEPs and closing proposals that are no longer active or may not be relevant anymore. Under Jupyter's new governance model, we have an active Software Steering Council who reviews JEPs weekly. We are catching up on the backlog now. Since there has been no active discussion on this JEP in awhile, I'd propose we close it here (we'll leave it open for two more weeks in case you'd like to revive the conversation). If you would like to re-open the discussion after we close it, you are welcome to do that too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants