Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error messaging #105

Closed
jbaublitz opened this issue Dec 7, 2020 · 9 comments · Fixed by #157
Closed

Improve error messaging #105

jbaublitz opened this issue Dec 7, 2020 · 9 comments · Fixed by #157
Assignees
Milestone

Comments

@jbaublitz
Copy link
Owner

Created due to #103

@phip1611 I'm envisioning the following for the SerError and DeError structs which generated the error that was confusing for you. Because a lot of these errors are autogenerated in the serialize and deserialize macros, how does this sound? SerError and DeError will include the type of the object being serialized or deserialized, an optional struct field if it's a struct being serialized or deserialized, and what the cause of the error was. So an error in the case of your example would look something like this:

Deserializing field error of struct Nlmsgerr could not be completed: the end of the deserialization buffer was reached before the deserialization operation could be completed; the packet received may be incomplete

Here I'm assuming that you included no payload whatsoever after the netlink header so deserialization would have failed on the deserialization of Nlmsgerr here on this line.

Would that error message be clearer?

@jbaublitz jbaublitz self-assigned this Dec 7, 2020
@jbaublitz jbaublitz added this to the 0.5.2 milestone Dec 7, 2020
@phip1611
Copy link
Contributor

phip1611 commented Dec 7, 2020

SerError and DeError will include the type of the object being serialized or deserialized, an optional struct field if it's a struct being serialized or deserialized, and what the cause of the error was.

This sounds perfectly right. This will help to either identify problems in a custom netlink implementation (when writing a custom kernel module) or perhaps to find some other neli-bugs. @jbaublitz

@jbaublitz
Copy link
Owner Author

Okay great! I'll put up a PR once I finished the design for you to look over.

@jbaublitz
Copy link
Owner Author

I'm still working on this as is requires more information to be passed to the macros for error messages. Because I consider macros part of the API, I will need to make some rather invasive changes to accomplish that.

@phip1611
Copy link
Contributor

Don't worry, take your time. Happy holidays/merry christmas and thanks for your effort! 🎅🎄 @jbaublitz

@phip1611
Copy link
Contributor

phip1611 commented Jan 4, 2021

As mentioned here #116: It would be super useful if neli could also tell at which offset it failed to deserialize a message or give more details about the bytes that could be deserialized and the ones that could not be deserialized. But I imagine this would be hard to implement with the current architecture, right?

For example nlmsg_len is 6 bytes but the struct nlmsg_error/the attribute ... has a length of ... bytes

Current state of the error message in neli when deserialization fails:
Deserializing type Nlmsghdr could not be completed: the deserialization operation completed before all of the data in the buffer could be consumed; the packet being deserialized may be of the wrong type or an incorrectly sized deserialization buffer may have been provided to the deserialization operation

But why? Information what bytes are missing or what bytes are surplus would be extremely helpful. Not only to discover possible bugs in custom netlink families but also in neli.

@jbaublitz
Copy link
Owner Author

I'm going to keep working on this but follow along with #101. So far, it's looking promising and will allow much better logging in this regard for debugging purposes.

@jbaublitz jbaublitz modified the milestones: 0.5.3, 0.5.4 Feb 21, 2021
@jbaublitz
Copy link
Owner Author

@phip1611 Is the improved logging sufficient for debugging?

@phip1611
Copy link
Contributor

phip1611 commented Nov 9, 2021

@jbaublitz hm, any idea how I can easily provoke a parsing error, so that I can verify?

PS: I played a little bit around with the latest neli version. Looks good and error handling (kernel replies with error) works fine too, as expected.

@jbaublitz
Copy link
Owner Author

@phip1611 One way I can think of pretty easily is to serialize an Nlmsghdr and then artificially increase nl_len in the binary representation and deserialize the buffer. You want trace logging enabled.

My main thoughts on the logging is that it may be able to be improved and if you have feedback on what you'd like to see, I'll immediately put out a patch release with logging for, for example, the part of the slice that caused the error. I didn't know how much of the slice people would want to see, etc. so I'd really like to get more concrete feedback from the people who want this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants