DRAFT: Dead letter implementations #244
Conversation
// TODO: bring in Lists and Maps classes library to get rid of the SuppressWarnings annotation
@SuppressWarnings("RegexpSingleline")
public class ExceptionHandlingTransform implements Transformation<SinkRecord> {
Very basic implementation just to demonstrate what I'm thinking of in terms of using an exception handling transform to produce dead-letter-table records directly.
We will probably want all the bells and whistles @tabmatfournier implemented in his PR: #233
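For context, a minimal sketch of what such a transform could look like, assuming a Kafka Connect Transformation that catches failures and swaps in a dead-letter struct. The failed_record schema, field names, and class name are illustrative only, not the PR's actual code:

import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Sketch only: on failure, replace the record value with a dead-letter struct
// instead of throwing, so bad records flow to the dead-letter table downstream.
public class ExceptionHandlingTransformSketch implements Transformation<SinkRecord> {

  private static final Schema FAILED_SCHEMA = SchemaBuilder.struct()
      .name("failed_record")
      .field("topic", Schema.STRING_SCHEMA)
      .field("partition", Schema.INT32_SCHEMA)
      .field("offset", Schema.INT64_SCHEMA)
      .field("exception", Schema.STRING_SCHEMA)
      .build();

  @Override
  public SinkRecord apply(SinkRecord record) {
    try {
      // A real implementation would run the user-supplied SMT chain here.
      return record;
    } catch (Exception e) {
      Struct failed = new Struct(FAILED_SCHEMA)
          .put("topic", record.topic())
          .put("partition", record.kafkaPartition())
          .put("offset", record.kafkaOffset())
          .put("exception", e.toString());
      // Key is dropped; routing the struct to a dead-letter table is left
      // to the connector and is an assumption of this sketch.
      return record.newRecord(record.topic(), record.kafkaPartition(),
          null, null, FAILED_SCHEMA, failed, record.timestamp());
    }
  }

  @Override
  public ConfigDef config() {
    return new ConfigDef();
  }

  @Override
  public void configure(Map<String, ?> configs) {
  }

  @Override
  public void close() {
  }
}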
@tabmatfournier the other PR, just FYI
TBH you are still doing this, you're just changing the shape of what the special record is. The connector has to know about dead-letter-records-from-converter during failure in order to dig out the original bytes.
Not this simple. Headers are probably the best bet but the least nice to work with, which is why I went for the custom struct approach. Either way you are writing some custom code that knows how to dig this out of the record, and putting that into the connector. In my approach this was gated by a config option: not using the dead letter/error transform? Nothing goes looking for this (no runtime cost).
The connector code doesn't need to know.
Let's definitely not use schema parameters then lol (that's why I had it last in the list)
Yeah, absolutely, but I thought the idea with the ExceptionHandlingTransform was that it would be the one and only SMT involved (it can run user-supplied SMTs internally), so there would be no downstream SMTs. Even with your approach, if another SMT runs downstream of the ExceptionHandlingTransform, it could break things, no?
I would prefer key for that reason personally.
This approach should also have no runtime cost if you're not using a WriteExceptionHandler.
I'm hesitant to override the key. We are now providing a lot of customization downstream; people may be digging information out of the key. I don't think we can override/replace what is there.
It might be a little hard to see, but WriteExceptionHandler is all we really need to add to fit all the use cases. This PR should demonstrate how the most common use cases fit within the WriteExceptionHandler framework.
I want to write out the original byte array to a dead-letter table
In theory (i.e. NOT definitively proven yet) ...
For the use case of writing out the original Kafka key/value byte arrays, we have to do a little bit of work (see the sketch after this list):
- My idea here would be to write a Converter/SMT which preserves the original key/value byte arrays somewhere. I'm thinking in the SinkRecord's key, the SinkRecord's headers, or the SinkRecord's schema parameters.
- This SinkRecord is passed to the SinkTask.put method.
- When an exception is thrown, our WriteExceptionHandler.handle method would pull the key/value byte arrays out of the SinkRecord's parameters/headers/key and return a new SinkRecord with those values.
- The connector would then write that new SinkRecord to the dead-letter table.
Importantly, what I'm trying to avoid is having a special record structure that the connector code needs to understand in order to do the right thing. The ideal situation for me is that the connector is oblivious to "good" records versus "dead-letter-records-from-Converter/SMT-failures."
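To make the list above concrete, here is a rough sketch of the handler side. It assumes (this is not the PR's final API) a WriteExceptionHandler shaped like handle(record, exception), and a Converter/SMT that has already stashed the raw bytes in headers via record.headers().addBytes(...); the header names and class name are placeholders:

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.header.Header;
import org.apache.kafka.connect.sink.SinkRecord;

// Sketch only: digs the original key/value bytes out of (hypothetical) headers
// and returns a bytes-only record for the connector to write to the dead-letter table.
public class DeadLetterBytesHandler {

  // Hypothetical header names the preserving Converter/SMT would have used.
  static final String ORIGINAL_KEY_HEADER = "original.key.bytes";
  static final String ORIGINAL_VALUE_HEADER = "original.value.bytes";

  public SinkRecord handle(SinkRecord record, Exception ex) {
    byte[] originalKey = bytesHeader(record, ORIGINAL_KEY_HEADER);
    byte[] originalValue = bytesHeader(record, ORIGINAL_VALUE_HEADER);
    // Topic/partition are kept as-is; how the connector routes the returned
    // record to the dead-letter table is out of scope for this sketch.
    return record.newRecord(record.topic(), record.kafkaPartition(),
        Schema.OPTIONAL_BYTES_SCHEMA, originalKey,
        Schema.OPTIONAL_BYTES_SCHEMA, originalValue,
        record.timestamp());
  }

  private static byte[] bytesHeader(SinkRecord record, String name) {
    Header header = record.headers().lastWithName(name);
    return header == null ? null : (byte[]) header.value();
  }
}

Note that nothing here runs unless a handler is configured, which is how the no-runtime-cost property discussed above would hold.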
I want to write out the original JSON to the dead-letter-table
Same approach as above
I want to capture deserialization/SMT failures somewhere!
Two options:
I want to capture null records somewhere!
Two options:
I don't want to write bad records to Iceberg, I want to write to Kafka!
See KafkaDLQ for a sample implementation.
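For reference, a minimal sketch of what such a Kafka-bound handler might look like; the handle signature, topic name, configs, and serialization are assumptions for illustration, not the actual KafkaDLQ code:

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.connect.sink.SinkRecord;

// Sketch only: forwards failed records to a Kafka DLQ topic instead of Iceberg.
public class KafkaDlqHandlerSketch {

  private final KafkaProducer<byte[], byte[]> producer;
  private final String dlqTopic;

  public KafkaDlqHandlerSketch(String bootstrapServers, String dlqTopic) {
    Properties props = new Properties();
    props.put("bootstrap.servers", bootstrapServers);
    props.put("key.serializer", ByteArraySerializer.class.getName());
    props.put("value.serializer", ByteArraySerializer.class.getName());
    this.producer = new KafkaProducer<>(props);
    this.dlqTopic = dlqTopic;
  }

  // Returning null is assumed to tell the connector there is nothing
  // to write to Iceberg for this record.
  public SinkRecord handle(SinkRecord record, Exception ex) {
    // Crude serialization for illustration; a real handler would reuse
    // the preserved raw bytes discussed earlier in this thread.
    byte[] value = record.value() == null
        ? null
        : record.value().toString().getBytes(StandardCharsets.UTF_8);
    ProducerRecord<byte[], byte[]> dlqRecord = new ProducerRecord<>(dlqTopic, null, value);
    dlqRecord.headers().add("error", ex.toString().getBytes(StandardCharsets.UTF_8));
    producer.send(dlqRecord);
    return null;
  }

  public void close() {
    producer.close();
  }
}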