Devhelp - LZ4 decoding errors on master #152
The version that is on master enables Kafka v0.11 APIs by default, which means that if the server supports the new API, it will use the new RecordBatch API/data structure. We want to release the new version this week, so I have to update the snappy library as well; I can take a look at LZ4. The new API is quite solid; we used to have a flag for it when it was experimental, and the flag will still be around: `new kafka({... allowExperimentalV011: false })`. Original issue: #52
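For reference, a minimal sketch of how the flag mentioned above would be passed to the client; the `clientId` and broker list are placeholders, not values from this thread:

```js
const { Kafka } = require('kafkajs')

const kafka = new Kafka({
  clientId: 'my-app',          // placeholder
  brokers: ['localhost:9092'], // placeholder
  // opt out of the v0.11 RecordBatch API while it is still being stabilized
  allowExperimentalV011: false,
})
```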
@tulios Cool, I'll follow changes on kafkajs-snappy and port them to kafkajs-lz4.
@paambaati I've tried it; I wrote this test on KafkaJS (https://gist.github.com/tulios/cfba2d2c3b8112bcfb98dcf12789cfde) and it works just fine.
Version
@tulios This is strange, but here's what works and what doesn't —
I'm connecting to a Kafka 1.1.0 cluster, and to a topic that has LZ4-compressed messages from a Scala-based producer.
Ok, I will try to reproduce your scenario and see if I can find something. What is interesting is that it's failing in the LZ4 library. I think the problem is similar to one I had with snappy in the past, where Kafka was using some sort of frame, so I had to glue the frames together before decompressing. My test messages are quite small; I will try larger payloads and see what happens. Thanks again and bear with me, we will fix this :D
@tulios Thanks! Meanwhile, I tried comparing logs from both runs, and the only difference I can notice from the logs is the
I believe Kafka v0.11 is using frames, so I will try to adapt the LZ4 library: https://github.com/lz4/lz4/blob/master/doc/lz4_Frame_format.md
This is probably why it works on v0.11 but not on v1.0.0 (I haven't tested on v1.0.0, but everything is pointing to it): https://cwiki.apache.org/confluence/display/KAFKA/KIP-57+-+Interoperable+LZ4+Framing
@tulios Does this mean we'll have to surface the protocol version to codecs, or should the |
I think you can support both formats; the lib can detect if the payload is a frame, similar to what I do for the snappy library:

```js
const isFrameFormat = buffer => buffer.slice(0, 8).equals(XERIAL_HEADER)

// ...
if (!isFrameFormat(buffer)) {
  return snappyDecompress(buffer)
}
// ...
```

https://github.com/tulios/kafkajs-snappy/blob/master/src/index.js

I will be able to give you some help soon, but feel free to start working on this.
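A possible analogue for the LZ4 codec (a sketch, not the published kafkajs-lz4 package): an LZ4 frame starts with the magic number 0x184D2204, stored little-endian, so the first four bytes identify a framed payload. It assumes the `lz4` npm package, and `rawBlockDecode` is a hypothetical fallback for unframed payloads.

```js
const LZ4 = require('lz4') // assumed dependency of the codec

// LZ4 frame magic number 0x184D2204, little-endian on the wire
const LZ4_FRAME_MAGIC = Buffer.from([0x04, 0x22, 0x4d, 0x18])

const isFrameFormat = buffer => buffer.slice(0, 4).equals(LZ4_FRAME_MAGIC)

const decompress = buffer =>
  isFrameFormat(buffer)
    ? LZ4.decode(buffer)     // framed payload, as produced by Kafka >= 0.10 per KIP-57
    : rawBlockDecode(buffer) // hypothetical fallback for unframed payloads
```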
@tulios I've been trying to get sample data to compare with both

Message Payload
Output

The output contains 2 files that are both raw binary dumps of the

The interesting thing here is that without the experimental protocol, the output is only (and always) 873 bytes. With it turned on, the size fluctuates a bit around 870 KB. Opening up the
You get the full record batch on v0.11, so it's expected.
@tulios Apologies for asking a lot of questions, but I noticed that
My assumption is that compression/decompression would be done for each record in the record batch; however, I notice that |
Don't worry about the questions, ask as many as you want/need. Give me some time to review this, but for the RecordBatch the records are all compressed together, instead of one by one.
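To illustrate the point above, a rough sketch (not KafkaJS internals; `codec.decompress` and `parseRecords` are stand-ins) of decompression happening once per batch rather than once per record:

```js
// the codec receives the whole records section of the batch as one buffer
const decodeRecordBatchRecords = async (codec, compressedRecords) => {
  // a single decompress call for the entire batch...
  const recordsBuffer = await codec.decompress(compressedRecords)
  // ...and the individual records are then decoded out of the decompressed buffer
  return parseRecords(recordsBuffer)
}
```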
@paambaati, I finally managed to reproduce the error. I had to use a Scala producer to get the

Just keeping you posted.
Good news: I fixed the problem. There was a bug in the RecordBatch decoder. I'll create a PR with the fix and ping you once it's merged.
I found the issue with #179: the tail of the record batches can have incomplete records due to how
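For context, a hypothetical sketch of the kind of guard this implies (not the actual fix in #179): in the Kafka protocol a record batch starts with an int64 baseOffset and an int32 batchLength, and the broker may truncate the last batch of a fetch response, so a decoder has to detect and skip the incomplete tail.

```js
const BATCH_HEADER_PREFIX = 12 // 8 bytes baseOffset + 4 bytes batchLength

// returns true only if a full record batch starts at `offset` in `buffer`
const hasCompleteBatch = (buffer, offset) => {
  const remaining = buffer.length - offset
  if (remaining < BATCH_HEADER_PREFIX) return false
  const batchLength = buffer.readInt32BE(offset + 8)
  return remaining >= BATCH_HEADER_PREFIX + batchLength
}
```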
On current master, after the recent set of changes, decoding LZ4 messages with the kafkajs-lz4 package seems to break, with this stack trace —

This does not occur on the last tagged release, so I'd appreciate it if you could clue me in on where this could possibly break.