[Feature][KafkaSource]Add customize the row separator. #4494
…f data, you can customize the line separator to split
Please approve the CI check.
@Hisoka-X @TyrantLucifer please approve the CI check.
@Hisoka-X @TyrantLucifer @hailin0 PTAL.
Please approve CI, thanks.
# Conflicts: # seatunnel-connectors-v2/connector-kafka/src/main/java/org/apache/seatunnel/connectors/seatunnel/kafka/source/KafkaSourceReader.java
…seatunnel into kafka-row-delimiter # Conflicts: # seatunnel-connectors-v2/connector-kafka/src/main/java/org/apache/seatunnel/connectors/seatunnel/kafka/source/KafkaSourceReader.java
Please approve CI.
@Hisoka-X @EricJoy2048 @TyrantLucifer please approve CI.
# Conflicts: # docs/en/connector-v2/source/kafka.md
Sorry for the late response. Let me check now!
@@ -150,8 +160,27 @@ public void pollNext(Collector<SeaTunnelRow> output) throws Exception {
        recordList) {

    try {
        deserializationSchema.deserialize(
                record.value(), output);
        if (StringUtils.isBlank(rowDelimiter)) {
If a JSON message contains the row_delimiter value, such as \n:
{"key":"value",
"key2":"value2"
}
the split will produce broken fragments that cannot be parsed as JSON strings.
So I believe this feature only works correctly when the format is text.
Why not put this feature into TextDeserializationSchema? That way other connectors can get this feature too. cc @TyrantLucifer
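A minimal sketch of the failure mode described in this comment, assuming the row delimiter is a newline. The class and method names are hypothetical and are not part of SeaTunnel, and `looksLikeJsonObject` is only a crude stand-in for real JSON parsing:

```java
// Hypothetical demo class, not SeaTunnel code.
final class JsonSplitPitfall {

    // Crude stand-in for JSON validation (assumption: a real check would use a JSON parser).
    static boolean looksLikeJsonObject(String s) {
        String t = s.trim();
        return t.startsWith("{") && t.endsWith("}");
    }

    // Split a message value on newlines, as a row delimiter of "\n" would.
    static String[] splitOnNewline(String message) {
        return message.split("\n");
    }

    public static void main(String[] args) {
        // The pretty-printed JSON message from the review comment.
        String message = "{\"key\":\"value\",\n\"key2\":\"value2\"\n}";

        // The whole message is one well-formed object.
        System.out.println("whole message: " + looksLikeJsonObject(message));

        // After splitting, no single fragment is a complete object.
        for (String fragment : splitOnNewline(message)) {
            System.out.println(looksLikeJsonObject(fragment) + " <- " + fragment);
        }
    }
}
```

With this input the split yields three fragments, and none of them passes the check on its own, which is why the split is only safe when the payload is guaranteed not to contain the delimiter.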
You are right. The premise is that the message content cannot contain newline symbols; otherwise it will be parsed incorrectly. In fact, the same problem occurs with the text format.
I also believe it's best to add this feature to TextDeserializationSchema.
Purpose of this pull request
Add a customizable row separator: if a message contains multiple rows of data, you can set a custom line separator to split it into rows.
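The intended behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `emitRows` and the `Consumer<String>` output are hypothetical stand-ins for the reader's per-record step and its collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical demo class, not SeaTunnel code.
final class RowDelimiterSplitSketch {

    // Emit one row per delimited segment, or the whole value if no delimiter is set.
    static void emitRows(String value, String rowDelimiter, Consumer<String> output) {
        // Empty check, not a blank check: a whitespace delimiter such as "\n" must count as set.
        if (rowDelimiter == null || rowDelimiter.isEmpty()) {
            output.accept(value);
            return;
        }
        // Pattern.quote makes the delimiter literal, so "|" or "." are not read as regex syntax.
        for (String row : value.split(Pattern.quote(rowDelimiter))) {
            if (!row.isEmpty()) {
                output.accept(row);
            }
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        emitRows("a,1\nb,2\nc,3", "\n", rows::add);
        System.out.println(rows);
    }
}
```

The sketch checks for an empty delimiter rather than a blank one, since a whitespace-only delimiter such as `"\n"` would otherwise be treated as unset.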
Config:
Test screenshot