test: Support producing kafka records with timestamp. #7372

Closed
liurenjie1024 opened this issue Jan 13, 2023 · 4 comments · Fixed by #7699

@liurenjie1024

> > Generally LGTM, please add e2e tests to this.
>
> There is a problem: how do we add an e2e test for this? We can't infer the _rw_kafka_timestamp.

Yes, it's a little difficult. We can attach a timestamp to each Kafka record, but as far as I know, no existing tool supports this. We need to use the Kafka producer API for this.

https://kafka.apache.org/23/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
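
For illustration, attaching an explicit timestamp through a producer API might look like the sketch below. It is written in Python rather than against the Java API linked above, and it assumes a producer object whose send method accepts a timestamp_ms keyword, as kafka-python's KafkaProducer.send does. The topic name, the helper produce_with_timestamp, and the RecordingProducer stand-in are hypothetical and exist only so the sketch runs without a broker.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def produce_with_timestamp(producer, topic: str, value: bytes, when: datetime):
    """Send one record carrying an explicit timestamp (hypothetical helper).

    `when` must be timezone-aware; subtracting EPOCH from a naive datetime
    raises TypeError.
    """
    # Kafka record timestamps are milliseconds since the Unix epoch.
    ts_ms = (when - EPOCH) // timedelta(milliseconds=1)
    return producer.send(topic, value=value, timestamp_ms=ts_ms)


class RecordingProducer:
    """Stand-in for a real producer: it only records what would be sent."""

    def __init__(self):
        self.sent = []

    def send(self, topic, value=None, timestamp_ms=None):
        self.sent.append((topic, value, timestamp_ms))


producer = RecordingProducer()
produce_with_timestamp(
    producer,
    "test_topic",  # hypothetical topic name
    b'{"v1":1,"v2":"name0"}',
    datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
)
```

With a real client, the stand-in would be replaced by an actual producer connected to the test broker; the rest of the call stays the same.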

Originally posted by @liurenjie1024 in #7275 (comment)

@liurenjie1024

cc @tabVersion

@ZENOTME

ZENOTME commented Jan 14, 2023

I think we can do this in simulation.

In simulation, we read records from the test_data file and send them to the Kafka producer.
https://github.com/risingwavelabs/risingwave/blob/main/src/tests/simulation/src/kafka.rs

So we can:

  1. Modify the test_data file format to contain a timestamp for each record, e.g.

     Original test_data format:
     {"v1":1,"v2":"name0"}

     New test_data format:
     {"v1":1,"v2":"name0"},timestamp:2022-01-01 12:00:00
  2. Parse the timestamp and construct the record with it.
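
A minimal sketch of the parsing step for the proposed format, in Python for brevity (the simulation code itself is Rust). The function name parse_test_data_line is hypothetical, and the sketch assumes the timestamp text is in UTC, which the proposal above does not specify. Lines without the trailing marker are treated as the original format.

```python
from datetime import datetime, timedelta, timezone

MARKER = ",timestamp:"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_test_data_line(line: str):
    """Split a test_data line into (payload, timestamp_ms).

    timestamp_ms is None when the line uses the original format.
    """
    line = line.rstrip("\n")
    # Split on the last marker so commas inside the JSON payload are untouched.
    payload, sep, ts_text = line.rpartition(MARKER)
    if not sep:
        return line, None
    # Assumption: the timestamp is UTC; the proposal names no time zone.
    when = datetime.strptime(ts_text.strip(), "%Y-%m-%d %H:%M:%S")
    when = when.replace(tzinfo=timezone.utc)
    return payload, (when - EPOCH) // timedelta(milliseconds=1)


old_format = parse_test_data_line('{"v1":1,"v2":"name0"}')
new_format = parse_test_data_line(
    '{"v1":1,"v2":"name0"},timestamp:2022-01-01 12:00:00'
)
```

The returned payload can be sent as the record value and the millisecond value as the record timestamp.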

@liurenjie1024

> In simulation, we read records from the test_data file and send them to the Kafka producer.

On second thought, I still prefer to do it with a normal Kafka producer, so that we can test more cases in the normal e2e tests.

@liurenjie1024

liurenjie1024 commented Jan 30, 2023

After discussion, we don't need to verify the exact Kafka timestamp; instead we can apply some function or comparison to it, for example:

select _rw_kafka_timestamp > 10000 from xxx

cc @ZENOTME

This way we don't need to modify the current test process.
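
One way to see why such a loose bound is enough: Kafka record timestamps are milliseconds since the Unix epoch, and a producer that is given no explicit timestamp stamps the record with its current time. Any record produced by the existing test tooling therefore carries a value far above 10000. The Python sketch below only illustrates the magnitude. How _rw_kafka_timestamp is typed and compared inside RisingWave is not shown in this thread, and the helper name is hypothetical.

```python
import time

THRESHOLD = 10000  # the bound used in the example query


def default_record_timestamp_ms() -> int:
    """Mimic a producer stamping a record with its current wall-clock time."""
    return int(time.time() * 1000)


ts = default_record_timestamp_ms()
# 10000 ms is only ten seconds after the epoch, so any current time exceeds it.
passes = ts > THRESHOLD
```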
