Avoid sinking a row if this new row is the same as the old row #10362
Comments
It's worth noting that not all records with Thus to handle these general cases, I guess we have to at least buffer records in a single chunk, i.e., do a "compaction" before sinking them to external systems. BTW, I'm currently evaluating whether the
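A minimal sketch of the per-chunk compaction idea mentioned in this comment, not RisingWave's actual implementation. The change representation `(op, old_row, new_row)`, the function name `compact_chunk`, and the `key_of` callback are all hypothetical, chosen only for illustration. It buffers one chunk, keeps the first old row and the last new row per key, and drops keys whose net change is empty.

```python
from typing import Callable, Hashable, List, Optional, Tuple

Row = Tuple
# Hypothetical change record: (op, old_row, new_row).
# old_row is None for inserts, new_row is None for deletes.
Change = Tuple[str, Optional[Row], Optional[Row]]


def compact_chunk(chunk: List[Change], key_of: Callable[[Row], Hashable]) -> List[Change]:
    """Collapse all changes to the same key within one chunk into a net change."""
    first_old = {}  # key -> row as it was before the chunk (None if absent)
    last_new = {}   # key -> row as it is after the chunk (None if deleted)
    order = []      # keys in order of first appearance

    for _op, old, new in chunk:
        key = key_of(new if new is not None else old)
        if key not in first_old:
            first_old[key] = old
            order.append(key)
        last_new[key] = new

    out: List[Change] = []
    for key in order:
        old, new = first_old[key], last_new[key]
        if old == new:
            continue  # net no-op: nothing to sink for this key
        if old is None:
            out.append(("insert", None, new))
        elif new is None:
            out.append(("delete", old, None))
        else:
            out.append(("update", old, new))
    return out
```

Note that this only removes redundancy within a single chunk; a no-op update that arrives alone in a later chunk is still dropped because its old and new rows are equal, but changes that cancel out across chunks are not detected.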
Where I mean "if the user believes the output is append-only but our optimizer does not find this due to the lack of some features, we can still allow the sink to run". Apparently, the stream is not append-only in the given case of this issue, so I agree that we should find another clear way to provide correct semantics of altering, possibly by exposing the raw change-log (with the
I think "force-append-only" covers many of the features users have requested until we implement "get the changelog from a table". We can rethink it after we introduce that.
Just adding some more information extracted from the conversation in Slack:
output multiple times even after count(*) > 1 already and keep increasing, as
output only once even after count(*) > 1 already and keep increasing, as
output only once when
This issue has been open for 60 days with no activity. If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the You can also confidently close this issue as not planned to keep our backlog clean.
Take an example:
Right now the output in Kafka is:
But the user wants:
The argument is that in the alerting use case, the following two events provide no information and make further processing even harder: we need to identify that these three `true` events are the same, as we only want to alert the user once.
In general, if the entire row is the same as the old row (this is an update op), we expect to sink only once.
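As a rough illustration of the requested behavior (not the actual sink code), the filter below skips any update whose new row equals its old row. The change tuples `(op, old_row, new_row)`, the function name `drop_noop_updates`, and the sample rows are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

Row = Tuple
# Hypothetical change record: (op, old_row, new_row).
Change = Tuple[str, Optional[Row], Optional[Row]]


def drop_noop_updates(changes: Iterable[Change]) -> Iterator[Change]:
    """Yield every change except updates that leave the row unchanged."""
    for op, old, new in changes:
        if op == "update" and old == new:
            continue  # identical old and new row: nothing new to tell the sink
        yield (op, old, new)


# Hypothetical (user, status) rows: status flips to True once, then the
# upstream count keeps growing without changing the visible columns.
changes = [
    ("insert", None, ("alice", False)),
    ("update", ("alice", False), ("alice", True)),
    ("update", ("alice", True), ("alice", True)),
    ("update", ("alice", True), ("alice", True)),
]
sunk = list(drop_noop_updates(changes))
```

With these sample changes, only the insert and the first update reach the sink, so the downstream consumer sees a single `True` event.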
Is it possible that there exists a case that the user does want to have multiple same events sunk? I don't know, but I think this can always be achieved by adding the column that is indeed changing, by using the case above:
create sink sk1 as select user, count(*), (case when count(*) > 1 then true else false end) as status from t group by user
we add another `count(*)` column.
Tag this issue as high-priority, as it has already been requested by two users and makes a lot of sense for alerting use cases, which RisingWave is good at.