Guarantee correct ordering across partitions #18

ari-e · 2023-10-30T18:07:25Z

Previously, because calls to RocksetSinkTask::put() only registered async calls to Rockset, certain combinations of configured parallelism, topic partition count, and retries could lead to events for a partition being delivered to Rockset's API out of order.

This fixes that and, as an added bonus, simplifies the connector logic and pushes the responsibility for retries up to Kafka Connect instead of handling them internally in the sink task.

Also took the opportunity to remove some deprecated configuration options and bump the version to a new major version, 2.0.0.

kwadhwa18 · 2023-10-30T20:42:34Z

src/main/java/rockset/RocksetSinkTask.java

+            futures.values().forEach(g -> g.cancel(true));
+            if (isRetriableException(e)) {
+              if (context != null) {
+                context.timeout(config.getRetryBackoffMs());


Should this be set once during initialization? I guess changing this config requires a restart anyway?

This has to be set every time since Kafka Connect resets it (otherwise you would just retry forever from this point onwards) https://github.com/apache/kafka/blob/9b468fb278701be836a2641650356907bf84860a/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L330-L334

ah got it - thanks for the code pointer!
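To illustrate the point above, here is a minimal, self-contained sketch of why the backoff has to be requested on every retriable failure. It does not use the Kafka Connect classes: `FakeContext`, `FakeRetriableException`, `FlakyTask`, and `runWorker` are hypothetical stand-ins, and the worker loop is a simplified reading of the linked WorkerSinkTask code (consume the requested timeout, then reset it to -1), not a copy of it.

```java
import java.util.ArrayList;
import java.util.List;

class RetryBackoffSketch {
  /** Hypothetical stand-in for SinkTaskContext; only the timeout is modeled. */
  static class FakeContext {
    private long timeoutMs = -1L;
    void timeout(long ms) { timeoutMs = ms; }
    long timeout() { return timeoutMs; }
  }

  /** Hypothetical stand-in for Kafka Connect's RetriableException. */
  static class FakeRetriableException extends RuntimeException {
    FakeRetriableException(String msg) { super(msg); }
  }

  /** Task whose put() fails a fixed number of times before succeeding. */
  static class FlakyTask {
    private final FakeContext context;
    private final long retryBackoffMs;
    private int failuresLeft;

    FlakyTask(FakeContext context, long retryBackoffMs, int failures) {
      this.context = context;
      this.retryBackoffMs = retryBackoffMs;
      this.failuresLeft = failures;
    }

    void put() {
      if (failuresLeft > 0) {
        failuresLeft--;
        // Re-arm the backoff on every failure: the worker clears it after use.
        context.timeout(retryBackoffMs);
        throw new FakeRetriableException("transient failure");
      }
    }
  }

  /** Simplified worker loop; returns the wait applied before each attempt. */
  static List<Long> runWorker(FakeContext context, FlakyTask task, long defaultPollMs) {
    List<Long> waits = new ArrayList<>();
    while (true) {
      long wait = defaultPollMs;
      long retryTimeout = context.timeout();
      if (retryTimeout > 0) {
        wait = Math.min(wait, retryTimeout);
        context.timeout(-1L); // the requested timeout is reset once consumed
      }
      waits.add(wait);
      try {
        task.put();
        return waits;
      } catch (FakeRetriableException e) {
        // The same batch is redelivered on the next iteration.
      }
    }
  }

  public static void main(String[] args) {
    FakeContext context = new FakeContext();
    FlakyTask task = new FlakyTask(context, 500L, 2);
    System.out.println(runWorker(context, task, 5_000L)); // prints [5000, 500, 500]
  }
}
```

In this sketch the first attempt uses the default poll wait, and each retry uses the backoff only because put() requested it again just before throwing.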

kwadhwa18 · 2023-10-30T20:45:37Z

src/main/java/rockset/RocksetSinkTask.java

+            f.get();
+          } catch (Exception e) {
+            // Stop all tasks still executing since this put() will be retried anyway
+            futures.values().forEach(g -> g.cancel(true));


For my own understanding: does cancelling cause other tasks to throw ConnectException even if the current future throws RetriableException? Does that stop more requests from being sent?

This cancels the thread, which will throw an InterruptedException, but that's not handled anywhere. We also don't process the futures for any of the other writes once we see that one of them has failed, so even if they did throw an exception it wouldn't matter. What we're doing here is propagating the first error that we see. So if there is a combination of RetriableExceptions and ConnectExceptions across all the futures, just the first one will win.

Can we add a unit test to cover such a scenario?
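A scenario like this could be exercised along the following lines. This is only a sketch of the "first error wins, cancel the rest" pattern from the snippet above, using plain java.util.concurrent; it is not the connector's code or its test. `FirstErrorWinsSketch`, `awaitAll`, `demo`, and the partition keys are hypothetical names, and the InterruptedException is caught here only so the demo can observe that the in-flight write was interrupted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class FirstErrorWinsSketch {
  /** Waits on each future in order; on the first failure cancels all and rethrows it. */
  static void awaitAll(Map<String, Future<?>> futures) throws InterruptedException {
    for (Future<?> f : futures.values()) {
      try {
        f.get();
      } catch (ExecutionException e) {
        // Stop everything still executing; later futures are never inspected.
        futures.values().forEach(g -> g.cancel(true));
        Throwable cause = e.getCause();
        throw (cause instanceof RuntimeException)
            ? (RuntimeException) cause
            : new RuntimeException(cause);
      }
    }
  }

  static String demo() throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CountDownLatch slowStarted = new CountDownLatch(1);
    CountDownLatch slowInterrupted = new CountDownLatch(1);
    try {
      Map<String, Future<?>> futures = new LinkedHashMap<>();
      futures.put("partition-0", pool.submit((Callable<Void>) () -> {
        slowStarted.await(); // make sure the other write is in flight first
        throw new IllegalStateException("write failed");
      }));
      futures.put("partition-1", pool.submit((Callable<Void>) () -> {
        slowStarted.countDown();
        try {
          new CountDownLatch(1).await(); // blocks until interrupted
        } catch (InterruptedException ie) {
          slowInterrupted.countDown(); // record that cancel(true) reached us
        }
        return null;
      }));
      try {
        awaitAll(futures);
        return "no error";
      } catch (RuntimeException e) {
        boolean interrupted = slowInterrupted.await(5, TimeUnit.SECONDS);
        return e.getMessage() + "; other interrupted=" + interrupted;
      }
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(demo()); // prints "write failed; other interrupted=true"
  }
}
```

The failing write is forced to wait until the slow write has started, so the cancellation always hits a running task and the outcome does not depend on thread scheduling.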

kwadhwa18

Looks great! A unit test for the failure scenario (if one doesn't already exist) would be good to add.

ari-e changed the base branch from master to ari_cleanup October 30, 2023 18:10

ari-e force-pushed the ari_cleanup branch from bac8756 to 6525cc2 Compare October 30, 2023 18:45

ari-e force-pushed the ari_partition-ordering branch 2 times, most recently from db5bafd to e0afd4c Compare October 30, 2023 18:56

ari-e changed the base branch from ari_cleanup to master October 30, 2023 19:01

ari-e force-pushed the ari_partition-ordering branch from e0afd4c to 542019f Compare October 30, 2023 19:52

kwadhwa18 reviewed Oct 30, 2023

ari-e requested a review from kwadhwa18 October 30, 2023 21:52

kwadhwa18 approved these changes Oct 30, 2023

ari-e force-pushed the ari_partition-ordering branch 8 times, most recently from e448d7e to 38bcb06 Compare October 31, 2023 16:43

Guarantee in order delivery for partitions

723bae7

ari-e force-pushed the ari_partition-ordering branch from 38bcb06 to 723bae7 Compare October 31, 2023 18:11

ari-e merged commit 24c1be0 into master Oct 31, 2023
1 check passed

ari-e deleted the ari_partition-ordering branch October 31, 2023 21:43
