Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Partitioner.RequiresConsistency vary per-message #1112

Merged
merged 1 commit into from
Jun 14, 2018

Conversation

evan-stripe
Copy link
Contributor

This is a little bit speculative, but it was about as easy to write the patch as it was to write the proposal, so this seemed easier. I don't have any particular commitment to any aspect of the implementation, so I'm interested in hearing feedback.


One of the useful features of the random partitioner is that it will only partition messages onto partitions that are currently online. In theory, hash partitioner is capable of exhibiting the same behavior for messages where the key is unset, but it can't because RequiresConsistency is a constant for the partitioner. It's also not feasible to add methods to the Partitioner interface, since that's part of the public API.

Therefore, this introduces a new interface, ExtendedPartitioner, which includes an additional method: MessageRequiresConsistency. This lets the partitioner make a per-message determination for whether consistency is required (and therefore whether it's acceptable to publish to partitions which are offline). The producer library will use this method if it's available, and fall back on the existing RequiresConsistency method otherwise.

Since hash partitioners are the only ones which have different consistency requirements per-message, for now they are the only partitioner to implement MessageRequiresConsistency.

@eapache
Copy link
Contributor

eapache commented Jun 14, 2018

🤔 other than the name ExtendedPartitioner (I'd prefer something more descriptive) there isn't a lot here I can disagree with. I guess I'm surprised you run into unavailable partitions often enough that this feature is useful?

This may be overkill, but while I'm thinking about partitioners from this angle, maybe the HashPartitioner should just take a fallback Partitioner rather than defaulting to random, in case somebody wanted it to fall back to round-robin? But if nobody's asked for it in this long it's probably not worth building.

@evan-stripe
Copy link
Contributor Author

I'm happy to change the name, although I'm having a hard time coming up with anything that is both descriptive and reasonably concise - maybe VariableConsistencyPartitioner?

In terms of running into this problem, we did some pretty extensive gamedaying during our initial Kafka rollout, and tripped over https://issues.apache.org/jira/browse/KAFKA-4358. At the time, we switched to the RandomPartitioner to avoid that bug, but we're finding that limiting, because we can't do partitioned publishes.

@eapache
Copy link
Contributor

eapache commented Jun 14, 2018

DynamicConsistencyPartitioner?

One of the useful features of the random partitioner is that it will
only partition messages onto partitions that are currently online. In
theory, hash partitioner is capable of exhibiting the same behavior for
messages where the key is unset, but it can't because
RequiresConsistency is a constant for the partitioner. It's also not
feasible to add methods to the Partitioner interface, since that's part
of the public API.

Therefore, this introduces a new interface, ExtendedPartitioner, which
includes an additional method: MessageRequiresConsistency. This lets the
partitioner make a per-message determination for whether consistency is
required (and therefore whether it's acceptable to publish to partitions
which are offline). The producer library will use this method if it's
available, and fall back on the existing RequiresConsistency method
otherwise.

Since hash partitioners are the only ones which have different
consistency requirements per-message, for now they are the only
partitioner to implement MessageRequiresConsistency.
@evan-stripe evan-stripe force-pushed the extended-partitioner branch from 9250df4 to 9b2c4d6 Compare June 14, 2018 16:57
@evan-stripe
Copy link
Contributor Author

Sure, I'm fine with that - just pushed a rename.

@eapache
Copy link
Contributor

eapache commented Jun 14, 2018

Thanks!

@eapache eapache merged commit bc4e274 into IBM:master Jun 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants