[fix] Combination of autocreate + forced delete of partitioned topic with active consumer leaves topic metadata inconsistent. #17308
Good catch.
I left a comment about the error code. I think we should return something more user-friendly, because an Internal Error suggests that the server is broken.
I have another thought, but I am not sure how we can fix it:
How can we help a user who doesn't control the client applications but needs to delete a partitioned topic?
Maybe this is not a big deal, because if you want control over the topics you can turn off automatic creation at the namespace level.
try {
    admin.topics().deletePartitionedTopic(topic, true);
    fail("expected error because partitioned topic has active producer");
} catch (PulsarAdminException.ServerSideErrorException e) {
ServerSideError is not very user friendly.
I would expect 'Conflict' or the same thing that happens when you do not use the 'force' option
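To make the distinction concrete, here is a minimal, self-contained sketch of how a busy-topic failure could be surfaced as 409 Conflict instead of 500. Everything in it is an assumption for illustration: `ErrorMappingSketch`, `toHttpStatus`, and the local `TopicBusyException` stand-in are hypothetical and do not reflect how Pulsar's REST layer is actually written.

```java
import java.util.concurrent.CompletionException;

// Hypothetical sketch: none of these types are Pulsar's real classes.
class ErrorMappingSketch {

    // Local stand-in for the broker's "topic is busy" failure.
    static class TopicBusyException extends Exception {
        TopicBusyException(String message) {
            super(message);
        }
    }

    // Unwraps async wrappers, then picks an HTTP status for the failure.
    static int toHttpStatus(Throwable error) {
        Throwable cause = error;
        while (cause instanceof CompletionException && cause.getCause() != null) {
            cause = cause.getCause();
        }
        if (cause instanceof TopicBusyException) {
            return 409; // Conflict: the request is valid, the topic state blocks it
        }
        return 500; // Internal Server Error: anything unexpected
    }

    public static void main(String[] args) {
        Throwable busy = new CompletionException(
                new TopicBusyException("Partitioned topic has active consumers or producers"));
        System.out.println(toHttpStatus(busy));
        System.out.println(toHttpStatus(new IllegalStateException("unexpected")));
    }
}
```

With a mapping of this kind, the test above would catch a conflict-style exception instead of ServerSideErrorException.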
@eolivelli I return the same TopicBusyException as in the other cases where topic deletion is not possible.
If it needs to be translated into some other error for the Pulsar admin/REST API, I suggest we create a bug and resolve this in a different PR.
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java
Lines 1129 to 1148 in 34c2262
if (isClosingOrDeleting) {
    log.warn("[{}] Topic is already being closed or deleted", topic);
    return FutureUtil.failedFuture(new TopicFencedException("Topic is already fenced"));
} else if (failIfHasSubscriptions && !subscriptions.isEmpty()) {
    return FutureUtil.failedFuture(
            new TopicBusyException("Topic has subscriptions: " + subscriptions.keys()));
} else if (failIfHasBacklogs && hasBacklogs()) {
    List<String> backlogSubs =
            subscriptions.values().stream()
                    .filter(sub -> sub.getNumberOfEntriesInBacklog(false) > 0)
                    .map(PersistentSubscription::getName).toList();
    return FutureUtil.failedFuture(
            new TopicBusyException("Topic has subscriptions did not catch up: " + backlogSubs));
} else if (TopicName.get(topic).isPartitioned()
        && (getProducers().size() > 0 || getNumberOfConsumers() > 0)
        && getBrokerService().isAllowAutoTopicCreation(topic)) {
    // to avoid inconsistent metadata as a result
    return FutureUtil.failedFuture(
            new TopicBusyException("Partitioned topic has active consumers or producers and "
                    + "auto-creation of topic is allowed"));
LGTM
(As discussed offline) As follow-up work, we should disable automatic topic creation before deleting a namespace; this way users will be able to delete namespaces and tenants more easily.
@@ -1139,6 +1139,13 @@ private CompletableFuture<Void> delete(boolean failIfHasSubscriptions,
                        .map(PersistentSubscription::getName).toList();
        return FutureUtil.failedFuture(
                new TopicBusyException("Topic has subscriptions did not catch up: " + backlogSubs));
    } else if (TopicName.get(topic).isPartitioned()
            && (getProducers().size() > 0 || getNumberOfConsumers() > 0)
            && getBrokerService().isAllowAutoTopicCreation(topic)) {
This PR cannot solve every scenario.
Suppose the delete-topic command is executed between two client steps: consumer lookup and consumer subscribe. The partitioned topic is deleted successfully, but the client already holds the topic metadata (which has since been deleted), so the consumer subscribes with the topic name "topic-partition-x". You can reproduce it like this:
1. create partitioned topic "tp_test"
2. consumer lookup
3. delete topic
4. consumer subscribe
5. "tp_test-partition-x" is created
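The five steps above can be walked through with a small in-memory model. This is only a sketch under stated assumptions: `LookupDeleteRaceSketch` and all of its fields and methods are hypothetical, it does not call the Pulsar client or broker, and it assumes auto-creation is enabled and that the topic has 2 partitions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of the race; none of this is Pulsar code.
class LookupDeleteRaceSketch {
    // Partitioned-topic metadata: topic name -> partition count.
    final Map<String, Integer> partitionedMetadata = new HashMap<>();
    // Individual partition topics that currently exist.
    final Set<String> partitionTopics = new TreeSet<>();
    // Assumption: auto-creation is enabled for the namespace.
    final boolean allowAutoTopicCreation = true;

    void createPartitionedTopic(String topic, int partitions) {
        partitionedMetadata.put(topic, partitions);
        for (int i = 0; i < partitions; i++) {
            partitionTopics.add(topic + "-partition-" + i);
        }
    }

    // The client caches the partition count it sees at lookup time.
    int lookup(String topic) {
        return partitionedMetadata.getOrDefault(topic, 0);
    }

    // Removes the metadata and every partition topic.
    void deletePartitionedTopic(String topic) {
        Integer partitions = partitionedMetadata.remove(topic);
        if (partitions == null) {
            return;
        }
        for (int i = 0; i < partitions; i++) {
            partitionTopics.remove(topic + "-partition-" + i);
        }
    }

    // Subscribing to a missing partition auto-creates it when allowed.
    void subscribe(String partitionTopic) {
        if (allowAutoTopicCreation) {
            partitionTopics.add(partitionTopic);
        }
    }

    // Runs steps 1-5 and returns the resulting model.
    static LookupDeleteRaceSketch reproduce() {
        LookupDeleteRaceSketch broker = new LookupDeleteRaceSketch();
        broker.createPartitionedTopic("tp_test", 2);   // 1. create partitioned topic
        int cached = broker.lookup("tp_test");         // 2. consumer lookup
        broker.deletePartitionedTopic("tp_test");      // 3. delete topic
        for (int i = 0; i < cached; i++) {             // 4. consumer subscribe
            broker.subscribe("tp_test-partition-" + i);
        }
        return broker;                                 // 5. partition topics exist again
    }

    public static void main(String[] args) {
        LookupDeleteRaceSketch broker = reproduce();
        System.out.println("metadata present: " + broker.partitionedMetadata.containsKey("tp_test"));
        System.out.println("partition topics: " + broker.partitionTopics);
    }
}
```

In this model the run ends with partition topics present but no partitioned-topic metadata, which is the inconsistent state under discussion.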
@poorbarcode this is a good catch indeed! Thinking more about this case, the "problem" is related to the sequence of things that happen in the two operations:
We must prevent these two things from being performed concurrently. Ideally the two operations should be performed in reverse order:
By the way, I think that this patch mitigates most of the use cases that happen in the real world (and we saw a lot of them in some production clusters after upgrading to 2.10).
I also think this change mitigates the problems we've encountered in prod; before implementing this I experimented with a different approach (as mentioned in the description of the PR).
…with active consumer leaves topic metadata inconsistent. (apache#17308) (cherry picked from commit 9529850)
…with active consumer leaves topic metadata inconsistent. (apache#17308)
Hi @eolivelli @dlg99, I think we should revert this PR. System topics are always created automatically, and we should consider other ways to solve this problem.
@poorbarcode you are right, I think that we must revert this patch.
…d topic with active consumer leaves topic metadata inconsistent. (apache#17308)" This reverts commit 9529850.
…d topic with active consumer leaves topic metadata inconsistent. (apache#17308)" This reverts commit 9524418.
Motivation
Forced deletion of a partitioned topic with an active consumer, in a namespace where topic auto-creation is enabled, leaves the namespace in a state where one cannot create a partitioned topic with the same name (because it already exists) and cannot delete it (because, at the same time, it does not exist).
Modifications
Don't allow deletion in this case until all consumers/producers have disconnected.
I experimented with the option of not allowing auto-creation after deletion of partitioned topics, but that runs into the reasons that led to #14920, plus tricky corner cases between metadata updates.
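The rule described in Modifications can be read as a single predicate. The sketch below is a simplified, self-contained restatement of the condition added to PersistentTopic.java; the class `DeleteGuardSketch` and the method `isDeletionBlocked` are hypothetical names, and the plain parameters stand in for the broker state that the real code queries.

```java
// Hypothetical restatement of the guard; not the broker's actual code.
class DeleteGuardSketch {

    // Deletion is refused only when all three conditions hold at once:
    // the topic is partitioned, a client is still connected, and
    // auto-creation could bring the topic back right after the delete.
    static boolean isDeletionBlocked(boolean isPartitioned,
                                     int producers,
                                     int consumers,
                                     boolean allowAutoTopicCreation) {
        boolean hasActiveClients = producers > 0 || consumers > 0;
        return isPartitioned && hasActiveClients && allowAutoTopicCreation;
    }

    public static void main(String[] args) {
        // Active consumer with auto-creation enabled: blocked.
        System.out.println(isDeletionBlocked(true, 0, 1, true));
        // Same clients, auto-creation disabled: allowed.
        System.out.println(isDeletionBlocked(true, 0, 1, false));
        // No clients connected: allowed.
        System.out.println(isDeletionBlocked(true, 0, 0, true));
    }
}
```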
Verifying this change
This change added tests
Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes: nothing AFAICT
Documentation
Check the box below or label this PR directly.
Need to update docs?
doc-required (Your PR needs to update docs and you will update later)
doc-not-needed (Bug fix)
doc (Your PR contains doc changes)
doc-complete (Docs have been already added)