[Regression] Decommission is broken #542
Comments
the PVC gets deleted as well so there is no way to recover from this state. |
@keith-mcclellan so we keep the PVCs, and we delete them only if the decommission command is successful?
|
Decommission should run as follows functionally:
Decommission tests:
Positive case 1 -
Positive case 2 -
Negative case 1 -
Negative case 2 -
|
Is the command run on the node itself?
This should be a unit test, I think we just need to test the
This should also be a unit test |
You can, but it's got an interface so you can run it from anywhere that you have the database binary. |
This is an example of a decommissioned node that is ready to be stopped - |
@udnay Please document what the correct workflow is for decommission. We are getting differing opinions. |
Weighing in on behalf of @udnay and at the request of @alinadonisa. The logic in
From what I can tell, that logic simply isn't running to completion, or isn't running at all. Does anyone have the logs of a failed decommission available? The decommissioner is very verbose; it should be pretty easy to tell where something is going wrong based on the logs.
The PVC pruner will only remove the volumes of pods that are not currently running and have an ordinal less than the number of desired replicas. It sounds like something is changing the desired number of replicas outside of the call to
It does everything that Keith has suggested, sans the under-replicated system check, but that could easily be plugged into |
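The pruning rule described in that comment reduces to a small predicate. A sketch with hypothetical names (this is not the operator's actual pruner code, just the rule as stated above):

```go
package main

import "fmt"

// shouldPruneVolume encodes the rule described in the thread: a PVC is
// only a candidate for removal when its pod is not currently running
// and the pod's ordinal is below the desired replica count.
// (Hypothetical helper, not the operator's actual code.)
func shouldPruneVolume(podRunning bool, ordinal, desiredReplicas int32) bool {
	return !podRunning && ordinal < desiredReplicas
}

func main() {
	// A running pod's volume is never pruned, regardless of ordinal.
	fmt.Println(shouldPruneVolume(true, 3, 3)) // false

	// The failure mode discussed in this thread: if another actor lowers
	// desiredReplicas and stops the pod before decommission completes,
	// the volume becomes prunable and the data is destroyed.
	fmt.Println(shouldPruneVolume(false, 2, 3)) // true
}
```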
Looking at the logs attached I see
I see a failure in the decommission code, maybe due to port forwarding or something, because the operator isn't running in-cluster. Then we start to run the
What does the reconciler do? Can it be shutting down the extra pod after the decommission failed? |
After this the logs show decommission failing because not all replicas are up; I believe @chrisseto is probably correct, or at least onto something. |
Seems like the error handling for failed decommissioning is busted? Though I'm not sure why the decommission command would fail... |
I'm not questioning that the cc drainer works properly; I'm questioning whether we implemented it properly. Something is stopping the pod before the decommission is complete... see
I think the problem is the opposite of what @chrisseto describes: I think another actor is running, and that other actor is stopping the pod because it sees nodes = 3 in the CR instead of nodes = 4 and thinks it should shut it down. But because decommission is running, the decommission fails and never gets started back up, which leaves the node showing up as failed. If I'm reading this right, it's this that's stopping the pod that we're waiting to decommission:
Decommission should be a blocking operation, i.e. the operator should not do any other work until the decommission is complete. And if the decommission fails, we shouldn't allow the PVC pruner to run. |
@udnay this needs manual testing, but removing release blocker |
Any updates on this? I'm still facing the same issue in v22.1.2
@prafull01 @himanshu-cockroach can one of you take a look? |
I will take a look |
The statefulset is stopping the pod before cockroach node decommission is being executed, so the node is showing up as failed instead of as decommissioned. The PVC also gets deleted, so there is no way to recover from this state if the scale-down caused a loss of quorum, because the data gets destroyed.
Steps to reproduce:
create a 4-node cluster
change the node count to 3 nodes
decommission-regression.log