[Feature]: Controller Design #29

Open
1 task done
iGxnon opened this issue Aug 24, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@iGxnon
Collaborator

iGxnon commented Aug 24, 2023

Description about the feature

Background

In operator-k8s, we need to develop a controller that reconciles XlineCluster resources. The current implementation coordinates the xline pods through StatefulSet, a built-in k8s controller. StatefulSet manages the creation, deletion, and rebuilding of pods, allocates stable network identifiers to them, and binds persistent volumes to them.

Given the upcoming version updates, it might be necessary to make certain changes to the design of the Controller.

Issue

In most cases, StatefulSet as the controller for stateful services meets the requirements effectively.
However, when it comes to stateful services like xline that rely on relationships between nodes, certain behaviors of StatefulSet might not be the most optimal.

For instance, when scaling down a cluster, StatefulSet deletes pods starting from the one with the highest ordinal, without taking into account whether the leader is among the nodes being removed. In this scenario, removing only non-leader nodes is better for the cluster's availability. (This issue can be alleviated through the leadership transfer extension (Section 3.10), but not completely resolved.)
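
To make the problem concrete, here is a minimal sketch of the highest-ordinal-first ordering described above. The function names, and the idea of identifying the leader by its pod ordinal, are assumptions made for illustration; they are not part of the StatefulSet API or of the operator.

```rust
/// Ordinals that a StatefulSet-style controller removes when scaling from
/// `current` replicas down to `target`, in deletion order (highest first).
fn ordinals_removed(current: u32, target: u32) -> Vec<u32> {
    // An empty range (target >= current) yields no deletions.
    (target..current).rev().collect()
}

/// True if the pod hosting the leader (hypothetically tracked by ordinal)
/// is among the pods removed by the scale-down.
fn leader_is_removed(current: u32, target: u32, leader: u32) -> bool {
    ordinals_removed(current, target).contains(&leader)
}
```

With 5 replicas scaled to 3 and the leader on ordinal 4, `leader_is_removed(5, 3, 4)` is true and the leader is the first pod deleted, so the cluster has to elect a new leader in the middle of the scale-down.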

[Image: Screenshot 2023-08-24 at 18 49 26]

Solution

etcd-operator implemented a dedicated controller that maintains certain states in memory. This approach does not appear robust enough at the moment, because the state vanishes once the operator crashes.

Creating a controller similar to StatefulSet is quite challenging. Fortunately, there is a well-developed solution, AdvancedStatefulSet, that addresses the issues mentioned earlier.

Details

Similar to the risingwave-operator, we offer two StatefulSet implementations. One is the built-in StatefulSet, and the other is the AdvancedStatefulSet by OpenKruise.

Given that AdvancedStatefulSet is compatible with the fields of StatefulSet, some code can be reused (the construction of components in StatefulSet).
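
The reuse could look roughly like the sketch below, where one builder produces the shared part and only the type header depends on the backend. `Backend`, `CommonSpec`, and `describe` are hypothetical names, and the `apps.kruise.io/v1beta1` group/version is an assumption based on the OpenKruise documentation; check it against the Kruise version actually deployed.

```rust
/// Which workload controller backs an XlineCluster (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    StatefulSet,
    AdvancedStatefulSet,
}

/// Fields that both workload kinds share in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct CommonSpec {
    name: String,
    replicas: u32,
}

impl Backend {
    /// apiVersion/kind pair written into the manifest header.
    /// The Kruise group/version is an assumption, see the text above.
    fn api_version_and_kind(self) -> (&'static str, &'static str) {
        match self {
            Backend::StatefulSet => ("apps/v1", "StatefulSet"),
            Backend::AdvancedStatefulSet => ("apps.kruise.io/v1beta1", "StatefulSet"),
        }
    }
}

/// Render a manifest skeleton; everything after the header is shared code.
fn describe(backend: Backend, spec: &CommonSpec) -> String {
    let (api_version, kind) = backend.api_version_and_kind();
    format!(
        "apiVersion: {}\nkind: {}\nmetadata:\n  name: {}\nspec:\n  replicas: {}\n",
        api_version, kind, spec.name, spec.replicas
    )
}
```

Calling `describe` with either backend yields the same `metadata` and `spec` sections, which is the part of the construction code that would not need to be duplicated.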

Scale Up

The operator sends a scale request to the StatefulSet. Before starting a newly created node, its sidecar adds the node to the existing cluster through a membership change. (If it is the first initialized node, no membership change is needed.)
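
The sidecar's decision at startup can be sketched as follows. `Step` and `startup_plan` are hypothetical names, and skipping the membership change for a node that is already listed as a member (for example after a restart) is an added assumption that the description above does not state.

```rust
/// Steps a sidecar takes before its node joins the cluster (illustrative).
#[derive(Debug, PartialEq)]
enum Step {
    /// Ask the existing cluster to add this node via a membership change.
    AddMember(String),
    /// Start the local xline node.
    StartNode,
}

/// `existing_members` is the membership currently reported by the cluster.
fn startup_plan(node: &str, existing_members: &[&str]) -> Vec<Step> {
    let mut plan = Vec::new();
    // The first initialized node has no cluster to join. Assumption: a node
    // that is already a member does not need a membership change either.
    let needs_join = !existing_members.is_empty() && !existing_members.contains(&node);
    if needs_join {
        plan.push(Step::AddMember(node.to_string()));
    }
    // The node is started only after the membership change was requested.
    plan.push(Step::StartNode);
    plan
}
```

For a fourth node joining a three-node cluster, the plan is `AddMember` followed by `StartNode`; for the very first node it is `StartNode` alone.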

Scale Down

The sidecar's heartbeat needs to mark which node is the leader. If AdvancedStatefulSet is used, the operator retains the leader node and deletes the required number of follower nodes. If StatefulSet is used, deletion starts from the pod with the highest ordinal.
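
A minimal sketch of the leader-aware choice, assuming the operator already knows the leader's ordinal from the heartbeat. `pick_followers_to_delete` is a hypothetical helper; how the selected pods are handed to AdvancedStatefulSet is deliberately left out.

```rust
/// Pick up to `count` pods to delete without ever choosing the leader.
/// Among followers, the highest ordinals go first, mirroring StatefulSet.
fn pick_followers_to_delete(pods: &[u32], leader: u32, count: usize) -> Vec<u32> {
    let mut followers: Vec<u32> = pods.iter().copied().filter(|&p| p != leader).collect();
    followers.sort_unstable_by(|a, b| b.cmp(a)); // highest ordinal first
    followers.truncate(count); // fewer than `count` if followers run out
    followers
}
```

With pods 0 to 4 and the leader on ordinal 4, removing two pods selects ordinals 3 and 2, so the leader stays in place. If more deletions are requested than there are followers, the result is simply shorter and the leader is still retained.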

When a pod is deleted, a termination signal is sent to the sidecar inside the pod. The sidecar needs to capture this signal and perform cleanup: it sends a membership change request to the cluster to remove its own node, after which the pod is deleted by the controller.
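
The cleanup step might be structured like this. The `Membership` trait, `on_terminate`, and `FakeCluster` are hypothetical, the bounded retry is an assumption not stated above, and capturing the signal itself is omitted because it needs OS-specific support outside the standard library.

```rust
/// Minimal view of the cluster's membership API (hypothetical trait).
trait Membership {
    fn remove_member(&mut self, node: &str) -> Result<(), String>;
}

/// Cleanup run by the sidecar after it has observed the termination signal.
/// Returns true if the node was removed from the cluster before exiting.
fn on_terminate<M: Membership>(cluster: &mut M, node: &str, max_attempts: u32) -> bool {
    for _ in 0..max_attempts {
        if cluster.remove_member(node).is_ok() {
            return true;
        }
    }
    // Giving up here leaves a stale member behind; the pod is deleted anyway.
    false
}

/// In-memory stand-in used to exercise the sketch.
struct FakeCluster {
    members: Vec<String>,
    failures_left: u32,
}

impl Membership for FakeCluster {
    fn remove_member(&mut self, node: &str) -> Result<(), String> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            return Err("cluster unreachable".to_string());
        }
        self.members.retain(|m| m.as_str() != node);
        Ok(())
    }
}
```

Keeping the membership call behind a trait lets the shutdown path be tested without a running cluster. The `false` return value is the case in which a stale member remains in the cluster.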

Membership Clean Task

If a sidecar crashes during termination or is cut off by network issues, the cluster should still proactively remove its node.

Therefore, we should introduce a membership clean task in each sidecar to handle this. We will discuss the finer details of this design in the upcoming PR.
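
Since the details are deferred, the following is only one possible shape for the detection half of such a task: flag members whose last heartbeat is older than a timeout. The `stale_members` helper, the heartbeat bookkeeping, and the timeout are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Members whose last heartbeat is older than `timeout`.
/// `last_seen` maps a node name to the time of its last heartbeat,
/// measured from the same epoch as `now`.
fn stale_members(
    last_seen: &HashMap<String, Duration>,
    now: Duration,
    timeout: Duration,
) -> Vec<String> {
    let mut stale: Vec<String> = last_seen
        .iter()
        // saturating_sub avoids a panic if a heartbeat is newer than `now`.
        .filter(|(_, seen)| now.saturating_sub(**seen) > timeout)
        .map(|(name, _)| name.clone())
        .collect();
    stale.sort(); // deterministic order for logging and tests
    stale
}
```

Each name returned would then be passed to a membership change request that removes the node, the same request a healthy sidecar sends for itself on termination.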

@iGxnon iGxnon added the enhancement New feature or request label Aug 24, 2023
@liangyuanpeng
Contributor

Here is the client-rust for OpenKruise: https://github.com/openkruise/client-rust. v0.1.0 will be released soon and works with v1.0.0 of the Kruise API.

@iGxnon
Collaborator Author

iGxnon commented Sep 15, 2023

Here is the client-rust for OpenKruise: https://github.com/openkruise/client-rust. v0.1.0 will be released soon and works with v1.0.0 of the Kruise API.

Thx, That would be awesome!
