Unexpected volume behaviors #11
Please check the logs from the provisioner. Assuming you found errors, we should try and figure out why the bricks are offline. @kshlm may be able to help here.
@JohnStrunk yes, I am able to see the below error repeatedly in the csi-provisioner logs:

E0925 12:45:35.500551 1 controller.go:1174] Error scheduling operaion "delete-pvc-0f903cb5bfe111e8[1ce7fda0-bfe1-11e8-aa4f-525400018951]": Failed to create operation with name "delete-pvc-0f903cb5bfe111e8[1ce7fda0-bfe1-11e8-aa4f-525400018951]". An operation with that name failed at 2018-09-25 12:43:35.586449494 +0000 UTC m=+10564.643238377. No retries permitted until 2018-09-25 12:45:37.586449494 +0000 UTC m=+10686.643238377 (2m2s). Last error: "rpc error: code = Internal desc = failed to stop volume node 00b3b28e-f945-4f54-8b28-5d3af879716c is probably down".

After some time it started giving a different error repeatedly:

0925 12:46:05.501060 1 controller.go:685] Exceeded failedDeleteThreshold threshold: 15, for volume "pvc-0f903cb5bfe111e8", provisioner will not attempt retries for this volume

From the Gluster provisioner logs I am able to see the below error:

E0925 12:28:36.557480 1 utils.go:100] GRPC error: rpc error: code = Internal desc = failed to stop volume node 00b3b28e-f945-4f54-8b28-5d3af879716c is probably down

From the above errors, what I have observed is that it is trying to stop the volume on this container, 00b3b28e-f945-4f54-8b28-5d3af879716c, which has already been deleted.
@rmadaka can you paste the …? I feel the glusterd2 container got restarted somewhere after PVC creation.
I think that, due to the container restart, the newly created gd2 container gets a new IP and adds itself to the peer list with the new IP address, while the keep-alive of the previously running container expires and that node is marked as down.
Due to the gd2 container restart, the host IP got changed, as you can compare.
I had assumed we were still using …. So now we need to figure out a solution to the changing IP addresses. I'm wondering, if we fix the UUID problem in #10, whether gd2 will update the IP addresses. @Madhu-1 any idea?
Cluster operations may work if the IP changes and the UUID remains the same. But Glusterd2 should regenerate the client volfiles (and also the cluster volfiles) and notify clients about the IP change.
@amarts @atinmu will the client reconfigure? Does the client understand …?
Yes, it would allow the clients to get the new option, and reconfigure! Good to test it before saying it is ready!
But what happens for already mounted volumes? [Old volumes will be mounted with the old IPs, which have already changed and are not reachable.] A possible temporary solution to fix this problem is …
Continuing from my comment in #10, we can easily get the peer-id persisted across restarts of the gd2 pods. So we'd need to ensure that we correctly regenerate volfiles and notify clients of the changes.

I'm thinking of an alternate approach to possibly solve this. The problems being faced now exist because GD2 is using the IP addresses of the pods, and because they can change. If instead we could set up fixed hostnames, and have GD2 use the hostnames instead, we could avoid this problem. But this would require our deployment strategy to change.

Our current GD2 deployment uses DaemonSets (on all nodes currently, but we could easily select nodes on which to run when required). DaemonSets ensure that a configured pod is running on each selected node. But we do not have persistent hostnames, as the pods are launched with a different name on each restart. We also can't set up Services to point to individual DaemonSet pods, as the selector would match any DaemonSet pod.

What if, instead of a DaemonSet, we deploy an individual ReplicaSet/Deployment/StatefulSet for GD2 for each GCS node? The ReplicaSet/Deployment/StatefulSet can be pinned to a specific node, to run just one copy of the GD2 pod. A Service can be created for each ReplicaSet/Deployment/StatefulSet, so we get a consistent hostname which would not change on pod restarts. GD2 can be configured to use this hostname, and along with persistence of the peer-id, we'd technically not need to do anything in GD2 to handle restarts.

We could pretty easily do this with the current Ansible based deployment. We'd need to create a kube manifest that would create the aforementioned Service and ReplicaSet/Deployment/StatefulSet, which can be converted into a template that Ansible could deploy for specific nodes.

Also, a note to everyone: the current Ansible based deployment isn't our end goal. The end goal for GCS is to use the Anthill operator to deploy GD2 and the CSI drivers. This Ansible deployment helps us try out different deployment strategies, and will help us make the right choices for Anthill.
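For illustration, a minimal sketch of the per-node StatefulSet + Service approach described above, assuming a node named kube1 and a namespace gcs. All names, labels, ports, the image, and the environment variable here are assumptions for illustration, not taken from the actual GCS manifests:

```yaml
# Headless Service giving the gd2 pod on node "kube1" a stable DNS name,
# e.g. gluster-kube1-0.gluster-kube1.gcs.svc.cluster.local
apiVersion: v1
kind: Service
metadata:
  name: gluster-kube1
  namespace: gcs
spec:
  clusterIP: None
  selector:
    app: glusterd2
    gcs-node: kube1
  ports:
  - name: gd2-mgmt
    port: 24007
---
# Single-replica StatefulSet pinned to node "kube1" via nodeSelector
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gluster-kube1
  namespace: gcs
spec:
  serviceName: gluster-kube1
  replicas: 1
  selector:
    matchLabels:
      app: glusterd2
      gcs-node: kube1
  template:
    metadata:
      labels:
        app: glusterd2
        gcs-node: kube1
    spec:
      nodeSelector:
        kubernetes.io/hostname: kube1
      containers:
      - name: glusterd2
        image: gluster/glusterd2-nightly   # illustrative image name
        env:
        # Hypothetical setting shown only to mark the intent: make gd2
        # listen/advertise on the stable Service hostname instead of the
        # pod IP. The actual gd2 option/env-var name may differ.
        - name: GD2_PEERADDRESS
          value: "gluster-kube1-0.gluster-kube1.gcs.svc.cluster.local"
```

With one such Service + StatefulSet pair per GCS node, pod restarts keep the same DNS name, so peers and clients never see a changed address.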
Got it. Yeah, it is a problem. But the client can reconnect to another glusterd if backup volfile servers are configured. So reconfigure may work unless all glusterd2 IPs have changed.
I think this can be handled as long as the CSI driver uses the …. It sounds like the CSI change + ensuring the volfile gets regenerated would make us robust to pod IP changes. Correct?
@kshlm
Using single-replica StatefulSets for each glusterd2 pod instead of a single DaemonSet, allows setting up of and use of pre-known hostnames as the listen address for glusterd2. The StatefulSets are pinned to individual nodes.
Also, the glusterd2 pods are now deployed with 'emptyDir' volumes for /var/lib/glusterd2 which allows persistence of peerid.
With the above 2 changes, glusterd2 pods survive pod restarts.
Fixes gluster#10, gluster#11
Signed-off-by: Kaushal M <kshlmster@gmail.com>
Using single-replica StatefulSets for each glusterd2 pod instead of a single DaemonSet, allows setting up of and use of pre-known hostnames as the listen address for glusterd2. The StatefulSets are pinned to individual nodes.
Also, the glusterd2 pods are now deployed with 'hostPath' volumes for /var/lib/glusterd2 which allows persistence of peerid.
With the above 2 changes, glusterd2 pods survive pod restarts.
Fixes gluster#10, gluster#11
Signed-off-by: Kaushal M <kshlmster@gmail.com>
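A hedged sketch of how the 'hostPath' persistence of /var/lib/glusterd2 mentioned in the commit message could look inside the StatefulSet pod template; the volume name, image, and hostPath type are assumptions, not taken from the actual manifest:

```yaml
# Pod template fragment: persist /var/lib/glusterd2 (and with it the peerid)
# across pod restarts by mounting the directory from the host.
spec:
  containers:
  - name: glusterd2
    image: gluster/glusterd2-nightly   # illustrative image name
    volumeMounts:
    - name: glusterd2-state
      mountPath: /var/lib/glusterd2
  volumes:
  - name: glusterd2-state
    hostPath:
      path: /var/lib/glusterd2
      type: DirectoryOrCreate
```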
@rmadaka this issue is fixed in the latest build. Can you verify this one?
Tested the scenario with the latest build.
After around 10 minutes we are not able to use this setup because of this issue: #15
I think this issue has been solved, with the remaining etcd problem tracked in #15. Please re-open if I am mistaken.
@JohnStrunk @Madhu-1 Once a PVC is created, if I reboot any of the gd2 pods, the rebooted gd2 pod's brick status goes to the offline state. Is it because of issue #15? The rebooted gd2 pod's bricks are not coming back online even when the etcd pods are in the running state.
@rmadaka the glusterd2 issue for this one is closed now. Can you verify this one?
-> Created two PVCs (PVC1, PVC2); a sample PVC manifest is sketched after this list.
-> Mounted the two PVCs to one app pod and ran I/O on the mount points.
-> Again mounted the above two PVCs to 3 replication controller app pods and ran I/O on both mount points.
-> Deleted one replication controller app pod; the rc app pod came up automatically with the same mount points and no data loss was found.
-> Then tried to delete PVC1, which was in the mounted state; PVC1's status went to the Terminating state.
-> Now deleted all app pods; then PVC1 was deleted successfully.
-> After that, one of my worker nodes went into a bad state (don't know the reason), and all the pods placed on that worker node went into the below state.
-> Then I rebooted the worker node; now the node is up in proper condition, and all the pods placed on this worker node came back to the Running state.
-> Logged in to one of the gd2 pods and verified the existing volume status; all bricks are in the offline state.
-> Now verified the PVC status.
-> Then deleted the PVC successfully.
-> Verified the PV status; the PV was not deleted.
-> Again logged in to the gd2 container to verify whether the volume exists or not.
-> The volume is listed and the volume state is STARTED, like below.
-> Now verified the volume status; the volume status shows the bricks offline.
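For reference, a minimal PVC manifest of the kind created in the first step above; the StorageClass name glusterfs-csi and the requested size are assumptions, not taken from the actual test setup:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: glusterfs-csi   # assumed StorageClass backed by the GlusterFS CSI driver
```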
Here I am observing two things: