E2E constantly failing on kube 1.28 while resizing RBD PVC #4073

Closed
Madhu-1 opened this issue Aug 23, 2023 · 9 comments
Labels
bug (Something isn't working), component/rbd (Issues related to RBD), component/testing (Additional test cases or CI work), dependency/k8s (depends on Kubernetes features)

Comments

@Madhu-1
Collaborator

Madhu-1 commented Aug 23, 2023

https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/rest/organizations/jenkins/pipelines/mini-e2e_k8s-1.28/runs/16/nodes/194/steps/197/log/?start=0 is one instance of the failure.

nixpanic added the bug, component/rbd, dependency/k8s, and component/testing labels on Aug 24, 2023
@nixpanic
Member

At least #4030, #4063, and #4064 seem to be affected.

nixpanic changed the title from "E2E constantly failing on kube 1.28" to "E2E constantly failing on kube 1.28 while resizing RBD PVC" on Aug 24, 2023
@riya-singhal31
Contributor

#4058 is affected too.

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 24, 2023

Events:
  Type     Reason                  Age                 From                     Message
  ----     ------                  ----                ----                     -------
  Normal   Scheduled               30m                 default-scheduler        Successfully assigned rbd-2207/csi-rbd-demo-pod to minikube
  Normal   SuccessfulAttachVolume  30m                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-9f89c82b-f705-47cf-bb92-534f152767b6"
  Normal   Pulled                  30m                 kubelet                  Container image "docker.io/library/nginx:latest" already present on machine
  Normal   Created                 30m                 kubelet                  Created container web-server
  Normal   Started                 30m                 kubelet                  Started container web-server
  Warning  VolumeResizeFailed      48s (x22 over 29m)  kubelet                  NodeExpandVolume.NodeExpandVolume failed to resize volume for volume "pvc-9f89c82b-f705-47cf-bb92-534f152767b6" : volume resizing failed for unknown reason

I see the events above in the logs, but I am not able to reproduce the failure locally.

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 24, 2023

@riya-singhal31 are you able to reproduce this locally?

@nixpanic
Member

The following problem is listed in this CI job:

  I0824 09:12:00.225800       1 controller.go:255] Starting external resizer rbd.csi.ceph.com
  W0824 09:12:23.832840       1 warnings.go:70] unknown field "status.resizeStatus"
  I0824 09:12:23.833266       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rbd-4560", Name:"raw-block-pvc", UID:"efb590b1-42c5-42d4-b87a-be08e6e6a1be", APIVersion:"v1", ResourceVersion:"2311", FieldPath:""}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-efb590b1-42c5-42d4-b87a-be08e6e6a1be
  W0824 09:12:23.899938       1 warnings.go:70] unknown field "status.resizeStatus"
  I0824 09:12:23.900263       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rbd-4560", Name:"raw-block-pvc", UID:"efb590b1-42c5-42d4-b87a-be08e6e6a1be", APIVersion:"v1", ResourceVersion:"2311", FieldPath:""}): type: 'Normal' reason: 'VolumeResizeSuccessful' Resize volume succeeded
  W0824 09:13:06.088539       1 warnings.go:70] unknown field "status.resizeStatus"
  I0824 09:13:06.088842       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rbd-4560", Name:"rbd-pvc", UID:"1db73c11-99fc-4c24-86ef-1d7a01df4e58", APIVersion:"v1", ResourceVersion:"2452", FieldPath:""}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-1db73c11-99fc-4c24-86ef-1d7a01df4e58
  W0824 09:13:06.144136       1 warnings.go:70] unknown field "status.resizeStatus"
  I0824 09:13:06.144384       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rbd-4560", Name:"rbd-pvc", UID:"1db73c11-99fc-4c24-86ef-1d7a01df4e58", APIVersion:"v1", ResourceVersion:"2452", FieldPath:""}): type: 'Normal' reason: 'FileSystemResizeRequired' Require file system resize of volume on node
  W0824 09:13:06.151026       1 warnings.go:70] unknown field "status.resizeStatus"
  I0824 09:13:06.151383       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rbd-4560", Name:"rbd-pvc", UID:"1db73c11-99fc-4c24-86ef-1d7a01df4e58", APIVersion:"v1", ResourceVersion:"2457", FieldPath:""}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-1db73c11-99fc-4c24-86ef-1d7a01df4e58
  W0824 09:13:06.291292       1 warnings.go:70] unknown field "status.resizeStatus"

The resize is being retried with the same failure over and over again.

The unknown field "status.resizeStatus" warning is concerning. Maybe the csi-resizer sidecar can no longer report the right status to Kubernetes?
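
For anyone triaging similar resizer logs, here is a minimal sketch in Go that counts how often each rejected field name shows up in the warnings. It is not part of ceph-csi or external-resizer: the function name CountUnknownFields and the regular expression are assumptions made for illustration, and the pattern only covers the klog line shape seen in the excerpt above.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// unknownFieldRe is a hypothetical pattern for klog warning lines such as:
//   W0824 09:12:23.832840       1 warnings.go:70] unknown field "status.resizeStatus"
// It captures the quoted field name.
var unknownFieldRe = regexp.MustCompile(
	`^\s*W\d{4} \S+\s+\d+ warnings\.go:\d+\] unknown field "([^"]+)"`)

// CountUnknownFields returns, per field name, how many "unknown field"
// warnings appear in the given log text. Lines that do not match are ignored.
func CountUnknownFields(log string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		if m := unknownFieldRe.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Three lines copied from the CI log quoted in this issue.
	sample := `I0824 09:12:00.225800       1 controller.go:255] Starting external resizer rbd.csi.ceph.com
W0824 09:12:23.832840       1 warnings.go:70] unknown field "status.resizeStatus"
W0824 09:12:23.899938       1 warnings.go:70] unknown field "status.resizeStatus"`

	for field, n := range CountUnknownFields(sample) {
		fmt.Printf("%s: %d\n", field, n)
	}
}
```

Run against the three sample lines, this reports two warnings for status.resizeStatus. A count that keeps growing alongside the repeated 'Resizing' events would fit the retry loop described above, though it does not by itself confirm the cause.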

@nixpanic
Member

Might be related to kubernetes-csi/external-resizer#270 ?

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 24, 2023

@nixpanic I updated #4074 to use the resizer canary image; let's see how it goes.

@riya-singhal31
Contributor

> @riya-singhal31 are you able to reproduce this locally?

I was out for some work; I am trying it now.

@riya-singhal31
Contributor

> @nixpanic i updated #4074 to use resizer canary image, lets see how it goes

@Madhu-1 should we close this now?

Madhu-1 closed this as completed on Aug 28, 2023