rebase: update ceph to reef #4030

Merged
merged 6 commits into from
Aug 25, 2023

Conversation

Madhu-1
Collaborator

@Madhu-1 Madhu-1 commented Aug 2, 2023

Testing Reef to see whether everything works.

@Madhu-1 Madhu-1 added the DNM (DO NOT MERGE) and component/testing (Additional test cases or CI work) labels Aug 2, 2023
@mergify mergify bot added the rebase (update the version of an external component) label Aug 2, 2023
@Madhu-1
Collaborator Author

Madhu-1 commented Aug 2, 2023

/test ci/centos/mini-e2e/k8s-1.27

@nixpanic
Member

nixpanic commented Aug 3, 2023

/test ci/centos/mini-e2e/k8s-1.27

@nixpanic
Member

nixpanic commented Aug 3, 2023

The CI job tries to pull this:

Trying to pull registry-ceph-csi.apps.ocp.cloud.ci.centos.org/ceph/ceph:v18...

The mirror contains this image:

going to copy docker://quay.io/ceph/ceph:v18 to docker://registry-ceph-csi.apps.ocp.cloud.ci.centos.org/quay.io/ceph/ceph:v18
Getting image source signatures
Copying blob sha256:66286c37fba202f94836d18ed3ea1efa02816e6db632ec78297328694bed0b94
Copying blob sha256:a20960ea1b02448618d7cdee5e657e6b83404689ca4ce4555251aeada11b5a7a
Copying config sha256:ff9c8d7de4ca37ca64a34818fa8d6e7b4e6be435e4f8f54ed2e00827f8fa3d31
Writing manifest to image destination
Storing signatures

The mirror contains quay.io in the image name, but it seems to have been stripped by the CI job.

PR #4031 should probably not include the quay.io/ part in the 2nd column with "short names".
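
To make the mismatch concrete, here is a minimal sketch of the two naming schemes. The helper names are hypothetical, and the assumption that the CI job strips the registry host before pulling is inferred only from the two log excerpts above; this is not the actual mirroring script.

```python
# Mirror registry host, copied from the CI log above.
MIRROR = "registry-ceph-csi.apps.ocp.cloud.ci.centos.org"


def strip_registry_host(image: str) -> str:
    """Drop a leading registry host such as 'quay.io' from an image reference.

    The first path component is treated as a host when it contains '.' or ':'
    or equals 'localhost' (the usual container image naming heuristic).
    """
    first, sep, rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return rest
    return image


def mirror_destination(image: str, keep_host: bool) -> str:
    """Where the mirroring job stores the image (hypothetical helper)."""
    name = image if keep_host else strip_registry_host(image)
    return f"{MIRROR}/{name}"


def ci_pull_reference(image: str) -> str:
    """What the CI job pulls, assuming it uses the short name."""
    return f"{MIRROR}/{strip_registry_host(image)}"
```

With `keep_host=True` the destination contains `quay.io/` and no longer matches what the CI job pulls; with `keep_host=False` the two references are identical.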

nixpanic added a commit to nixpanic/ceph-csi that referenced this pull request Aug 3, 2023
Updates: ceph#4030
Signed-off-by: Niels de Vos <ndevos@ibm.com>
nixpanic added a commit to nixpanic/ceph-csi that referenced this pull request Aug 3, 2023
CI jobs pull the ceph/ceph:v18 image (without `quay.io`). If the
mirroring includes the registry host, the image can not be found.

Updates: ceph#4030
Signed-off-by: Niels de Vos <ndevos@ibm.com>
@nixpanic
Member

nixpanic commented Aug 3, 2023

#4032 has been created. The mirroring runs nightly, so after the PR is merged, a new /test ... can be run to see whether the Ceph Reef image can be used in the CI job.

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 4, 2023

/test ci/centos/mini-e2e/k8s-1.27

mergify bot pushed a commit that referenced this pull request Aug 4, 2023
CI jobs pull the ceph/ceph:v18 image (without `quay.io`). If the
mirroring includes the registry host, the image can not be found.

Updates: #4030
Signed-off-by: Niels de Vos <ndevos@ibm.com>
@Madhu-1
Collaborator Author

Madhu-1 commented Aug 7, 2023

/test ci/centos/mini-e2e/k8s-1.27

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 7, 2023

  Aug  7 07:49:45.556: INFO: ExecWithOptions: execute(POST https://192.168.49.2:8443/api/v1/namespaces/rook-ceph/pods/rook-ceph-tools-757999d6c7-c4v5d/exec?command=%2Fbin%2Fsh&command=-c&command=ceph+osd+pool+create+.nfs+128&container=rook-ceph-tools&container=rook-ceph-tools&stderr=true&stdout=true)

  Aug  7 07:49:46.063: INFO: stdErr occurred: pool names beginning with . are not allowed



  Aug  7 07:49:46.068: INFO: ExecWithOptions {Command:[/bin/sh -c ceph osd pool set .nfs size 1 --yes-i-really-mean-it] Namespace:rook-ceph PodName:rook-ceph-tools-757999d6c7-c4v5d ContainerName:rook-ceph-tools Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:true Quiet:false}

  Aug  7 07:49:46.068: INFO: >>> kubeConfig: /root/.kube/config

  Aug  7 07:49:46.068: INFO: ExecWithOptions: Clientset creation

  Aug  7 07:49:46.068: INFO: ExecWithOptions: execute(POST https://192.168.49.2:8443/api/v1/namespaces/rook-ceph/pods/rook-ceph-tools-757999d6c7-c4v5d/exec?command=%2Fbin%2Fsh&command=-c&command=ceph+osd+pool+set+.nfs+size+1+--yes-i-really-mean-it&container=rook-ceph-tools&container=rook-ceph-tools&stderr=true&stdout=true)

  Aug  7 07:49:46.572: INFO: failed to execute command: command terminated with exit code 2

  Aug  7 07:49:46.572: INFO: stdErr occurred: Error ENOENT: unrecognized pool '.nfs'

@nixpanic NFS pool creation is failing; has anything changed in the Ceph Reef version?

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 7, 2023

https://docs.ceph.com/en/reef/rados/operations/pools/#pool-names mentions that pool names beginning with . are reserved and that we should not use them.

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 7, 2023

/test ci/centos/mini-e2e/k8s-1.27

@nixpanic
Member

nixpanic commented Aug 7, 2023

> @nixpanic NFS pool creation is failing; has anything changed in the Ceph Reef version?

It seems like Reef has some changes with respect to pool creation... Rook usually creates the .nfs pool, I think. But it requires redundancy. I do not know how Rook creates the .nfs pool in Reef, maybe we need to pass an extra option to the ceph osd command?

https://github.com/rook/rook/blob/7aaf46e11243b21df4d4f088fc80b949019559b0/pkg/operator/ceph/nfs/nfs.go#L319

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 7, 2023

> > @nixpanic NFS pool creation is failing; has anything changed in the Ceph Reef version?
>
> It seems like Reef has some changes with respect to pool creation... Rook usually creates the .nfs pool, I think. But it requires redundancy. I do not know how Rook creates the .nfs pool in Reef, maybe we need to pass an extra option to the ceph osd command?
>
> https://github.com/rook/rook/blob/7aaf46e11243b21df4d4f088fc80b949019559b0/pkg/operator/ceph/nfs/nfs.go#L319

@nixpanic Yes, you are correct: we were missing an extra flag to create the pool in the cephcsi e2e. I have added the missing flag now.
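
For readers following along, a rough sketch of the idea, not the actual e2e change. The thread does not name the flag that was added; the sketch assumes it is the same `--yes-i-really-mean-it` confirmation already visible on the `ceph osd pool set` call in the log above. The function names are hypothetical.

```python
import shlex

# Assumption: the confirmation flag is borrowed from the `ceph osd pool set`
# command in the CI log; the PR discussion does not name the flag it added.
CONFIRM_FLAG = "--yes-i-really-mean-it"


def is_reserved_pool_name(name: str) -> bool:
    """Reef rejects pool names beginning with '.' unless explicitly confirmed."""
    return name.startswith(".")


def pool_create_command(name: str, pg_count: int) -> str:
    """Build the `ceph osd pool create` command line for the toolbox pod."""
    args = ["ceph", "osd", "pool", "create", name, str(pg_count)]
    if is_reserved_pool_name(name):
        # Only reserved names need the extra confirmation.
        args.append(CONFIRM_FLAG)
    return shlex.join(args)
```

Ordinary pool names produce the same command as before, so only the `.nfs` case changes.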

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 7, 2023

/test ci/centos/mini-e2e/k8s-1.27

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 8, 2023

#7 45.98   Running scriptlet: nfs-ganesha-5.5-1.el8s.x86_64                        11/12 
#7 45.99 error: nfs-ganesha-selinux-5.4-1.el8s.noarch: erase skipped
#7 45.99 /var/tmp/rpm-tmp.FjPzqW: line 2: /etc/selinux/config: No such file or directory
#7 45.99 warning: %posttrans(nfs-ganesha-5.5-1.el8s.x86_64) scriptlet failed, exit status 1
#7 45.99 
#7 45.99 Error in POSTTRANS scriptlet in rpm package nfs-ganesha
#7 45.99   Running scriptlet: nfs-ganesha-5.4-1.el8s.x86_64                        11/12 
#7 46.11   Verifying        : nfs-ganesha-5.5-1.el8s.x86_64                         1/12 
#7 46.11   Verifying        : nfs-ganesha-5.4-1.el8s.x86_64                         2/12 
#7 46.11   Verifying        : nfs-ganesha-ceph-5.5-1.el8s.x86_64                    3/12 
#7 46.11   Verifying        : nfs-ganesha-ceph-5.4-1.el8s.x86_64                    4/12 
#7 46.11   Verifying        : nfs-ganesha-rados-grace-5.5-1.el8s.x86_64             5/12 
#7 46.11   Verifying        : nfs-ganesha-rados-grace-5.4-1.el8s.x86_64             6/12 
#7 46.11   Verifying        : nfs-ganesha-rados-urls-5.5-1.el8s.x86_64              7/12 
#7 46.11   Verifying        : nfs-ganesha-rados-urls-5.4-1.el8s.x86_64              8/12 
#7 46.11   Verifying        : nfs-ganesha-rgw-5.5-1.el8s.x86_64                     9/12 
#7 46.11   Verifying        : nfs-ganesha-rgw-5.4-1.el8s.x86_64                    10/12 
#7 46.11   Verifying        : nfs-ganesha-selinux-5.5-1.el8s.noarch                11/12 
#7 46.11   Verifying        : nfs-ganesha-selinux-5.4-1.el8s.noarch                12/12Error: Transaction failed
#7 46.51  
#7 46.51 
#7 46.51 Upgraded:
#7 46.51   nfs-ganesha-5.5-1.el8s.x86_64                                                 
#7 46.51   nfs-ganesha-ceph-5.5-1.el8s.x86_64                                            
#7 46.51   nfs-ganesha-rados-grace-5.5-1.el8s.x86_64                                     
#7 46.51   nfs-ganesha-rados-urls-5.5-1.el8s.x86_64                                      
#7 46.51   nfs-ganesha-rgw-5.5-1.el8s.x86_64                                             
#7 46.51 Failed:
#7 46.51   nfs-ganesha-selinux-5.4-1.el8s.noarch  nfs-ganesha-selinux-5.5-1.el8s.noarch 
#7 46.51 
#7 ERROR: process "/bin/sh -c dnf -y update        && dnf clean all        && rm -rf /var/cache/yum" did not complete successfully: exit code: 1
------
 > [updated_base 3/3] RUN dnf -y update        && dnf clean all        && rm -rf /var/cache/yum:
46.51 
46.51 Upgraded:
46.51   nfs-ganesha-5.5-1.el8s.x86_64                                                 
46.51   nfs-ganesha-ceph-5.5-1.el8s.x86_64                                            
46.51   nfs-ganesha-rados-grace-5.5-1.el8s.x86_64                                     
46.51   nfs-ganesha-rados-urls-5.5-1.el8s.x86_64                                      
46.51   nfs-ganesha-rgw-5.5-1.el8s.x86_64                                             
46.51 Failed:
46.51   nfs-ganesha-selinux-5.4-1.el8s.noarch  nfs-ganesha-selinux-5.5-1.el8s.noarch 
46.51 

It looks like the ceph v18 image is broken today: the cephcsi image can no longer be built as it could yesterday 🤕
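
When triaging a build log like the one above, the failing packages can be pulled out of the `Failed:` section mechanically. The following is a small sketch with a hypothetical helper; it assumes the buildkit line prefix looks like `#7 46.51 ` (or `46.51 ` in the summary), as in the log pasted here.

```python
import re
from typing import List

# Matches the buildkit step/timestamp prefix, e.g. "#7 46.51 " or "46.51 ".
PREFIX = re.compile(r"^(#\d+ )?\d+\.\d+ ?")


def failed_packages(log: str) -> List[str]:
    """Return the package names listed under the first 'Failed:' section."""
    failed: List[str] = []
    in_section = False
    for raw in log.splitlines():
        line = PREFIX.sub("", raw).strip()
        if line == "Failed:":
            in_section = True
        elif in_section:
            if not line or line.endswith(":"):
                break  # a blank line or the next heading ends the section
            failed.extend(line.split())
    return failed
```

On the log above this yields the two `nfs-ganesha-selinux` packages, which matches the scriptlet error about the missing `/etc/selinux/config`.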

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 15, 2023

/test ci/centos/mini-e2e/k8s-1.27

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 21, 2023

/test ci/centos/mini-e2e/k8s-1.27

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 21, 2023

/test ci/centos/mini-e2e/k8s-1.27

@Madhu-1 Madhu-1 requested review from a team August 21, 2023 13:23
@nixpanic
Member

Both CI jobs failed with:

  [FAILED] failed to resize filesystem PVC: context deadline exceeded
  In [It] at: /go/src/github.com/ceph/ceph-csi/e2e/rbd.go:852 @ 08/23/23 08:11:07.519

This isn't a test that I have seen sporadically fail, so it may point to a change related to the Ceph version.

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 23, 2023

> Both CI jobs failed with:
>
>   [FAILED] failed to resize filesystem PVC: context deadline exceeded
>   In [It] at: /go/src/github.com/ceph/ceph-csi/e2e/rbd.go:852 @ 08/23/23 08:11:07.519
>
> This isn't a test that I have seen sporadically fail, so it may point to a change related to the Ceph version.

CI passed on 1.25, 1.26, and 1.27; I am not sure whether this has something to do with the Ceph version. Let me check on that one.

@nixpanic
Member

> CI passed on 1.25, 1.26, and 1.27; I am not sure whether this has something to do with the Ceph version. Let me check on that one.

Other PRs seem to fail in the same way with RBD resizing on Kubernetes 1.28. It is not only this PR 😢

@nixpanic
Member

@Mergifyio rebase

Updating ceph base image to Reef as it is
the latest ceph release.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
update Rook to the latest v1.12 release.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
In recent ceph versions, .nfs pool creation
is failing. As we are sure about creating the
pools in the e2e tests, try to create the pool
with the required extra arguments to make it successful.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
disable ganesha as dnf update is
failing on Reef ceph version.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
As we don't need to enable the nfs modules
from ceph v16.2.8 onwards, skip this step.

Because of this one we have a regression
in nfs export

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
specify from what ceph versions we
need to run nfs service commands and
when to skip it.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
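
The version gating described in the last two commit messages can be sketched as follows. This is an illustration only, with hypothetical function names; the v16.2.8 threshold comes from the commit message above, and the example version string format is an assumption.

```python
import re
from typing import Tuple

# Threshold taken from the commit message: from v16.2.8 onwards the nfs
# modules no longer need to be enabled explicitly.
NFS_MODULE_AUTO_FROM = (16, 2, 8)


def parse_ceph_version(text: str) -> Tuple[int, ...]:
    """Extract the first 'major.minor.patch' triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def needs_nfs_module_enable(version_text: str) -> bool:
    """True only for releases older than the threshold."""
    # Tuples compare element by element, so (16, 2, 7) < (16, 2, 8) < (18, 2, 0).
    return parse_ceph_version(version_text) < NFS_MODULE_AUTO_FROM
```

Comparing integer tuples rather than strings avoids mis-ordering versions such as 16.2.10 and 16.2.8.
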
@mergify
Contributor

mergify bot commented Aug 25, 2023

rebase

✅ Branch has been successfully rebased

@nixpanic nixpanic added the ok-to-test (Label to trigger E2E tests) label Aug 25, 2023
@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.27

@ceph-csi-bot
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.27

@ceph-csi-bot
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.27

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.26

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.28

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.26

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.28

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.26

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.28

@ceph-csi-bot ceph-csi-bot removed the ok-to-test (Label to trigger E2E tests) label Aug 25, 2023
@nixpanic
Member

/test ci/centos/mini-e2e/k8s-1.25

@nixpanic
Member

/test ci/centos/mini-e2e-helm/k8s-1.25

@nixpanic
Member

/test ci/centos/k8s-e2e-external-storage/1.25

@mergify mergify bot merged commit 461effe into ceph:devel Aug 25, 2023
24 checks passed