rbd: add a workaround to fix rbd snapshot scheduling #2647

Closed
wants to merge 3 commits

Conversation


@Madhu-1 Madhu-1 commented Nov 17, 2021

Currently, we have a bug in the rbd mirror scheduling module: after doing failover and failback, the scheduling is not getting updated and the mirroring snapshots are not getting created periodically as per the scheduling interval. This PR works around the issue by doing the below operations:

  • Create a dummy image (csi-vol-dummy-<ceph fsID>) per cluster so that it can be easily identified (a naming sketch follows this list).

  • During the Promote operation on any image, enable mirroring on the dummy image. When we enable mirroring on the dummy image, the pool gets updated and the scheduling is reconfigured.

  • During the Demote operation on any image, disable mirroring on the dummy image. The disable needs to be done so that mirroring can be enabled again when we get the Promote request to make the image primary.

  • When DR is no longer needed, this image needs to be cleaned up manually for now, as we don't want to add a check
    in the existing DeleteVolume code path for deleting dummy images, since that would impact the performance of the DeleteVolume workflow.
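
To make the per-cluster naming concrete, here is a minimal sketch of how such a dummy image name could be derived; the helper name and the exact format are assumptions based on the "csi-vol-dummy-<ceph fsID>" wording above, not code taken from this PR's diff.

    // getDummyImageName builds a name that is unique per Ceph cluster, so the
    // dummy image can be identified easily and never collides across clusters.
    // Hypothetical helper; the real change may construct the name differently.
    func getDummyImageName(cephFSID string) string {
        return "csi-vol-dummy-" + cephFSID
    }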

Moved adding the scheduling to the Promote operation, as scheduling needs to be added when the image is promoted; this is the correct place to add the scheduling so that it actually takes effect.

More details at https://bugzilla.redhat.com/show_bug.cgi?id=2019161
Signed-off-by: Madhu Rajanna madhupr007@gmail.com

@Madhu-1 Madhu-1 added the DNM DO NOT MERGE label Nov 17, 2021
@mergify mergify bot added component/rbd Issues related to RBD bug Something isn't working labels Nov 17, 2021

Madhu-1 commented Nov 17, 2021

Manual testing works for me. For now I have added DNM as it's being tested by @ShyamsundarR and @BenamarMk.

if err != nil {
    return nil, status.Errorf(codes.Internal, "failed to get mirroring mode %s", err.Error())
}
err = enableMirroringOnDummyImage(rbdVol, mode)


rbdVol is not the dummy volume. We will be enabling mirroring on a non-dummy volume if I am reading this correctly.


Also, enabling mirroring for the dummy volume should be done after we promote the real volume.

Collaborator Author


enableMirroringOnDummyImage takes the rbdVol as input, creates a new dummy struct, replaces the name, and calls the mirror operation on that dummy image (sketched below).
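
A minimal sketch of what that description implies; the field and method names below (RbdImageName, ClusterID, enableImageMirroring, getDummyImageName) are assumptions for illustration and may differ from the actual diff.

    // Sketch of enableMirroringOnDummyImage as described above: copy the
    // incoming volume, swap in the dummy image name, and enable mirroring on
    // that copy. Field and helper names are illustrative.
    func enableMirroringOnDummyImage(rbdVol *rbdVolume, mode librbd.ImageMirrorMode) error {
        // reuse the pool and cluster connection details of the real volume
        dummyVol := *rbdVol
        // replace only the image name with the per-cluster dummy image name
        dummyVol.RbdImageName = getDummyImageName(dummyVol.ClusterID) // hypothetical helper; clusterID ~ Ceph fsID
        // enabling mirroring on the dummy image updates the pool, which in
        // turn gets the snapshot scheduling reconfigured
        return dummyVol.enableImageMirroring(mode)
    }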

Collaborator Author


@BenamarMk is it the same for disabling mirroring on the dummy image as well? Should it be done after demoting the real image?

Collaborator Author


Fixed for enabling. Let me know your input for disabling as well.


Disable is fine. You don't need to do anything for it. So basically the flow is like this (sketched in code after the list):

  1. Promote
  2. enable dummy
    And then
  3. Disable dummy
  4. Demote
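
Roughly, in code form, the agreed ordering looks like this; the helper and method names (promoteImage, demoteImage, enableMirroringOnDummyImage, disableMirroringOnDummyImage) are illustrative, and surrounding plumbing is trimmed.

    // Promote path: promote the real image first, then enable mirroring on
    // the dummy image so the pool-level snapshot scheduling is refreshed.
    if err := rbdVol.promoteImage(req.GetForce()); err != nil {
        return nil, status.Errorf(codes.Internal, "failed to promote image: %s", err.Error())
    }
    if err := enableMirroringOnDummyImage(rbdVol, mode); err != nil {
        return nil, status.Errorf(codes.Internal, "failed to enable mirroring on dummy image: %s", err.Error())
    }

    // Demote path: disable mirroring on the dummy image first, then demote
    // the real image.
    if err := disableMirroringOnDummyImage(rbdVol); err != nil { // hypothetical counterpart helper
        return nil, status.Errorf(codes.Internal, "failed to disable mirroring on dummy image: %s", err.Error())
    }
    if err := rbdVol.demoteImage(); err != nil {
        return nil, status.Errorf(codes.Internal, "failed to demote image: %s", err.Error())
    }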

Collaborator Author


Okay, then the current changes are fine. Thanks for confirming, @BenamarMk.

Moved adding the scheduling to the promote
operation, as scheduling needs to be added
when the image is promoted; this is
the correct place to add the scheduling
so that it actually takes effect.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
added helper function to get the cluster ID.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
@@ -437,6 +529,30 @@ func (rs *ReplicationServer) PromoteVolume(ctx context.Context,
}
}

var mode librbd.ImageMirrorMode

@BenamarMk BenamarMk Nov 17, 2021


Shouldn't the whole block from 532 all the way to 540 be inside the block above? IOW, don't you need to put it at line 529?

Collaborator Author


The reason for keeping it outside the block is this: if the csidriver is restarted after the promote and before enabling mirroring on the dummy image, then on the next request the if check will be skipped as the image is already primary, but the dummy-image enable will still run (see the sketch below).
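
In other words, a sketch with illustrative names (mirroringInfo, promoteImage, enableMirroringOnDummyImage), not the exact diff: the idempotency across driver restarts comes from keeping the dummy-image enable outside the already-primary check.

    if !mirroringInfo.Primary {
        // only promote when the image is not yet primary
        if err := rbdVol.promoteImage(req.GetForce()); err != nil {
            return nil, status.Errorf(codes.Internal, "failed to promote image: %s", err.Error())
        }
    }
    // always (re)enable mirroring on the dummy image, even on a retried
    // request where the promote above was skipped, so the scheduling
    // refresh is not lost if the driver restarted in between
    if err := enableMirroringOnDummyImage(rbdVol, mode); err != nil {
        return nil, status.Errorf(codes.Internal, "failed to enable mirroring on dummy image: %s", err.Error())
    }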


@BenamarMk BenamarMk left a comment


It looks good to me @Madhu-1. Thanks for coding it so quickly. I will now test it.


Madhu-1 commented Nov 17, 2021

/retest ci/centos/upgrade-tests-cephfs


Madhu-1 commented Nov 17, 2021

/retest ci/centos/upgrade-tests-cephfs

Nov 17 15:36:49.574: FAIL: failed to create user cephcsi-cephfs-node with error etcdserver: request timed out

@Madhu-1 Madhu-1 added the Priority-0 highest priority issue label Nov 18, 2021
@Madhu-1 Madhu-1 requested review from a team November 18, 2021 05:30

humblec commented Nov 18, 2021

One general question here @Madhu-1: while the failback happens and the dummy image exists, do we foresee any issues?


Madhu-1 commented Nov 18, 2021

One general question here @Madhu-1: while the failback happens and the dummy image exists, do we foresee any issues?

Do you mean the dummy image created by the user? It's a workaround; I don't think the user will create any real images with the same name.


humblec commented Nov 18, 2021

One general question here @Madhu-1: while the failback happens and the dummy image exists, do we foresee any issues?

Do you mean the dummy image created by the user? It's a workaround; I don't think the user will create any real images with the same name.

Not that; a dummy image exists in the cluster, and during failover we create another one for the second cluster. Now during failback, the first cluster's dummy image already exists or is not cleaned up. With that in mind, I was asking whether the existence of the dummy image in the first cluster can be a problem while the failback is established and images start to get promoted/demoted.


Madhu-1 commented Nov 18, 2021

One general question here @Madhu-1: while the failback happens and the dummy image exists, do we foresee any issues?

Do you mean the dummy image created by the user? It's a workaround; I don't think the user will create any real images with the same name.

Not that; a dummy image exists in the cluster, and during failover we create another one for the second cluster.

The dummy image is unique per cluster; it's created with the Ceph fsid in its name.

Now during failback, the first cluster's dummy image already exists or is not cleaned up. With that in mind, I was asking whether the existence of the dummy image in the first cluster can be a problem while the failback is established and images start to get promoted/demoted.

I don't see any problem here. During failback, mirroring on the dummy image will be disabled; we are not going to clean up or do anything else. We operate on the dummy image created for that cluster, not on the dummy image of the other cluster.


humblec commented Nov 18, 2021

One general question here @Madhu-1: while the failback happens and the dummy image exists, do we foresee any issues?

Do you mean the dummy image created by the user? It's a workaround; I don't think the user will create any real images with the same name.

Not that; a dummy image exists in the cluster, and during failover we create another one for the second cluster.

The dummy image is unique per cluster; it's created with the Ceph fsid in its name.

Yeah, that's correct; what I meant above was the same. The "during failover" part was unwanted.. :)

Now during failback, the first cluster's dummy image already exists or is not cleaned up. With that in mind, I was asking whether the existence of the dummy image in the first cluster can be a problem while the failback is established and images start to get promoted/demoted.

I don't see any problem here. During failback, mirroring on the dummy image will be disabled; we are not going to clean up or do anything else. We operate on the dummy image created for that cluster, not on the dummy image of the other cluster.

Sure, if we don't expect any problem in this scenario, we are good.


humblec commented Nov 18, 2021

LGTM.. considering the importance of this PR, we have to take this in at the earliest and can do further enhancements later.. @Madhu-1 will we wait for the testing results from @BenamarMk @ShyamsundarR before merging?

CC @ceph/ceph-csi-contributors ptal


Madhu-1 commented Nov 18, 2021

LGTM.. considering the importance of this PR, we have to take this in at the earliest and can do further enhancements later.. @Madhu-1 will we wait for the testing results from @BenamarMk @ShyamsundarR before merging?

Yes, waiting for the testing results, but I want to close out the review since the changes look good to @BenamarMk and have been tested locally.

CC @ceph/ceph-csi-contributors ptal

currently we have a bug in the rbd mirror scheduling module.
After doing failover and failback the scheduling is not
getting updated and the mirroring snapshots are not
getting created periodically as per the scheduling
interval. This PR works around the issue by doing the
below operations

* Create a dummy (unique) image per cluster; this image
should be easily identifiable.

* During the Promote operation on any image, enable
mirroring on the dummy image. When we enable mirroring
on the dummy image the pool gets updated and the
scheduling is reconfigured.

* During the Demote operation on any image, disable
mirroring on the dummy image. The disable needs to be done
so that mirroring can be enabled again when we get the
promote request to make the image primary.

* When DR is no longer needed, this image needs to be
cleaned up manually for now, as we don't want to add a
check in the existing DeleteVolume code path for deleting
the dummy image, since it would impact the performance of
the DeleteVolume workflow.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
if err != nil {
    return nil, status.Errorf(codes.Internal, "failed to get mirroring mode %s", err.Error())
}
err = enableMirroringOnDummyImage(rbdVol, mode)
Contributor


For the second image that we promote, this enable will become a no-op? IOW, we need to disable/enable mirroring on the dummy image for each promote of a real image, so that the real image gets scheduled.

In its current form the second enable would become a no-op, is what I think. @BenamarMk?

Collaborator Author


Yes, correct, it's a no-op. Mirroring is enabled on the dummy image only once for all promote operations. The same goes for disable as well.


I think I tested that and it worked. I will test it again to see whether the refresh happens when the dummy image is enabled for mirroring while it is already ENABLED. But if I am wrong, then we can always do this:

Promote PV:

1. disable mirroring Dummy
2. promote PV
3. enable mirroring Dummy

Demote PV:

1. disable mirroring Dummy
2. demote PV


humblec commented Nov 18, 2021

LGTM.. considering the importance of this PR, we have to take this in at the earliest and can do further enhancements later.. @Madhu-1 will we wait for the testing results from @BenamarMk @ShyamsundarR before merging?

Yes, waiting for the testing results, but I want to close out the review since the changes look good to @BenamarMk and have been tested locally.

CC @ceph/ceph-csi-contributors ptal

@Madhu-1 it looks like we are still testing this. However, I have placed my approval to avoid any unnecessary delay in merging once we are done with the testing.

Contributor

@ShyamsundarR ShyamsundarR left a comment


There are a few corner cases... hopefully this prevents the auto-merge


humblec commented Nov 18, 2021

There are a few corner cases... hopefully this prevents the auto-merge

@ShyamsundarR there is already DNM, so that should be enough to block the merge.


Madhu-1 commented Nov 19, 2021

closing in favor of #2656

@Madhu-1 Madhu-1 closed this Nov 19, 2021
Labels
bug (Something isn't working), component/rbd (Issues related to RBD), DNM (DO NOT MERGE), Priority-0 (highest priority issue)