Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: get lastsyncbytes and lastsycduration for volrep #3894

Merged
merged 1 commit into from
Jun 23, 2023

Conversation

yati1998
Copy link
Contributor

@yati1998 yati1998 commented Jun 9, 2023

This commit get more information from the description like lastsyncbytes and lastsyncduration and send them as a response of getvolumereplicationinfo request.

@yati1998 yati1998 requested a review from nixpanic June 9, 2023 10:47
@mergify mergify bot added the component/rbd Issues related to RBD label Jun 9, 2023
go.mod Outdated Show resolved Hide resolved
@yati1998 yati1998 force-pushed the volrepinfo branch 2 times, most recently from 05392be to 02eea42 Compare June 12, 2023 12:17
@yati1998 yati1998 marked this pull request as ready for review June 12, 2023 12:17
@yati1998 yati1998 requested a review from nixpanic June 12, 2023 12:17
@yati1998
Copy link
Contributor Author

cc @ShyamsundarR

Copy link
Contributor

@Rakshith-R Rakshith-R left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this end to end in a live cluster ?

Can you please provide the new CR yaml with status and logs of each container?

Please test it also with ceph cluster version that does not have this change(if these variables were added recently)?

Comment on lines +767 to +803
// description = `replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":81920.0,
// "last_snapshot_bytes":81920,"last_snapshot_sync_seconds":0,
// "local_snapshot_timestamp":1684675261,
// "remote_snapshot_timestamp":1684675261,"replay_state":"idle"}`
// In case there is no last snapshot bytes returns 0 as the
// LastSyncBytes is optional.
// In case there is no last snapshot sync seconds, it returns nil as the
// LastSyncDuration is optional.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was lastSync[Byte|duration] added recently to ceph ?
if so, please add ceph tracker/ version from which this was added for reference in comment.

have you tested this call on a ceph cluster version that does not provide these information ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above query was not answered.

The localStatus struct which is used for unmarshalling will give default 0 value when the field is not present ?
Check this with unit test cases.

We will need to handle missing values.
The current code workflow does not do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the 0 value of lastSyncBytes states undefined files, which has been mentioned in csi-addons. and if the value is 0 the lastsyncbytes in volumereplication operator will not be updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar logic needs to be applied for lastSyncDuration below too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thats done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My very first question in this thread is still unanswered.

Still does not handle the case where duration does not exist in the description.

Current workflow will return 0s which is misleading IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what is mentioned in the comments, o/nil indicates the value of this field is undefined. We can't return any error as it's optional, and in case duration is not there it will be set to nil, 0 in case of lastsyncbytes. These values are sent by ceph-csi and in csi-addons we are keeping it nil only, which will be updates to the status.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal/csi-addons/rbd/replication.go Outdated Show resolved Hide resolved
internal/csi-addons/rbd/replication.go Outdated Show resolved Hide resolved
@Rakshith-R Rakshith-R linked an issue Jun 16, 2023 that may be closed by this pull request
@yati1998 yati1998 force-pushed the volrepinfo branch 3 times, most recently from d11899d to 5f41c9d Compare June 19, 2023 19:58
}
type localStatus struct {
LocalSnapshotTime int64 `json:"local_snapshot_timestamp"`
LocalSnapshotTime int64 `json:"local_snapshot_timestamp"`
LocalSnapshotBytes int64 `json:"last_snapshot_bytes"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the naming of these members are not matching the json contents. Note the difference in Local and last_. Please rename them to match the json, it is a little confusing otherwise.

@yati1998 yati1998 force-pushed the volrepinfo branch 2 times, most recently from ea09ae9 to 0df93de Compare June 20, 2023 10:48
@yati1998
Copy link
Contributor Author

Have you tested this end to end in a live cluster ?
testing is stuck still de to some image issue, but it shouldn't be any issue, as this follows the same flow as lastsynctime. But Still working on to get working examples and logs.
Meanwhile, feel free to review the PR if any other changes are required.

Comment on lines +767 to +803
// description = `replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":81920.0,
// "last_snapshot_bytes":81920,"last_snapshot_sync_seconds":0,
// "local_snapshot_timestamp":1684675261,
// "remote_snapshot_timestamp":1684675261,"replay_state":"idle"}`
// In case there is no last snapshot bytes returns 0 as the
// LastSyncBytes is optional.
// In case there is no last snapshot sync seconds, it returns nil as the
// LastSyncDuration is optional.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above query was not answered.

The localStatus struct which is used for unmarshalling will give default 0 value when the field is not present ?
Check this with unit test cases.

We will need to handle missing values.
The current code workflow does not do that.

@yati1998 yati1998 force-pushed the volrepinfo branch 2 times, most recently from 757ca93 to 80f64dc Compare June 21, 2023 08:39
// This function gets the local snapshot time, last sync snapshot seconds
// and last sync bytes from the description of localStatus and convert
// it into required types.
func getLastSyncInfo(description string) (*timestamppb.Timestamp, *durationpb.Duration, int64, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create a local struct and return it in response, this avoids returning many values

@yati1998 yati1998 force-pushed the volrepinfo branch 4 times, most recently from f41a361 to ed6af0f Compare June 21, 2023 19:23
@yati1998 yati1998 force-pushed the volrepinfo branch 2 times, most recently from 5755a46 to 55bb11f Compare June 23, 2023 04:14
@Rakshith-R Rakshith-R added ci/skip/e2e skip running e2e CI jobs ci/skip/multi-arch-build skip building on multiple architectures labels Jun 23, 2023
@yati1998 yati1998 force-pushed the volrepinfo branch 3 times, most recently from 9ef3f00 to 3c6dd32 Compare June 23, 2023 07:00
This commit get more information from the description
like lastsyncbytes and lastsyncduration and send them
as a response of getvolumereplicationinfo request.

Signed-off-by: Yati Padia <ypadia@redhat.com>
@Rakshith-R Rakshith-R requested review from nixpanic and a team June 23, 2023 07:30
@Rakshith-R
Copy link
Contributor

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jun 23, 2023

rebase

✅ Nothing to do for rebase action

@nixpanic nixpanic added the ok-to-test Label to trigger E2E tests label Jun 23, 2023
@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.25

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.26

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.25

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.26

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.25

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.26

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Jun 23, 2023
@mergify mergify bot merged commit 2e2e904 into ceph:devel Jun 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/skip/e2e skip running e2e CI jobs ci/skip/multi-arch-build skip building on multiple architectures component/rbd Issues related to RBD
Projects
None yet
Development

Successfully merging this pull request may close these issues.

update getVolumeReplicationInfo to reflect more metrics/info
5 participants