[docdb] PITR: Implement delete_schedule and edit_schedule APIs #8417
Summary: This diff adds the ability to delete snapshot schedules.
The tricky part is detecting deleted schedules on the TServer side. Suppose that at some point TSTabletManager does not know schedule1. Since it only knows the schedules it has received via heartbeat, there are two possibilities: 1) It is a new schedule that has not yet been received via heartbeat. 2) It is a deleted schedule.
To distinguish between these two cases, the following logic is used. Each time a heartbeat response is received and the schedules list is updated, we also increment the `snapshot_schedules_version_` field. All missing schedules are added to a special map, along with the current value of `snapshot_schedules_version_`. When we find the schedule missing again, we compare the current `snapshot_schedules_version_` with the version recorded when the schedule was first found missing. If the master still does not know this schedule, it is an old schedule that was deleted.
However, the following could happen: a heartbeat is processed by the master, but the response is not yet processed by the tserver. A new schedule is created and sent to the tablet. The tablet then receives the response to this heartbeat, which does not contain the new schedule. To avoid interpreting such a schedule as deleted, we wait for `snapshot_schedules_version_` to be incremented twice before marking the schedule as deleted at the tserver.
Test Plan: ybd --gtest_filter YbAdminSnapshotScheduleTest.Delete
Reviewers: skedia, bogdan
Reviewed By: bogdan
Subscribers: ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D11868
Summary: This diff adds the ability to delete snapshot schedules. The rest of the summary is identical to that of the original commit, D11868.
Original commit: D11868 / 7453192
Test Plan: ybd --gtest_filter YbAdminSnapshotScheduleTest.Delete
Jenkins: rebase: 2.6
Reviewers: skedia, bogdan
Reviewed By: bogdan
Subscribers: ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D12000
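The deletion-detection logic described in the D11868 summary can be sketched roughly as follows. This is a minimal illustration, not the actual TSTabletManager code: apart from `snapshot_schedules_version_`, every name here (`ScheduleTracker`, `UpdateFromHeartbeat`, `IsDeleted`, the string schedule IDs) is hypothetical, and clearing a schedule from the missing map once it shows up in a heartbeat is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the tserver-side bookkeeping.
class ScheduleTracker {
 public:
  // Called whenever a heartbeat response is processed: replace the list of
  // known schedules and bump the version counter.
  void UpdateFromHeartbeat(const std::vector<std::string>& schedules) {
    known_.clear();
    known_.insert(schedules.begin(), schedules.end());
    ++snapshot_schedules_version_;
    // Assumption: a schedule reported by the master is no longer missing.
    for (const auto& id : schedules) {
      missing_.erase(id);
    }
  }

  // Called when a tablet refers to schedule `id`. Returns true only once the
  // schedule has stayed unknown across two version increments.
  bool IsDeleted(const std::string& id) {
    if (known_.count(id) != 0) {
      return false;
    }
    // Record the version at which the schedule was first found missing.
    auto result = missing_.emplace(id, snapshot_schedules_version_);
    if (result.second) {
      return false;  // First time we see it missing; it may simply be new.
    }
    // One increment could come from a heartbeat response that the master
    // produced before the schedule was created, so require two.
    return snapshot_schedules_version_ - result.first->second >= 2;
  }

 private:
  std::unordered_set<std::string> known_;
  std::unordered_map<std::string, uint64_t> missing_;
  uint64_t snapshot_schedules_version_ = 0;
};
```

With this sketch, a schedule that is created just after the master has processed a heartbeat survives the stale response (one increment) and becomes known with the next one, while a schedule that stays unknown across two increments is reported as deleted.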
This is a good intro task to get familiar with the basics, such as how to add a client-side interface, define protobufs for communication between client and server, make RPC calls, understand the translation of table rows to RocksDB key-value pairs, perform a simple sys catalog write, etc.
Hey, @bmatican, if no one is working on this task currently I'd love to pick it up!
@jordans6 Glad to hear you're excited to be contributing to YB. That sounds great; hopefully @sanketkedia's message above helps as a starting point! If you need any more help, feel free to join our public Slack and ask questions on the #contributors channel!
Awesome, I'm looking forward to contributing! Would you be able to send me a link to the Slack channel? The one in the README gives me the following error message:
Hi @jordans6, here's a link for the Slack: https://communityinviter.com/apps/yugabyte-db/register
Thanks, I've joined the Slack now.
Is an interval of 0 or retention_duration of 0 valid?
That's a good point, @druzac. I would say we can just disallow 0 as the interval or retention, rather than getting into the confusion of having to explain to users what each would mean.
cc @vkulichenko
Summary: Added the edit_snapshot_schedule command to yb-admin, along with an RPC in master to edit a snapshot schedule. Moved validation of schedule invariants on the CreateSnapshotSchedule path from the command line tool to the server request handling code. Also added checks for 0 interval and 0 retention snapshot schedules to the CreateSnapshotSchedule path. Refactored MasterSnapshotCoordinator and SnapshotScheduleState to de-duplicate serialization to docdb.
Test Plan:
YbAdminSnapshotScheduleTest.EditInterval
YbAdminSnapshotScheduleTest.EditRetention
YbAdminSnapshotScheduleTest.EditSnapshotScheduleCheckOptions
YbAdminSnapshotScheduleTest.EditIntervalZero
YbAdminSnapshotScheduleTest.EditRetentionZero
YbAdminSnapshotScheduleTest.EditRepeatedInterval
YbAdminSnapshotScheduleTest.EditRepeatedRetention
YbAdminSnapshotScheduleTest.EditIntervalLargerThanRetention
YbAdminSnapshotScheduleTest.CreateIntervalZero
YbAdminSnapshotScheduleTest.CreateRetentionZero
YbAdminSnapshotScheduleTest.CreateIntervalLargerThanRetention
YbAdminSnapshotScheduleTest.EditIntervalAndRetention
Also played around with a test cluster on the command line.
Reviewers: skedia
Reviewed By: skedia
Subscribers: bogdan, ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D17087
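The schedule invariants mentioned in the D17087 summary could be checked on the server side along these lines. This is only a sketch under stated assumptions: `ScheduleError` and `ValidateSchedule` are hypothetical names, not the actual master code. The zero checks come from the summary, while rejecting an interval larger than the retention is inferred from the test names (`CreateIntervalLargerThanRetention`, `EditIntervalLargerThanRetention`). Allowing an interval equal to the retention is a guess.

```cpp
#include <chrono>

// Hypothetical result type; the real code would presumably return a Status.
enum class ScheduleError {
  kOk,
  kZeroInterval,
  kZeroRetention,
  kIntervalExceedsRetention,
};

// Validates the two editable schedule options.
ScheduleError ValidateSchedule(std::chrono::seconds interval,
                               std::chrono::seconds retention) {
  const auto zero = std::chrono::seconds::zero();
  if (interval <= zero) {
    return ScheduleError::kZeroInterval;
  }
  if (retention <= zero) {
    return ScheduleError::kZeroRetention;
  }
  // Assumption: snapshots taken less often than they are retained for would
  // leave gaps, so an interval above the retention is rejected.
  if (interval > retention) {
    return ScheduleError::kIntervalExceedsRetention;
  }
  return ScheduleError::kOk;
}
```

Under the same assumptions, an edit would apply the new interval and/or retention to a copy of the existing options and run the same check on the result before the sys catalog write, so that the create and edit paths share one set of rules.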
Jira Link: DB-2166
For edit, it would be tricky to expose modifying the filter, as we might expand to capture tables not previously captured, for which we wouldn't have old snapshots. Given that, let's just have an ability to change `interval` and `retention`.
For delete, we need to confirm that all relevant GC work is done appropriately.