Don't set options in vdiff resume #130

Closed
shanth96 wants to merge 2 commits

@shanth96 shanth96 commented Nov 15, 2023

Fixes https://github.com/Shopify/vitess-project/issues/498.

Picks the relevant fix from vitessio#13976

Currently, vdiff resume does not reuse the options that were set when the vdiff was created. This can lead to a range of problems, as described in the linked issue. This PR fixes that by leaving the options field untouched when a vdiff is resumed.
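
To illustrate the idea of "not updating the options field when resuming", here is a minimal sketch using an in-memory map in place of the real vdiff table. The `vdiffRecord` and `store` types, the state names, and the JSON field are all hypothetical and are not taken from the Vitess code; the sketch only shows that a resume which touches state alone leaves the originally stored options in effect.

```go
package main

import "fmt"

// vdiffRecord is a hypothetical stand-in for one row of the vdiff table.
type vdiffRecord struct {
	State   string
	Options string // JSON-encoded options saved when the vdiff was created
}

// store is a hypothetical in-memory replacement for the table, keyed by UUID.
type store map[string]*vdiffRecord

// create saves the options alongside the new record.
func (s store) create(uuid, options string) {
	s[uuid] = &vdiffRecord{State: "pending", Options: options}
}

// resume resets the state and deliberately leaves Options alone,
// so the resumed run sees exactly what create stored.
func (s store) resume(uuid string) error {
	rec, ok := s[uuid]
	if !ok {
		return fmt.Errorf("vdiff %s not found", uuid)
	}
	rec.State = "pending"
	return nil
}

func main() {
	s := store{}
	s.create("uuid-1", `{"max_rows":100}`)
	s["uuid-1"].State = "stopped" // simulate a paused vdiff

	if err := s.resume("uuid-1"); err != nil {
		panic(err)
	}
	fmt.Println(s["uuid-1"].State, s["uuid-1"].Options)
}
```

The trade-off raised later in the thread follows directly from this shape: because `resume` takes no options argument, running with different options means creating a new vdiff.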

@austenLacy

I like that the fix resumes with the original options, but I guess it also technically removes the ability to pause and resume with different options if desired.

@shanth96

Correct, although I'm not sure we'd ever want to resume with a different set of options. We would create a new vdiff workflow instead.

Signed-off-by: Shanth Sathiyaseelan <shanth.sathiyaseelan@shopify.com>
@brendar

brendar commented Nov 17, 2023

There are some other changes in the upstream PR related to options handling on resume in go/vt/vttablet/tabletmanager/vdiff/action.go

These changes in particular:

+       vdiffRecord := qr.Named().Row()
+       if vdiffRecord == nil {
+               return fmt.Errorf("unable to %s vdiff for UUID %s as it was not found on tablet %v (%w)",
+                       action, req.VdiffUuid, vde.thisTablet.Alias, err)
+       }
+       if action == ResumeAction {
+               // Use the existing options from the vdiff record.
+               options = optionsZeroVal
+               err = protojson.Unmarshal(vdiffRecord.AsBytes("options", []byte("{}")), options)
+               if err != nil {
+                       return err
+               }
+       }
+
        vde.mu.Lock()
        defer vde.mu.Unlock()
-       if err := vde.addController(qr.Named().Row(), options); err != nil {
+       if err := vde.addController(vdiffRecord, options); err != nil {
                return err
        }

Are any of those necessary?
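
For readers who want to see the resume branch of that diff in isolation, the following is a rough, self-contained sketch of the same idea: on resume, ignore the options passed in with the request and decode the ones stored with the vdiff record. It is not the upstream code. `diffOptions`, `effectiveOptions`, and the JSON field names are hypothetical, and the standard library's `encoding/json` stands in for `protojson` so the example has no external dependencies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// diffOptions is a hypothetical, trimmed-down options struct.
type diffOptions struct {
	MaxRows int64 `json:"max_rows"`
	OnlyPKs bool  `json:"only_pks"`
}

// effectiveOptions returns the options a controller should run with.
// For any action other than "resume" the requested options are used as-is.
// For "resume" the requested options are discarded and the stored JSON is
// decoded instead, falling back to "{}" when nothing was stored.
func effectiveOptions(action string, requested diffOptions, storedJSON []byte) (diffOptions, error) {
	if action != "resume" {
		return requested, nil
	}
	if len(storedJSON) == 0 {
		storedJSON = []byte("{}")
	}
	var stored diffOptions // start from the zero value, as the diff does
	if err := json.Unmarshal(storedJSON, &stored); err != nil {
		return diffOptions{}, fmt.Errorf("decoding stored options: %w", err)
	}
	return stored, nil
}

func main() {
	stored := []byte(`{"max_rows":100,"only_pks":true}`)

	// The caller asks for MaxRows=5, but a resume uses the stored values.
	opts, err := effectiveOptions("resume", diffOptions{MaxRows: 5}, stored)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", opts)
}
```

Starting from a zero-valued struct before decoding matters here: any field missing from the stored JSON falls back to its default instead of inheriting a value from the resume request.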

@shanth96

Good catch. I was debugging why the tests are failing, and I think that might be the cause, although it's hard to tell the difference between flakiness and actual failures.

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale label Dec 28, 2023

github-actions bot commented Jan 4, 2024

This PR was closed because it has been stale for 7 days with no activity.

@github-actions github-actions bot closed this Jan 4, 2024