Feature Request: VReplication Sequence Initialization #13685

Closed
mattlord opened this issue Aug 2, 2023 · 0 comments · Fixed by #13656
Feature Description

Introduction

This feature request proposes additional automation or protections around Vitess sequences, specifically during a MoveTables workflow.

Context

When you import/move tables from an unsharded keyspace to a sharded one as part of a MoveTables workflow, you need to make sure that every sequence in use is initialized with a next_id higher than the highest value of the source table's auto_increment column.

This can end up being a bit of a foot gun when you have many sharded tables with auto_increment columns: you have to update the sequence for each table, and leave a large enough gap in next_id that you can run SwitchTraffic before the still-incrementing auto_increment values on the source tables exceed the sequence's next_id. (This is also a moving target unless writes are stopped on the source.)
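As a rough illustration of the sizing problem described above, the sketch below estimates a starting next_id from the current maximum id, an assumed insert rate, and an assumed time until cutover. The function name, its parameters, and the safety factor are all hypothetical and are not part of Vitess; this is only the arithmetic an operator has to do by hand today.

```python
import math


def padded_next_id(
    current_max_id: int,
    inserts_per_second: float,
    seconds_until_cutover: float,
    safety_factor: float = 2.0,  # assumption: arbitrary padding multiplier
) -> int:
    """Estimate a next_id that should stay ahead of the source table.

    The source keeps taking writes until SwitchTraffic runs, so the
    sequence needs headroom for the rows expected to arrive before then.
    """
    expected_growth = inserts_per_second * seconds_until_cutover
    headroom = math.ceil(expected_growth * safety_factor)
    # +1 because next_id must be strictly higher than any id already used.
    return current_max_id + headroom + 1
```

The estimate is only as good as the assumed write rate, which is why the request below asks for the check or the initialization to happen at SwitchTraffic time instead.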

Goal of the Feature

Make MoveTables aware of any Vitess sequences being used by tables that are being moved from an unsharded keyspace to a sharded one. Then add a flag, either to MoveTables or more specifically to the SwitchTraffic sub-command, which will do one of the following:

Validate that each sequence referenced in the vschema has a next_id higher than the current highest id used in the auto_inc column on the source table. If this is not already the case then print an error and do not SwitchTraffic.

OR

Enable SwitchTraffic to manage the initialization of sequence tables with an appropriate next_id. In this scenario the operator creates the backing sequence tables in an unsharded keyspace but leaves them empty/uninitialized, and lets SwitchTraffic initialize the sequences by inserting appropriate (id, next_id, cache) values.
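The two options above can be sketched as follows. This is a minimal illustration of the proposed logic, not how Vitess implements it: the `SequenceState` type, both function names, and the default cache size are hypothetical, and the inputs (the source table's maximum id and the sequence's current next_id) are assumed to have been fetched already.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SequenceState:
    table: str              # table that uses the sequence
    max_source_id: int      # highest auto_increment value on the source table
    next_id: Optional[int]  # None means the sequence table is empty


def validate_sequences(states: List[SequenceState]) -> List[str]:
    """Option 1: return one error per sequence that is not safely ahead.

    An empty list means the traffic switch could proceed.
    """
    errors = []
    for s in states:
        if s.next_id is None:
            errors.append(f"{s.table}: sequence is not initialized")
        elif s.next_id <= s.max_source_id:
            errors.append(
                f"{s.table}: next_id {s.next_id} is not higher than "
                f"max source id {s.max_source_id}"
            )
    return errors


def plan_initialization(
    states: List[SequenceState],
    cache: int = 1000,  # assumption: illustrative cache size
) -> Dict[str, Tuple[int, int, int]]:
    """Option 2: build the (id, next_id, cache) row for each sequence
    that is empty or behind the source table."""
    plan = {}
    for s in states:
        if s.next_id is None or s.next_id <= s.max_source_id:
            # Sequence tables hold a single row, conventionally with id 0.
            plan[s.table] = (0, s.max_source_id + 1, cache)
    return plan
```

In either variant the comparison has to be made after writes to the source have stopped, since the maximum id is otherwise a moving target.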

Use Case(s)

It's common for Vitess users to import external datasets that are not currently sharded and shard them on import, or to move some tables out of unsharded keyspaces into sharded keyspaces as they grow in size. As it stands now, sequences can be a major pitfall in this process. Having some built-in checks/protections around this would remove a rough edge.
