-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL] Rollback DocDB operations on an unsuccessful ADD COLUMN/ALTER TABLE rewrite operation #22802
Closed
1 task done
Labels
2.20 Backport Required
2024.1 Backport Required
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
fizaaluthra
added
area/ysql
Yugabyte SQL (YSQL)
status/awaiting-triage
Issue awaiting triage
labels
Jun 10, 2024
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
labels
Jun 10, 2024
fizaaluthra
added a commit
that referenced
this issue
Jun 27, 2024
Summary: As part of legacy table rewrite operations that clone the table, the old table is renamed to <table_name>_temp_old. Without DDL atomicity, if the operation fails at any point after the rename, the PG metadata is rolled back (relname is rolled back to <table_name>), but the DocDB table name is still <table_name>_temp_old. Note: the same rename is also done for the table's indexes. These renamed DocDB tables cause issues with backup/restore as the restore operation relies on table names. We can avoid these issues by avoiding the DocDB level renames altogether, as the old tables are going to be dropped anyway. Now, after a failed ADD/DROP pkey, ALTER TYPE operation (using the legacy rewrite), the DocDB table will no longer be left with the name <table_name>_temp_old. However, there will now be two DocDB tables with the same table name after a failed legacy rewrite operation. The async transaction cleanup task on master will lookup the entry for this orphaned table in pg_class (using the pg oid in the DocDB table id), and will drop the table upon finding that it doesn't exist. If, for some reason, the orphaned table is not cleaned up, or if a back up is taken before the cleanup happens, the backup code will simply ignore this orphaned table. Specifically, RepackSnapshotsForBackup will not find the pg schema name metadata for this table, and will simply ignore it. Therefore, there will be no ambiguity on restore side (which relies on the table names). Jira: DB-11703 Test Plan: ./yb_build.sh --cxx-test yb-backup-cross-feature-test --gtest-filter YBBackupTest.TestBackupWithFailedLegacyRewrite Reviewers: myang Reviewed By: myang Subscribers: yql, ybase Differential Revision: https://phorge.dev.yugabyte.com/D35943
jasonyb
pushed a commit
that referenced
this issue
Jun 28, 2024
Summary: 9c637e2 [PLAT-14429]: Modify Troubleshooting Platform registration workflow in YBA 0a1406d [PLAT-14098]: Updating yb.runtime_conf_ui.tag_filter with appropriate tags value does not display the flags accordingly 70a87f9 [PLAT-13605][PLAT-13609]: Edit Volume controls and storage type in FULL_MOVE but not in case of UPDATE 26fbfe0 [PLAT-14515][UI] Clicking preview doesn't show the correct info and clears up the data provided while setting up the ysql_ident or ysql_hba multiline flags.- [PLAT-14514] [PLAT-14513] a07946b [#18233] Initial commit for yugabyted unit test framework. b2e8ee7 [#22842] docdb: Improve usability of stack trace tracking endpoints 508f26e [docs] Added RN 2.20.2.3-b2 (#23042) 214d44a [#22935] YSQL: Use db oid in the tserver's sequence cache entry key c47b2d9 [#22802] YSQL: Avoid renaming DocDb tables during legacy rewrite Excluded: 7c8343d [#22874] YSQL: Fix cascaded drops on columns 58c8d4e [#23046] xCluster: Remove ns_replication from the code a70681d [#22816] YSQL: Bug fixes for replication connections in ysql connection manager b239e07 Doc upgrade versions (#22988) Excluded: ec76062 [#16408] YSQL: Split TestPgRegressIndex.testPgRegressIndex Test Plan: Jenkins: rebase: pg15-cherrypicks Reviewers: jason, tfoucher Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D36247
fizaaluthra
added a commit
that referenced
this issue
Jul 1, 2024
…gacy rewrite Summary: As part of legacy table rewrite operations that clone the table, the old table is renamed to <table_name>_temp_old. Without DDL atomicity, if the operation fails at any point after the rename, the PG metadata is rolled back (relname is rolled back to <table_name>), but the DocDB table name is still <table_name>_temp_old. Note: the same rename is also done for the table's indexes. These renamed DocDB tables cause issues with backup/restore as the restore operation relies on table names. We can avoid these issues by avoiding the DocDB level renames altogether, as the old tables are going to be dropped anyway. Now, after a failed ADD/DROP pkey, ALTER TYPE operation (using the legacy rewrite), the DocDB table will no longer be left with the name <table_name>_temp_old. However, there will now be two DocDB tables with the same table name after a failed legacy rewrite operation. The async transaction cleanup task on master will lookup the entry for this orphaned table in pg_class (using the pg oid in the DocDB table id), and will drop the table upon finding that it doesn't exist. If, for some reason, the orphaned table is not cleaned up, or if a back up is taken before the cleanup happens, the backup code will simply ignore this orphaned table. Specifically, RepackSnapshotsForBackup will not find the pg schema name metadata for this table, and will simply ignore it. Therefore, there will be no ambiguity on restore side (which relies on the table names). Jira: DB-11703 Original commit: c47b2d9 / D35943 Test Plan: On an alma8 release build: ./yb_build.sh release --cxx-test tools_yb-backup-cross-feature-test --gtest_filter YBBackupTest.TestBackupWithFailedLegacyRewrite -n 100 Reviewers: myang Reviewed By: myang Subscribers: ybase, yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D36219
fizaaluthra
added a commit
that referenced
this issue
Jul 8, 2024
…cy rewrite Summary: Note: this backport differs from the original commit -- the test TestBackupWithFailedLegacyRewrite is renamed to TestBackupWithFailedRewrite. Additionally, commit d5d7363 is only partially backported -- specifically, only changes related to some of the DDL atomicity tests that rely on the `_temp_old` naming are backported As part of legacy table rewrite operations that clone the table, the old table is renamed to <table_name>_temp_old. Without DDL atomicity, if the operation fails at any point after the rename, the PG metadata is rolled back (relname is rolled back to <table_name>), but the DocDB table name is still <table_name>_temp_old. Note: the same rename is also done for the table's indexes. These renamed DocDB tables cause issues with backup/restore as the restore operation relies on table names. We can avoid these issues by avoiding the DocDB level renames altogether, as the old tables are going to be dropped anyway. Now, after a failed ADD/DROP pkey, ALTER TYPE operation (using the legacy rewrite), the DocDB table will no longer be left with the name <table_name>_temp_old. However, there will now be two DocDB tables with the same table name after a failed legacy rewrite operation. The async transaction cleanup task on master will lookup the entry for this orphaned table in pg_class (using the pg oid in the DocDB table id), and will drop the table upon finding that it doesn't exist. If, for some reason, the orphaned table is not cleaned up, or if a back up is taken before the cleanup happens, the backup code will simply ignore this orphaned table. Specifically, RepackSnapshotsForBackup will not find the pg schema name metadata for this table, and will simply ignore it. Therefore, there will be no ambiguity on restore side (which relies on the table names). Jira: DB-11703 Original commits: c47b2d9 / D35943 d5d7363 / D29786 Test Plan: ./yb_build.sh --cxx-test yb-backup-cross-feature-test --gtest-filter YBBackupTest.TestBackupWithFailedLegacyRewrite Reviewers: myang Reviewed By: myang Subscribers: ybase, yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D36220
fizaaluthra
added a commit
that referenced
this issue
Jul 8, 2024
…cy rewrite Summary: Note: this backport differs from the original commit -- the test TestBackupWithFailedLegacyRewrite is renamed to TestBackupWithFailedRewrite. As part of legacy table rewrite operations that clone the table, the old table is renamed to <table_name>_temp_old. Without DDL atomicity, if the operation fails at any point after the rename, the PG metadata is rolled back (relname is rolled back to <table_name>), but the DocDB table name is still <table_name>_temp_old. Note: the same rename is also done for the table's indexes. These renamed DocDB tables cause issues with backup/restore as the restore operation relies on table names. We can avoid these issues by avoiding the DocDB level renames altogether, as the old tables are going to be dropped anyway. Now, after a failed ADD/DROP pkey, ALTER TYPE operation (using the legacy rewrite), the DocDB table will no longer be left with the name <table_name>_temp_old. However, there will now be two DocDB tables with the same table name after a failed legacy rewrite operation. The async transaction cleanup task on master will lookup the entry for this orphaned table in pg_class (using the pg oid in the DocDB table id), and will drop the table upon finding that it doesn't exist. If, for some reason, the orphaned table is not cleaned up, or if a back up is taken before the cleanup happens, the backup code will simply ignore this orphaned table. Specifically, RepackSnapshotsForBackup will not find the pg schema name metadata for this table, and will simply ignore it. Therefore, there will be no ambiguity on restore side (which relies on the table names). Jira: DB-11703 Original commit: c47b2d9 / D35943 Test Plan: ./yb_build.sh --cxx-test yb-backup-cross-feature-test --gtest-filter YBBackupTest.TestBackupWithFailedLegacyRewrite Reviewers: myang Reviewed By: myang Subscribers: ybase, yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D36222
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
2.20 Backport Required
2024.1 Backport Required
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Jira Link: DB-11703
Description
On older branches (2.20 and beyond), when ADD COLUMN / ALTER TABLE rewrite operations fail, make an effort to rollback the DocDB changes.
Issue Type
kind/enhancement
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: