-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
importccl,backupccl: use crdb_internal.invalid_objects
to check for invalid descriptors
#84757
Labels
A-disaster-recovery
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
stability-period
Intended to be worked on during a stability period. Use with the Milestone field to specify version.
T-disaster-recovery
Comments
adityamaru
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-disaster-recovery
labels
Jul 20, 2022
cc @cockroachdb/bulk-io |
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Jul 28, 2022
…anup This patch adds a test helper function to easily check for invalid descriptors during the cleanup of a test cluster created by any of the backupRestoreTestSetup* helper funcs. This setup is used by many backup unit tests (including all data driven tests), but a future PR should include this clean up helper func in any backup unit test that doesn't include use these helper funcs. Informs cockroachdb#84757 Release note: none
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Jul 29, 2022
…anup This patch adds a test helper function to easily check for invalid descriptors during the cleanup of a test cluster created by any of the backupRestoreTestSetup* helper funcs. This setup is used by many backup unit tests (including all data driven tests), but a future PR should include this clean up helper func in any backup unit test that doesn't include use these helper funcs. Informs cockroachdb#84757 Release note: none
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Jul 29, 2022
…anup This patch adds a test helper function to easily check for invalid descriptors during the cleanup of a test cluster created by any of the backupRestoreTestSetup* helper funcs. This setup is used by many backup unit tests (including all data driven tests), but a future PR should include this clean up helper func in any backup unit test that doesn't include use these helper funcs. Informs cockroachdb#84757 Release note: none
msbutler
added
the
stability-period
Intended to be worked on during a stability period. Use with the Milestone field to specify version.
label
Aug 25, 2022
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Sep 26, 2022
…anup This patch adds a test helper function to easily check for invalid descriptors during the cleanup of a test cluster created by any of the backupRestoreTestSetup* helper funcs. This setup is used by many backup unit tests (including all data driven tests), but a future PR should include this clean up helper func in any backup unit test that doesn't include use these helper funcs. It is also worth noting this check would have caught cockroachdb#76764. Informs cockroachdb#84757 Release note: none
craig bot
pushed a commit
that referenced
this issue
Sep 26, 2022
85282: backupccl: check crdb_internal.invalid_objects during backup test cleanup r=adityamaru a=msbutler This patch adds a test helper function to easily check for invalid descriptors during the cleanup of a test cluster created by any of the backupRestoreTestSetup* helper funcs. This setup is used by many backup unit tests (including all data driven tests), but a future PR should include this clean up helper func in any backup unit test that doesn't include use these helper funcs. It's also worth noting this check would have caught #76764. Informs #84757 Release note: none 87820: sql/schemachanger/*: optimize performance, primarily by adding support for containment r=ajwerner a=ajwerner #### screl,scplan/rule: adopt containment #### sql/schemachanger/rel: support inverted indexes and slice containment This commit introduces some new concepts to rel: * Slice attributes * Inverted indexing of slice attributes * Containment queries over slice attributes One note is that the only support for containment queries is via an inverted index. In the future, we could add direct containment evaluation such that if we have a slice we could go iterate its members. The basic idea is to introduce a slice membership type for the slice column and to use the attribute in question to refer to the slice member value. #### sql/schemachanger/rel: fix embedding support #### sql/schemachanger/rel: optimize error checking This change ends up making a huge difference. Otherwise, each subquery invocation would lead to allocations which showed up in the profiles. Fixes #86042 Release note: None 88712: sql: enable `IdempotentTombstone` for schema GC r=ajwerner a=erikgrinaker `@ajwerner` Looks like we forgot to enable this. Let's get it in before the backport freeze. --- This will avoid writing MVCC range tombstones across ranges if they don't contain any live data. This is particularly useful to avoid writing additional tombstones on retries. Release note: None Co-authored-by: Michael Butler <butler@cockroachlabs.com> Co-authored-by: Andrew Werner <awerner32@gmail.com> Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
blathers-crl bot
pushed a commit
that referenced
this issue
Sep 27, 2022
…anup This patch adds a test helper function to easily check for invalid descriptors during the cleanup of a test cluster created by any of the backupRestoreTestSetup* helper funcs. This setup is used by many backup unit tests (including all data driven tests), but a future PR should include this clean up helper func in any backup unit test that doesn't include use these helper funcs. It is also worth noting this check would have caught #76764. Informs #84757 Release note: none
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Nov 4, 2022
This patch checks that the restored descriptors are valid in two places during the job. First, as soon as the descriptors are written in an offline state before the restoration flow begins. This primarily ensures that the descriptors in the backup were in a valid state, allowing the restore to fail fast. Second, right after the descriptors return online and before the job begins, which ensures the restore created and modified the descriptors correctly. Fixes cockroachdb#84757 Release note: None
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Nov 14, 2022
This patch checks that the restored descriptors are valid in three places during the job. First, during planning to ensure no existing descriptor in the cluster is invalid. Second, as soon as the descriptors are written in an offline state before the restoration flow begins. This primarily ensures that the descriptors in the backup were in a valid state, allowing the restore to fail fast. Third, right after the descriptors return online and before the job begins, which ensures the restore created and modified the descriptors correctly. Note that this validation step does not check job references in the descriptors for 2 reasons: 1) it would require loading the whole jobs table into memory, 2) during the second check, the offline written descriptors will have invalid job references during a cluster restore, as the jobs table isn't properly populated yet. Fixes cockroachdb#84757 Release note: None
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Nov 14, 2022
This patch checks that the restored descriptors are valid in three places during the job. First, during planning to ensure no existing descriptor in the cluster is invalid. Second, as soon as the descriptors are written in an offline state before the restoration flow begins. This primarily ensures that the descriptors in the backup were in a valid state, allowing the restore to fail fast. Third, right after the descriptors return online and before the job begins, which ensures the restore created and modified the descriptors correctly. Note that this validation step does not check job references in the descriptors for 2 reasons: 1) it would require loading the whole jobs table into memory, 2) during the second check, the offline written descriptors will have invalid job references during a cluster restore, as the jobs table isn't properly populated yet. Fixes cockroachdb#84757 Release note: None
msbutler
added a commit
to msbutler/cockroach
that referenced
this issue
Nov 15, 2022
This patch checks that the restored descriptors are valid in three places during the job. First, during planning to ensure no existing descriptor in the cluster is invalid. Second, as soon as the descriptors are written in an offline state before the restoration flow begins. This primarily ensures that the descriptors in the backup were in a valid state, allowing the restore to fail fast. Third, right after the descriptors return online and before the job begins, which ensures the restore created and modified the descriptors correctly. Note that this validation step does not check job references in the descriptors for 2 reasons: 1) it would require loading the whole jobs table into memory, 2) during the second check, the offline written descriptors will have invalid job references during a cluster restore, as the jobs table isn't properly populated yet. If the customer needs to proceed with the restore, they can prevent the restore from failing by passing the skip_validation_check flag. Fixes cockroachdb#84757 Release note: this patch introduces checks during restore which cause the restore to fail if there exist invalid descriptors in the restoring cluster. The user can prevent the restore from failing by passing the skip_validation_check flag.
As discussed in detail here, there's no need for restore to conduct manual descriptor validation, as the sql descriptor collection read/write machinery does it for us. Will close this issue out, once I add a test that verifies this fact. |
craig bot
pushed a commit
that referenced
this issue
Nov 30, 2022
89358: memo: better selectivity estimates on single-table ORed predicates r=rytaft,michae2 a=msirek Fixes #75090 Currently, the optimizer only does a good job of estimating the selectivity of single-table ORed predicates when all disjuncts are tight constraints. Otherwise we give up and go with the default selectivity of 1/3 for the entire filter. If any of the disjuncts has a selectivity above 1/3, then the total selectivity should be at least as high as the selectivity of that disjunct. Any failure to find a tight constraint for a given disjunct should only cause a selectivity of 1/3 to be used for that disjunct, not the entire filter. This is addressed by using the method of combining ORed selectivities introduced by #74303: ``` Given predicates A and B and probability function P: P(A or B) = P(A) + P(B) - P(A and B) ``` The above formula is applied n-1 times, given n disjuncts. For each disjunct which is not a tight constraint, 1/3 is used for that predicate's selectivity in the formula. Release note (bug fix): This patch fixes incorrect selectivity estimation for queries with ORed predicates all referencing a common single table. 91040: cloud/amazon: support external ID in AWS assume role r=rhu713 a=rhu713 Support passing in the optional external ID when assuming a role. This is done by extending the values of the comma-separated string value of the ASSUME_ROLE parameter to the format `<role>;external_id=<id>`. Users can still use the previous format of just `<role>` to specify a role without any external ID. When using role chaining, each role in the chain can be associated with a different external ID. For example: `ASSUME_ROLE=<roleA>;external_id=<idA>,<roleB>;external_id=<idB>,<roleC>` will use external ID `<idA>` to assume delegate `<roleA>`, then use external ID `<idB>` to assume delegate `<roleB>`, and finally no external ID to assume the final role `<roleC>`. Additional documentation about external IDs can be found here: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html Release note (enterprise change): Support passing in the optional external ID when assuming a role. This is done by extending the values of the comma-separated string value of the ASSUME_ROLE parameter to the format `<role>;external_id=<id>`. Users can still use the previous format of just `<role>` to specify a role without any external ID. When using role chaining, each role in the chain can be associated with a different external ID. For example: `ASSUME_ROLE=<roleA>;external_id=<idA>,<roleB>;external_id=<idB>,<roleC>` will use external ID `<idA>` to assume delegate `<roleA>`, then use external ID `<idB>` to assume delegate `<roleB>`, and finally no external ID to assume the final role `<roleC>`. Addresses #90239 92341: changefeedccl: fix flakey tests r=[miretskiy] a=HonoreDB Our test SQL connection doesn't always read its own writes: it can use different connections for different queries. So ``` INSERT INTO foo; feed(CREATE CHANGEFEED FOR foo) ``` is a race condition. This just doesn't matter for most tests because if the insert happens after the initial scan, it creates the same payload. But tests with initial_scan='only' won't see it at all. Fixes #92211 Release note: None 92462: scripts: improve pprof-loop.sh r=andrewbaptist a=tbg Now it also works with the runtime trace endpoint (worked before but printed concerning-looking errors) and with non-blocking endpoints (cumulative profiles such as heap or mutex), for which it previously busy-looped. Epic: None Release note: None 92591: ui: add 99.9th and 99.99th SQL latency charts r=maryliag a=maryliag This commit introduces the charts for: `Service Latency: SQL Statements, 99.99th percentile` and `Service Latency: SQL Statements, 99.9th percentile` on Metrics page under SQL view. Fixes #74247 <img width="806" alt="Screen Shot 2022-11-28 at 12 23 56 PM" src="https://user-images.githubusercontent.com/1017486/204342077-b5509f51-f94e-44be-a2e8-8bd185d12ce6.png"> Release note (ui change): Added charts `Service Latency: SQL Statements, 99.9th percentile` and `Service Latency: SQL Statements, 99.99th percentile` to Metrics page, under SQL view. 92635: tree: fix internal error comparing tuple type with non-tuple type r=rytaft,mgartner a=msirek Fixes #91532 This fixes comparison of an empty tuple expression, ROW() or (), or a non-empty tuple expression with a non-tuple type by returning an error message instead of an internal error. Release note (bug fix): This fixes an internal error when comparing a tuple type with a non-tuple type. 92636: backupccl: add restore corrupt descriptor validation test r=adityamaru a=msbutler This patch adds a test that ensures that restore automatically detects corrupt restoring descriptors via the sql descriptor collection read/write machinery. Fixes #84757 Release note: None 92692: jobs: protected timestamps refresh logic is not transactional r=fqazi a=fqazi Fixes: #92427 Previously, the refresh logic for protected timestamps would fetch jobs with transactions before attempting to update an existing one. Due to a recent change to allow this API to reflect any changes into individual job objects, we no longer do that correctly. This patch fixes the protected timestamp manager to update timestamps in a transactionally, safe manner since multiple updates can be happening concurrently for schema changes. Release note: None Co-authored-by: Mark Sirek <sirek@cockroachlabs.com> Co-authored-by: Rui Hu <rui@cockroachlabs.com> Co-authored-by: Aaron Zinger <zinger@cockroachlabs.com> Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com> Co-authored-by: maryliag <marylia@cockroachlabs.com> Co-authored-by: Michael Butler <butler@cockroachlabs.com> Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-disaster-recovery
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
stability-period
Intended to be worked on during a stability period. Use with the Milestone field to specify version.
T-disaster-recovery
SQL Schema have a virtual table to check for invalid descriptors in a cluster. Restore and import are in the business of creating and ingesting descriptors, and so we should check the virtual table post restore/import for any invalid descriptor states they might have generated. This should be added to as many restore/import tests as possible, but also potentially as a sanity check during a restore/import once the descriptors have been published to an online state.
Jira issue: CRDB-17847
The text was updated successfully, but these errors were encountered: