roachtest: tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed #93133
Upgrade starts around
The job is resumed at
It fails immediately,
with the following stack trace,
The same migration step errors out on
However,
Note, the timestamps suggest the following job execution ordering
No other upgrade progress is made past
It appears we have hit an inconsistent state during schema migration. @postamar Could we get some assistance from your team to (dis)qualify this as a bug?
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ 8165e3974c10e88b6ae11c6255872ea16f3a67e3:
Parameters: |
Same issue as the previous run. This likely implies that the cause was a change merged into master on Dec 5.
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ ec095bc2fdbe4e518b076db20e4920fab67222bf:
Parameters: Same failure on other branches
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ ec095bc2fdbe4e518b076db20e4920fab67222bf:
Parameters: |
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ 24854994805cede37e6845ee2a94e10272b5506b:
Parameters: |
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ f2b00e8039af6ea8887ec124dad8daf19da6fbf1:
Parameters: |
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ c050c9b4b57ecf2ceb5d449c31c617fe12c920e0:
Parameters: |
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ 942a4d468e9c8ad0ef45a7be33f0a326dfb19fef:
Parameters: |
…_idx

The migration used a different column ordering than the descriptor in the bootstrap schema. The descriptor in the bootstrap schema is what determines whether the migration succeeded. In general, you can hit this bug if you upgrade from 22.1->22.2 and the migration starts creating the index but the node crashes before the index is fully created. In that case, the code will think that it's the wrong index. This should be rare, but would be problematic. Now we've made them match.

This change also fixes the roachtest which checks that the system schema looks correct, so that it checks what happens when you upgrade from a previous snapshot. The problem with the test was that it read the strings before they were assigned.

Fixes cockroachdb#93133

Release note (bug fix): Fixed a rare bug which could cause upgrades from 22.1 to 22.2 to fail if the job coordinator node crashes in the middle of a specific upgrade migration.
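To illustrate why a column-ordering mismatch makes the check fail, here is a minimal sketch of an order-sensitive index comparison. It is not the actual CockroachDB upgrade code: `indexSpec`, `sameIndex`, the index name `example_idx`, and the column names `col_a` and `col_b` are all hypothetical stand-ins chosen for this example.

```go
package main

import "fmt"

// indexSpec is a hypothetical, simplified stand-in for an index descriptor.
// It is not the real descriptor type used by the upgrade code.
type indexSpec struct {
	Name    string
	Columns []string // key columns, in index order
}

// sameIndex reports whether two specs describe the same index.
// The comparison is order-sensitive: the same columns listed in a
// different order count as a different index.
func sameIndex(a, b indexSpec) bool {
	if a.Name != b.Name || len(a.Columns) != len(b.Columns) {
		return false
	}
	for i := range a.Columns {
		if a.Columns[i] != b.Columns[i] {
			return false
		}
	}
	return true
}

func main() {
	// Assumed for illustration: the bootstrap schema and the migration
	// name the same columns but list them in a different order.
	bootstrap := indexSpec{Name: "example_idx", Columns: []string{"col_a", "col_b"}}
	migrated := indexSpec{Name: "example_idx", Columns: []string{"col_b", "col_a"}}

	fmt.Println(sameIndex(bootstrap, migrated))  // false: looks like the wrong index
	fmt.Println(sameIndex(bootstrap, bootstrap)) // true: definitions agree
}
```

Under these assumptions, a retried migration that finds the partially created index would compare it against the bootstrap definition, see a mismatch, and report an error instead of treating the step as already done. Making the two definitions agree removes that mismatch.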
The job was in fact going to be retried; this can be seen from the
That's because the job is |
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ a80652b2e4691ea76ea49e797b1b9e0998e1d61f:
Parameters: |
89442: build: delete `vendor` submodule r=knz,dt a=rickystewart

The `vendor` submodule is not really necessary for anything any more. For the Bazel build, we have mirrored all of our dependencies, so vendoring provides no additional value. `go` tooling is also generally happy to point to the module cache for e.g. go-to-definition. Dealing with the submodule is a pain, so it behooves us to get rid of it.

For `make` builds, the `vendor` directory will be synthesized automatically. It is now a `gitignore`'d directory. You can also still `make vendor_rebuild` if you want to force synthesizing the directory. Tooling can be updated to just not use `-mod=vendor` and the `go` module cache should be used in its place transparently.

Epic: None
Release note: None

92694: server, ui: add multitenant login/logout and tenant dropdown r=Santamaura a=Santamaura

ui, server: add multitenant login/logout and tenant dropdown

This patch enables login/logout for all tenants on the cluster by fanning out the incoming requests to each tenant server. Multitenant login introduces a new multitenant session cookie with the format <session>,<tenant_name>,<session2>,<tenant_name2> etc. The admin ui displays a dropdown with a list of tenants the user has successfully logged in to. Selecting a different tenant sets the tenant cookie to the selected tenant name and reloads the page. If the cluster is not multitenant, the dropdown will not display.

Release note (ui change): added a top-level dropdown on the admin ui which lists tenants the user has logged in to. If not multitenant, the dropdown is not displayed.

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-14546

93487: upgrades: fix upgrade to add statement_diagnostics_requests.completed… r=ajwerner a=ajwerner

…_idx

The migration used a different column ordering than the descriptor in the bootstrap schema. The descriptor in the bootstrap schema is what determines whether the migration succeeded.

In general, you can hit this bug if you upgrade from 22.1->22.2 and the migration starts creating the index but the node crashes before the index is fully created. In that case, the code will think that it's the wrong index. This should be rare, but would be problematic. Now we've made them match.

This change also augments the roachtest which checks that the system schema looks correct, so that it checks what happens when you upgrade from a previous snapshot. That matters here because the migration in question still exists on master, and is not idempotent. We should have found that, but didn't because we need multiple steps in the upgrade. We can get that pretty cheaply.

Fixes #93133

Release note (bug fix): Fixed a rare bug which could cause upgrades from 22.1 to 22.2 to fail if the job coordinator node crashes in the middle of a specific upgrade migration.

Co-authored-by: Ricky Stewart <rickybstewart@gmail.com>
Co-authored-by: Santamaura <alexsantamaura@gmail.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on master @ 146556e19f5e4fdc8c3e6a623b280cc33aee4d18:
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=true, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
Help
See: roachtest README
See: How To Investigate (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-22180