This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Default CF lost after BR restore? #565

Closed
YuJuncen opened this issue Oct 23, 2020 · 4 comments
Assignees: 3pointer
Labels: difficulty/3-hard, severity/moderate, type/bug

Comments

@YuJuncen
Collaborator

YuJuncen commented Oct 23, 2020

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?
    If possible, provide a recipe for reproducing the error.
    Run test br_full_ddl.

  2. What did you expect to see?
    The test case succeeds, or fails due to a network failure such as a context deadline being exceeded.

  3. What did you see instead?
    default not found (see the TiKV log excerpt under Notes below)

  4. What version of BR and TiDB/TiKV/PD are you using?

See the log.

Notes

Unfortunately, all BR logs for this test case were lost because the pod was deleted. Only two error lines remain in the TiKV 1 log:

[2020-10-23T00:17:00.659Z] [2020/10/23 08:16:59.632 +08:00] [ERROR] [mod.rs:311] ["default value not found"] [hint=near_load_data_by_write] [key=7480000000000000465F72017573657236323835FF3732363334313135FF3538323132333500FE]
[2020-10-23T00:17:00.659Z] [2020/10/23 08:16:59.632 +08:00] [WARN] [endpoint.rs:596] [error-response] [err="default not found: key:7480000000000000465F72017573657236323835FF3732363334313135FF3538323132333500FE, maybe read truncated/dropped table data?"]
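
The key in the log can be decoded by hand. Here is a minimal sketch in Go that re-implements the decoding instead of importing tidb/util/codec, assuming TiDB's standard key layout t{tableID}_r{handle} and its usual codec flags:

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

func main() {
	// The key from the TiKV error above.
	key, err := hex.DecodeString("7480000000000000465F72017573657236323835FF3732363334313135FF3538323132333500FE")
	if err != nil {
		panic(err)
	}

	// Layout: 't' + 8-byte table ID (sign bit flipped for memcomparable
	// ordering) + "_r" + encoded handle.
	tableID := int64(binary.BigEndian.Uint64(key[1:9]) ^ (1 << 63))

	// Handle type flag: 0x03 would be an int64 row ID; 0x01 is the bytes
	// flag used by common-handle (clustered-index) keys.
	rest := key[11:]
	flag := rest[0]
	rest = rest[1:]

	// Memcomparable bytes encoding: 8-byte groups, each followed by a
	// marker byte equal to 0xFF minus the number of padding bytes.
	var handle []byte
	for len(rest) >= 9 {
		pad := int(0xFF - rest[8])
		handle = append(handle, rest[:8-pad]...)
		rest = rest[9:]
	}

	fmt.Printf("table ID %d, flag %#x, handle %q\n", tableID, flag, handle)
	// Prints: table ID 70, flag 0x1, handle "user6285726341155821235"
}
```

So the missing key carries a string handle rather than an int64 row ID, i.e. it belongs to a table stored with a clustered (common-handle) primary key.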

We may have to wait for the next occurrence to gather more details. Currently we only know that it happens in the checksumming stage. That stage scans every row, and a write-CF record that does not inline its value points into the default CF, so "default not found" means such a pointer resolved to a key that was never restored.
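
As an aside, here is a minimal model of that lookup, with plain Go maps standing in for TiKV's actual Rust storage types, assuming the standard MVCC layout where the write CF either inlines a short value or references the default CF at (user_key, start_ts):

```go
package main

import "fmt"

// WriteRecord models the value stored in TiKV's write CF for one commit.
type WriteRecord struct {
	StartTS    uint64
	ShortValue []byte // inlined when the value is small; nil otherwise
}

func readValue(defaultCF map[string][]byte, userKey string, w WriteRecord) ([]byte, error) {
	if w.ShortValue != nil {
		return w.ShortValue, nil
	}
	// Long-value case: the row body must exist in the default CF at
	// (user_key, start_ts). If the restore never wrote that key, this is
	// exactly the "default value not found" error seen in the log above.
	v, ok := defaultCF[fmt.Sprintf("%s@%d", userKey, w.StartTS)]
	if !ok {
		return nil, fmt.Errorf("default not found: key %s", userKey)
	}
	return v, nil
}

func main() {
	defaultCF := map[string][]byte{} // empty: simulates missing default-CF data
	_, err := readValue(defaultCF, "t70_r...", WriteRecord{StartTS: 420000000000000001})
	fmt.Println(err) // default not found: key t70_r...
}
```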

Some guesses at the possible cause:

  • In the restore pipeline, a table that has not been fully restored is sent to checksum.
  • (TBD)

PD: d0430729845d309370d8d4604bda991fc64fc7f8
TiDB: 45b65d16eb3f51f6b9a2a0790b3b743dcf8b154f
TiKV: 417be27592712f3c752ec8e4c1d4520fe50aae5c

Logs: defaultnotfound.zip

YuJuncen added the type/bug and difficulty/3-hard labels on Oct 23, 2020
@3pointer
Collaborator

3pointer commented Nov 2, 2020

After reproducing it, we found that this issue can happen when the backup is taken with cluster_index disabled and the restore runs with cluster_index enabled.

We captured the backup SST files from the test and restored them into a new cluster with cluster_index enabled, reproducing the same error.
[Screenshot 2020-11-02 20:29:13]

With the same backup SST files, and cluster_index simply disabled in the same cluster,
[Screenshot 2020-11-02 20:31:43]

the restore completes normally.
[Screenshot 2020-11-02 20:32:00]

3pointer self-assigned this on Nov 4, 2020
@overvenus
Member

How to fix this issue?

@YuJuncen
Collaborator Author

Possible solution: back up the value of tidb_enable_clustered_index and, when creating tables during restore, set the variable accordingly.
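
A minimal sketch of that idea, assuming the restore side issues its DDL over a MySQL-protocol connection; the restoreTable helper and its clusteredIndex argument are hypothetical, not BR's real code:

```go
package main

import (
	"context"
	"database/sql"

	_ "github.com/go-sql-driver/mysql"
)

// restoreTable pins one connection so that the SET SESSION statement
// actually affects the CREATE TABLE that follows: database/sql may
// otherwise run the two statements on different pooled connections.
func restoreTable(ctx context.Context, db *sql.DB, clusteredIndex bool, createTableSQL string) error {
	conn, err := db.Conn(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	// clusteredIndex is the value recorded at backup time, so the restored
	// table gets the same row-key layout (int handle vs. common handle)
	// that the backed-up SSTs were written with.
	val := "0"
	if clusteredIndex {
		val = "1"
	}
	if _, err := conn.ExecContext(ctx, "SET SESSION tidb_enable_clustered_index = "+val); err != nil {
		return err
	}
	_, err = conn.ExecContext(ctx, createTableSQL)
	return err
}

func main() {
	db, err := sql.Open("mysql", "root@tcp(127.0.0.1:4000)/test") // TiDB's default port; adjust as needed
	if err != nil {
		panic(err)
	}
	defer db.Close()
	if err := restoreTable(context.Background(), db, true,
		"CREATE TABLE t (id VARCHAR(64) PRIMARY KEY, v BLOB)"); err != nil {
		panic(err)
	}
}
```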

But I am perplexed: we never touch this variable in the br_full_ddl test case, so why did this test fail with default not found...?

@3pointer
Collaborator

> Possible solution: back up the value of tidb_enable_clustered_index and, when creating tables during restore, set the variable accordingly.
>
> But I am perplexed: we never touch this variable in the br_full_ddl test case, so why did this test fail with default not found...?

...yes, that's strange. I guess it's some kind of CI problem. Anyway, I copied the backup data from the CI container, changed the cluster_index configuration, and the restore succeeded. So I think we can close this for now.
