
Compatible issue when using lightning + SHARD_ROW_ID_BITS + AUTO_ID_CACHE=1 #52654

Closed
jackysp opened this issue Apr 17, 2024 · 7 comments · Fixed by #52712
Assignees: D3Hunter
Labels: affects-6.5, affects-7.1, affects-7.5, affects-8.1, component/lightning, severity/major, sig/sql-infra, type/bug

Comments

jackysp (Member) commented Apr 17, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Create the table and insert a row with a large explicit id:
CREATE TABLE t (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id) /*T![clustered_index] NONCLUSTERED */
) /*T![auto_id_cache] AUTO_ID_CACHE=1 */ /*T! SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=3 */;

INSERT INTO t VALUES (1778961125641936898);
  2. Dump the data with Dumpling, drop table t, then use Lightning physical mode to import the data.

  3. INSERT INTO t VALUES ();

2. What did you expect to see? (Required)

No error, because we can continue inserting rows like this:

CREATE TABLE t (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id) /*T![clustered_index] NONCLUSTERED */
) /*T![auto_id_cache] AUTO_ID_CACHE=1 */ /*T! SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=3 */;

INSERT INTO t VALUES (1778961125641936898);
INSERT INTO t VALUES ();

3. What did you see instead (Required)

mysql> insert into t values ();
ERROR 1467 (HY000): Failed to read auto-increment value from storage engine

4. What is your TiDB version? (Required)

5bb8ed7

jackysp added the type/bug label Apr 17, 2024
jackysp (Member, Author) commented Apr 17, 2024

If we don't use AUTO_ID_CACHE = 1, we will get

CREATE TABLE t (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id) /*T![clustered_index] NONCLUSTERED */
) /*T! SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=3 */;

INSERT INTO t VALUES (1778961125641936898); -- ERROR 1467 (HY000): Failed to read auto-increment value from storage engine 

which makes sense.

tiancaiamao (Contributor) commented Apr 17, 2024

OK, I see how the bug comes in.

After we allocate an ID, we check it for overflow:

base, maxID, err := t.Allocators(mctx).Get(autoid.RowIDAllocType).Alloc(ctx, n, 1, 1)
if err != nil {
	return 0, 0, err
}
if meta.ShardRowIDBits > 0 {
	shardFmt := autoid.NewShardIDFormat(types.NewFieldType(mysql.TypeLonglong), meta.ShardRowIDBits, autoid.RowIDBitLength)
	// Use max record ShardRowIDBits to check overflow.
	if OverflowShardBits(maxID, meta.MaxShardRowIDBits, autoid.RowIDBitLength, true) {
		// If overflow, the rowID may be duplicated. For examples,
		// t.meta.ShardRowIDBits = 4
		// rowID = 0010111111111111111111111111111111111111111111111111111111111111
		// shard = 0100000000000000000000000000000000000000000000000000000000000000
		// will be duplicated with:
		// rowID = 0100111111111111111111111111111111111111111111111111111111111111
		// shard = 0010000000000000000000000000000000000000000000000000000000000000
		return 0, 0, autoid.ErrAutoincReadFailed
	}
	shard := mctx.GetSessionVars().GetCurrentShard(int(n))
	base = shardFmt.Compose(shard, base)
	maxID = shardFmt.Compose(shard, maxID)
}

But this code only checks the row id, not the auto-increment id... In the old code, the row id and the auto_increment id shared the same allocator.

After #39041, the row id and the auto_increment id were separated, so the overflow check no longer takes effect in AUTO_ID_CACHE=1 cases.
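
To make the bit arithmetic concrete, here is a minimal standalone sketch (not TiDB's actual OverflowShardBits; it assumes a 64-bit row id with one reserved sign bit and SHARD_ROW_ID_BITS=4, leaving 59 bits for the incremental part):

package main

import "fmt"

// overflowsShardBits reports whether id no longer fits in the incremental
// part of a 64-bit row id once shardBits (plus one sign bit) are reserved.
func overflowsShardBits(id int64, shardBits uint) bool {
	incrementalBits := 64 - shardBits - 1 // one bit reserved for the sign
	maxIncrement := int64(1)<<incrementalBits - 1
	return id > maxIncrement
}

func main() {
	// With SHARD_ROW_ID_BITS=4 the incremental part has 59 bits,
	// i.e. a maximum of 2^59-1 = 576460752303423487.
	fmt.Println(overflowsShardBits(576460752303423487, 4))  // false
	// The value from the bug report exceeds that capacity, so the shared
	// row-id allocator rejects it with ERROR 1467.
	fmt.Println(overflowsShardBits(1778961125641936898, 4)) // true
}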

tiancaiamao (Contributor) commented Apr 17, 2024

Creating a table without AUTO_ID_CACHE=1 and inserting 1778961125641936898 reports an error; creating a table with AUTO_ID_CACHE=1 and inserting 1778961125641936898 does not. That is the expected behaviour.

When AUTO_ID_CACHE=1 is used, the row id and the auto-increment id use different allocators. SHARD_ROW_ID_BITS works on the row id bits, so it does not affect the auto-increment id and does not cause the overflow.
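
As a tiny illustration of the compose step (same layout assumptions as the sketch above, not TiDB's actual ShardIDFormat.Compose): the shard only occupies the high bits of _tidb_rowid, so a large explicit value in the AUTO_INCREMENT column never enters this calculation once that column has its own allocator.

package main

import "fmt"

// composeRowID puts the shard into the 4 bits below the sign bit and the
// row-id allocator's incremental value into the remaining 59 bits.
func composeRowID(shard, increment int64) int64 {
	const incrementalBits = 59 // 64 - 4 shard bits - 1 sign bit
	return shard<<incrementalBits | increment
}

func main() {
	// With increment 1 (the first row id the separate allocator hands out)
	// and shard 5, this reproduces the _tidb_rowid shown in the next comment.
	fmt.Println(composeRowID(5, 1)) // 2882303761517117441
}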

tiancaiamao (Contributor) commented Apr 17, 2024

So the problem is in steps 2 and 3.

When we dump and re-import the database with Lightning, the error occurs.
I'm not sure how _tidb_rowid is handled during that process.

CREATE TABLE t (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id) /*T![clustered_index] NONCLUSTERED */
) /*T![auto_id_cache] AUTO_ID_CACHE=1 */ /*T! SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=3 */;

INSERT INTO t VALUES (1778961125641936898);

mysql> select _tidb_rowid, id from t;
+---------------------+---------------------+
| _tidb_rowid         | id                  |
+---------------------+---------------------+
| 2882303761517117441 | 1778961125641936898 |
+---------------------+---------------------+
1 row in set (0.00 sec)

After dumping and importing this database with Lightning, the new value is:

mysql> select _tidb_rowid, id from t;
+---------------------+---------------------+
| _tidb_rowid         | id                  |
+---------------------+---------------------+
| 1152921504606846977 | 1778961125641936898 |
+---------------------+---------------------+
1 row in set (0.01 sec)

As you can see the _tidb_rowid changed.
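
For reference, a small sketch that splits the two observed _tidb_rowid values into their shard and incremental parts, assuming the 1 sign bit + 4 shard bits + 59 increment bits layout implied by SHARD_ROW_ID_BITS=4:

package main

import "fmt"

// splitRowID splits a _tidb_rowid into shard and incremental parts under
// the assumed layout: 1 sign bit, shardBits shard bits, the rest increment.
func splitRowID(rowID int64, shardBits uint) (shard, increment int64) {
	incrementalBits := 64 - shardBits - 1
	shard = (rowID >> incrementalBits) & (int64(1)<<shardBits - 1)
	increment = rowID & (int64(1)<<incrementalBits - 1)
	return shard, increment
}

func main() {
	for _, id := range []int64{
		2882303761517117441, // _tidb_rowid before the dump/import
		1152921504606846977, // _tidb_rowid after the Lightning import
	} {
		shard, inc := splitRowID(id, 4)
		fmt.Printf("%d -> shard=%d increment=%d\n", id, shard, inc)
	}
	// Prints shard=5 increment=1 for the first value and shard=2 increment=1
	// for the second: the shard changed, but the incremental part is 1 in both.
}
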
The next insert would fail with this call stack:

goroutine 798 [running]:
runtime/debug.Stack()
        /home/genius/project/go/src/runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
        /home/genius/project/go/src/runtime/debug/stack.go:16 +0x13
github.com/pingcap/tidb/table/tables.allocHandleIDs({0x5cc3000, 0xc005a484e0}, {0x5d34670, 0xc0010eef00}, {0x5cf05a8, 0xc0048203c0}, 0x0?)
        /home/genius/project/src/github.com/pingcap/tidb/table/tables/tables.go:1656 +0x1ce
github.com/pingcap/tidb/table/tables.(*TableCommon).AddRecord(0xc0048203c0, {0x5d34670, 0xc0010eef00}, {0xc005a4a730, 0x1, 0x1}, {0xc0059d03e0, 0x2, 0xc0019e9cb0?})
        /home/genius/project/src/github.com/pingcap/tidb/table/tables/tables.go:841 +0x6b6
github.com/pingcap/tidb/executor.(*InsertValues).addRecordWithAutoIDHint(0xc0050bd180, {0x5cc3000?, 0xc005a484e0}, {0xc005a4a730, 0x1, 0x1}, 0x1)
        /home/genius/project/src/github.com/pingcap/tidb/executor/insert_common.go:1398 +0x19f
github.com/pingcap/tidb/executor.(*InsertExec).exec(0xc005a42660, {0x5cc3000, 0xc005a484e0}, {0xc005794fa8?, 0x1, 0x1})
        /home/genius/project/src/github.com/pingcap/tidb/executor/insert.go:104 +0x6b5
github.com/pingcap/tidb/executor.insertRows({0x5cc3000, 0xc005a484e0}, {0x5ca1558, 0xc005a42660})
        /home/genius/project/src/github.com/pingcap/tidb/executor/insert_common.go:284 +0x3d4
github.com/pingcap/tidb/executor.(*InsertExec).Next(0xc005a42660, {0x5cc3000?, 0xc005a483f0?}, 0x1?)
        /home/genius/project/src/github.com/pingcap/tidb/executor/insert.go:306 +0x1d4
github.com/pingcap/tidb/executor.Next({0x5cc3000, 0xc005a483f0}, {0x5cc5da0, 0xc005a42660}, 0xc0042f3ae0)
        /home/genius/project/src/github.com/pingcap/tidb/executor/executor.go:326 +0x299
github.com/pingcap/tidb/executor.(*ExecStmt).next(0xc005462ff0, {0x5cc3000, 0xc005a483f0}, {0x5cc5da0, 0xc005a42660}, 0x4ed77c0?)
        /home/genius/project/src/github.com/pingcap/tidb/executor/adapter.go:1202 +0x6e
github.com/pingcap/tidb/executor.(*ExecStmt).handleNoDelayExecutor(0xc005462ff0, {0x5cc3000?, 0xc005a483f0?}, {0x5cc5da0?, 0xc005a42660})
        /home/genius/project/src/github.com/pingcap/tidb/executor/adapter.go:947 +0x396
github.com/pingcap/tidb/executor.(*ExecStmt).handleNoDelay(0xc005462ff0, {0x5cc3000, 0xc005a483f0}, {0x5cc5da0?, 0xc005a42660?}, 0x0)
        /home/genius/project/src/github.com/pingcap/tidb/executor/adapter.go:773 +0x252
github.com/pingcap/tidb/executor.(*ExecStmt).Exec(0xc005462ff0, {0x5cc3000, 0xc005a483f0})
        /home/genius/project/src/github.com/pingcap/tidb/executor/adapter.go:568 +0xbc7
github.com/pingcap/tidb/session.runStmt({0x5cc3000?, 0xc005770b70?}, 0xc0010eef00, {0x5cd2fe0, 0xc005462ff0?})
        /home/genius/project/src/github.com/pingcap/tidb/session/session.go:2411 +0x52a
github.com/pingcap/tidb/session.(*session).ExecuteStmt(0xc0010eef00, {0x5cc3000?, 0xc005770b70?}, {0x5cd6f80?, 0xc0057ba200?})
        /home/genius/project/src/github.com/pingcap/tidb/session/session.go:2260 +0x1090
github.com/pingcap/tidb/server.(*TiDBContext).ExecuteStmt(0xc00015d6b0, {0x5cc3000, 0xc005770b70}, {0x5cd6f80?, 0xc0057ba200})
        /home/genius/project/src/github.com/pingcap/tidb/server/driver_tidb.go:294 +0xa7
github.com/pingcap/tidb/server.(*clientConn).handleStmt(0xc001b8b040, {0x5cc3038, 0xc0059701e0}, {0x5cd6f80, 0xc0057ba200}, {0x0, 0x0, 0x0}, 0x1)
        /home/genius/project/src/github.com/pingcap/tidb/server/conn.go:2107 +0x153
github.com/pingcap/tidb/server.(*clientConn).handleQuery(0xc001b8b040, {0x5cc3038, 0xc0059701e0}, {0xc005974001, 0x2a})
        /home/genius/project/src/github.com/pingcap/tidb/server/conn.go:1898 +0x9a5
github.com/pingcap/tidb/server.(*clientConn).dispatch(0xc001b8b040, {0x5cc3000?, 0xc0018810b0?}, {0xc005974000, 0x2b, 0x2b})
        /home/genius/project/src/github.com/pingcap/tidb/server/conn.go:1385 +0x1035
github.com/pingcap/tidb/server.(*clientConn).Run(0xc001b8b040, {0x5cc3000, 0xc0018810b0})
        /home/genius/project/src/github.com/pingcap/tidb/server/conn.go:1166 +0x28e
github.com/pingcap/tidb/server.(*Server).onConn(0xc001d8e000?, 0xc001b8b040)
        /home/genius/project/src/github.com/pingcap/tidb/server/server.go:677 +0x7e5
created by github.com/pingcap/tidb/server.(*Server).startNetworkListener in goroutine 753
        /home/genius/project/src/github.com/pingcap/tidb/server/server.go:491 +0x77f
[2024/04/17 12:08:38.893 +08:00] [INFO] [tidb.go:285] ["rollbackTxn called due to ddl/autocommit failure"]

tiancaiamao added the component/dumpling label Apr 17, 2024
jackysp (Member, Author) commented Apr 17, 2024

@tiancaiamao Dumpling doesn't export _tidb_rowid; _tidb_rowid is generated by Lightning. I think it is OK if the value changes after importing.

jackysp (Member, Author) commented Apr 17, 2024

err = alloc.Rebase(ctx, newBase, false)
In the Lightning log, the two allocators are rebased to the same value. Just changing the second base to the row id should fix it.

RebaseGlobalAutoID 1778961125641936900 &{test.t 0xc000d288a0 0xc000952280 0xc000bf8300 0xc0006668c0 {false [0xc000ed87a0 0xc000ed87c0 0xc000ed87 e0]} {0xc001246c40} 0xc0001efb00 0xc00100b180 map[]} 2 &{100 t utf8mb4 utf8mb4_bin [0xc000de89a0] [0xc000f894d0] [] [] public false false 0 1778961125641936899 0 1 0 1 1 0 0 449145916100968471 0 4 4 0 0 3 5 false disable }
RebaseGlobalAutoID 1778961125641936900 &{test.t 0xc000d288a0 0xc000952280 0xc000bf8300 0xc0006668c0 {false [0xc000ed87a0 0xc000ed87c0 0xc000ed87 e0]} {0xc001246c40} 0xc0001efb00 0xc00100b180 map[]} 2 &{100 t utf8mb4 utf8mb4_bin [0xc000de89a0] [0xc000f894d0] [] [] public false false 0 1778961125641936899 0 1 0 1 1 0 0 449145916100968471 0 4 4 0 0 3 5 false disable }

see #46171
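
To connect the log above to the failure, under the same bit-layout assumptions as the earlier sketches: the rebased base already exceeds what the row-id allocator can represent in its 59-bit incremental part, so the very next allocation trips the overflow check.

package main

import "fmt"

func main() {
	const rebasedBase = 1778961125641936900 // base from the RebaseGlobalAutoID log lines above
	const maxIncrement = int64(1)<<59 - 1   // 576460752303423487: capacity with 4 shard bits + 1 sign bit
	// The rebased row-id base already exceeds the incremental capacity, so the
	// next INSERT INTO t VALUES () trips the overflow check with ERROR 1467.
	fmt.Println(rebasedBase > maxIncrement) // true
}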

jackysp added the component/lightning label and removed the component/dumpling label Apr 17, 2024
D3Hunter (Contributor) commented Apr 17, 2024

It's not related to AUTO_ID_CACHE=1, actually. The cause is that Lightning uses a shared base for all allocators, and the value of sepAutoInc is always false, which doesn't take AUTO_ID_CACHE=1 into account. It can cause this issue when:

  • a table with a NONCLUSTERED PK + AUTO_INCREMENT + SHARD_ROW_ID_BITS has a pk greater than the max allowed row_id increment part
  • a table with a NONCLUSTERED PK + AUTO_RANDOM + SHARD_ROW_ID_BITS has the increment part of some pk greater than the max allowed row_id increment part

sharedBase := &base
return autoid.NewAllocators(
	false,
	&panickingAllocator{base: sharedBase, ty: autoid.RowIDAllocType},
	&panickingAllocator{base: sharedBase, ty: autoid.AutoIncrementType},
	&panickingAllocator{base: sharedBase, ty: autoid.AutoRandomType},
)
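
As a minimal standalone sketch of why the shared base conflates the id spaces (hypothetical allocator type, not Lightning's actual panickingAllocator): rebasing any one allocator moves the base seen by all of them, so the row-id allocator inherits the large AUTO_INCREMENT base even though its incremental space is smaller.

package main

import "fmt"

// allocator is a hypothetical stand-in for Lightning's panickingAllocator:
// it only tracks a base, and several instances may alias the same *int64.
type allocator struct {
	name string
	base *int64
}

func (a *allocator) rebase(newBase int64) {
	if newBase > *a.base {
		*a.base = newBase
	}
}

func main() {
	var base int64
	sharedBase := &base
	rowID := &allocator{name: "row-id", base: sharedBase}
	autoInc := &allocator{name: "auto-increment", base: sharedBase}

	// Rebasing the auto-increment allocator to the max imported id also
	// moves the row-id allocator, because both alias the same base.
	autoInc.rebase(1778961125641936900)
	fmt.Println(rowID.name, *rowID.base)     // row-id 1778961125641936900
	fmt.Println(autoInc.name, *autoInc.base) // auto-increment 1778961125641936900
}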

D3Hunter changed the title from "Compatible issue when using lightning + SHARD_ROW_ID_BITS + AUTO_ID_CACHE=1" to "_tidb_rowid overflow when write to a table with nonclustered pk + SHARD_ROW_ID_BITS + large auto PK" Apr 17, 2024
D3Hunter changed the title from "_tidb_rowid overflow when write to a table with nonclustered pk + SHARD_ROW_ID_BITS + large auto PK" to "_tidb_rowid overflow when write to a lightning imported table with nonclustered pk + SHARD_ROW_ID_BITS + large auto PK" Apr 17, 2024
D3Hunter changed the title from "_tidb_rowid overflow when write to a lightning imported table with nonclustered pk + SHARD_ROW_ID_BITS + large auto PK" to "Compatible issue when using lightning + SHARD_ROW_ID_BITS + AUTO_ID_CACHE=1" Apr 17, 2024
D3Hunter removed the may-affects-5.4, may-affects-6.1, and may-affects-6.5 labels Apr 18, 2024
D3Hunter self-assigned this Apr 18, 2024
ti-chi-bot pushed commits that referenced this issue on Apr 18 (twice), May 21, and Jun 4, 2024