This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

cherry-pick (#377) to release-4.0 #413

Merged
merged 2 commits into from
Jul 13, 2020

Conversation

YuJuncen
Collaborator

What problem does this PR solve?

Currently, DDLs are sent to the TiDB cluster sequentially. If we are the DDL owner, that is fine: we can execute each DDL immediately and return quickly.

But usually we are not the owner, and probably cannot become it. Then things get bad: we must block until our DDL job has been pushed to the queue and executed by the owner before we can send the next DDL, even though we could push more DDLs into the DDL job queue while waiting.

This PR makes GoCreateTables send create-table jobs into the DDL queue concurrently.

What is changed and how it works?

We change GoCreateTables to use the following strategy to create tables:

  1. If a dbPool is provided, use this DB pool to execute DDLs concurrently.
  2. If not, fall back to the old behavior: send DDLs sequentially.

Check List

Tests

  • Integration test (br_300_small_tables)
  • Manual test
    We tested it locally with a workload of 300 tables, 100 records per table:
# $1 is the highest table index; make_values is a helper defined elsewhere (not shown here).
for i in $(seq 0 "$1"); do
    (echo "CREATE TABLE FOO.sbtest$i(k int primary key, v varchar(255), trailling varchar(1024))" | mysql -P "$MYSQL_PORT" -h "$MYSQL_SERVER" -u root &&
        echo "INSERT INTO FOO.sbtest$i(k, v, trailling) VALUES $(make_values 100)" | mysql -P "$MYSQL_PORT" -h "$MYSQL_SERVER" -u root)
done

With different concurrency settings, the results on my computer are:

test-result/1_concurrency
621.19 real        11.08 user         9.64 sys

test-result/4_concurrency
153.06 real         7.50 user         5.97 sys

test-result/8_concurrency
89.42 real         6.48 user         4.97 sys

test-result/16_concurrency
62.90 real         6.47 user         4.82 sys

test-result/baseline (master branch)
605.30 real        11.13 user         9.59 sys
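
To read these timings as speedups over the master baseline, the wall-clock ("real") values can be divided into the baseline time. The small program below does only that arithmetic on the numbers reported above; the `run` type and `speedup` function are names made up for this illustration. At concurrency 16, for example, the ratio comes out to roughly 9.6x.

```go
package main

import "fmt"

// run holds one measurement copied from the results above.
type run struct {
	name string
	real float64 // wall-clock seconds
}

// speedup returns how many times faster a run is than the baseline.
func speedup(baseline, real float64) float64 {
	return baseline / real
}

func main() {
	const baseline = 605.30 // master branch, wall-clock seconds

	runs := []run{
		{"1_concurrency", 621.19},
		{"4_concurrency", 153.06},
		{"8_concurrency", 89.42},
		{"16_concurrency", 62.90},
	}
	for _, r := range runs {
		fmt.Printf("%-16s %7.2fs  %.2fx\n", r.name, r.real, speedup(baseline, r.real))
	}
}
```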

Release Note

  • Speed up restore.

More Things

  • Currently, the concurrency of sending DDLs is the same as cfg.concurrency, which may sometimes be too large and cause many transaction conflicts. Since the execution time of a DDL does not depend much on the environment, maybe a fixed value (like 16 or 32) would be better?
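
One possible reading of that suggestion is to cap the DDL concurrency rather than replace it outright. The sketch below is only an assumption about how such a cap might look: `maxDDLConcurrency` and `ddlConcurrency` are hypothetical names, and the value 16 is just one of the two numbers floated above.

```go
package main

import "fmt"

// maxDDLConcurrency is a hypothetical upper bound; the PR only floats 16 or 32.
const maxDDLConcurrency = 16

// ddlConcurrency derives the number of concurrent DDL senders from the
// configured concurrency, never going below 1 or above the cap.
func ddlConcurrency(configured uint) uint {
	switch {
	case configured == 0:
		return 1
	case configured > maxDDLConcurrency:
		return maxDDLConcurrency
	default:
		return configured
	}
}

func main() {
	for _, c := range []uint{0, 4, 16, 128} {
		fmt.Printf("cfg.concurrency=%d -> DDL concurrency=%d\n", c, ddlConcurrency(c))
	}
}
```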

* task, restore: send DDLs parallelly

* restore: use moded id to index sessionpool

* store: return nil DB when no TiDB session provided

* *: use isolated DB instead of Session

* *: fix some mis-pushed files :|

* *: rename session pool to DB pool

* restore: rename some sessions to dbs

* restore: add some todos

Co-authored-by: 3pointer <luancheng@pingcap.com>
Collaborator

@kennytm kennytm left a comment

lgtm

@ti-srebot ti-srebot added the status/LGT1 LGTM1 label Jul 10, 2020
@ti-srebot
Contributor

@kennytm,Thanks for your review.

Collaborator

@3pointer 3pointer left a comment

LGTM

@ti-srebot ti-srebot removed the status/LGT1 LGTM1 label Jul 13, 2020
@ti-srebot ti-srebot added the status/LGT2 LGTM2 label Jul 13, 2020
@ti-srebot
Contributor

@3pointer,Thanks for your review.

@3pointer 3pointer merged commit 00ea20a into pingcap:release-4.0 Jul 13, 2020
4 participants