statistics: ease the impact of stats feedback on cluster (#15503) #18769

ti-srebot · 2020-07-24T08:39:15Z

cherry-pick #15503 to release-2.1

What problem does this PR solve?

Problem Summary:

Statistics feedback imposes a periodic read/write burden on the cluster. Each TiDB instance dumps the feedback it has collected into TiKV every 10 minutes, and the stats owner TiDB instance reads the dumped feedback every 15 seconds. If the stats owner finds new feedback in TiKV, it merges that feedback with the statistics in its cache, then dumps all of the updated statistics back into TiKV. This dump is heavy when there is a lot of feedback on many columns/indexes, since it amounts to a lightweight ANALYZE on many tables.

What is changed and how it works?

What's Changed:

First, reduce the amount of feedback generated on each TiDB instance by:

decreasing the default MaxQueryFeedbackCount;
discarding feedback whose error rate or scanned row count is too small;
discarding feedback whose ranges overlap on the same index/column.
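The two discard rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Feedback` struct, `errorRate`, `filterFeedback`, and the threshold values are all hypothetical names and numbers chosen for the example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Feedback is a hypothetical, simplified feedback record for one scanned
// range on a single column or index. It is not TiDB's real type.
type Feedback struct {
	Lower, Upper int64   // scanned range [Lower, Upper)
	Expected     float64 // row count estimated from statistics
	Actual       float64 // row count actually scanned
}

// errorRate returns the relative estimation error of one feedback record.
func errorRate(fb Feedback) float64 {
	return math.Abs(fb.Expected-fb.Actual) / math.Max(fb.Actual, 1)
}

// filterFeedback drops records that scanned too few rows or whose estimate
// was already accurate, then drops records whose range overlaps a record
// that has already been kept. Thresholds are assumptions of this sketch.
func filterFeedback(fbs []Feedback, minErrorRate, minRows float64) []Feedback {
	kept := make([]Feedback, 0, len(fbs))
	for _, fb := range fbs {
		if fb.Actual < minRows || errorRate(fb) < minErrorRate {
			continue
		}
		kept = append(kept, fb)
	}
	// Sort by lower bound so an overlap can be detected against the
	// most recently kept range only.
	sort.SliceStable(kept, func(i, j int) bool { return kept[i].Lower < kept[j].Lower })
	result := kept[:0]
	for _, fb := range kept {
		if n := len(result); n > 0 && fb.Lower < result[n-1].Upper {
			continue // overlaps the previously kept range
		}
		result = append(result, fb)
	}
	return result
}

func main() {
	fbs := []Feedback{
		{Lower: 0, Upper: 10, Expected: 100, Actual: 100}, // accurate estimate
		{Lower: 10, Upper: 20, Expected: 50, Actual: 200},
		{Lower: 15, Upper: 30, Expected: 10, Actual: 300}, // overlaps [10, 20)
		{Lower: 30, Upper: 40, Expected: 1, Actual: 2},    // too few rows
	}
	fmt.Println(filterFeedback(fbs, 0.5, 10))
}
```

Keeping the first range in sorted order and dropping later overlapping ones is only one possible policy; the point is that each rule shrinks the volume of feedback before it is ever dumped to TiKV.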

~~Second, merge multiple insert/update/delete statements of dumping statistics into single ones, to reduce the unnecessary function call stacks and RPCs.~~

How it Works:

The first change throttles the statistics feedback mechanism at its source, but in theory we may lose some of the stats accuracy that feedback provides. The second change combines several small transactions into one big transaction. Since the first change already limits the amount of feedback, I expect this bigger transaction not to be a problem. However, a bigger transaction has a higher chance of write conflicts, and it makes the code harder to read, so I have not yet decided whether to keep it.
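To make the second idea concrete, here is a hedged sketch of folding many single-row inserts into one multi-row statement. It only illustrates the batching idea described above and is not the code from this PR; the `bucketRow` type, the `buildBatchInsert` helper, and the `stats_buckets_demo` table and its columns are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// bucketRow is a hypothetical, simplified histogram bucket row.
type bucketRow struct {
	TableID, HistID, BucketID, Count int64
}

// buildBatchInsert builds one multi-row INSERT with placeholders instead of
// one statement per row, so a single round trip carries every row.
func buildBatchInsert(table string, rows []bucketRow) (string, []interface{}) {
	if len(rows) == 0 {
		return "", nil
	}
	placeholders := make([]string, 0, len(rows))
	args := make([]interface{}, 0, len(rows)*4)
	for _, r := range rows {
		placeholders = append(placeholders, "(?, ?, ?, ?)")
		args = append(args, r.TableID, r.HistID, r.BucketID, r.Count)
	}
	sql := fmt.Sprintf(
		"INSERT INTO %s (table_id, hist_id, bucket_id, count) VALUES %s",
		table, strings.Join(placeholders, ", "),
	)
	return sql, args
}

func main() {
	rows := []bucketRow{
		{TableID: 1, HistID: 1, BucketID: 0, Count: 10},
		{TableID: 1, HistID: 1, BucketID: 1, Count: 25},
	}
	sql, args := buildBatchInsert("stats_buckets_demo", rows)
	fmt.Println(sql)
	fmt.Println(len(args), "arguments")
}
```

The trade-off is the one noted above: fewer statements and RPCs, but every row now commits or conflicts together.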

Related changes

Need to cherry-pick to the release branch: if this PR proves effective in experiments, we should apply it to the release branches.

Check List

Tests

Unit test
Manual test (add detailed scripts or steps below): perf test result

Side effects

Performance regression
- A possible loss of stats accuracy may cause query performance regression.

Release note

Ease the impact of stats feedback on cluster

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

ti-srebot · 2020-07-24T08:39:17Z

/run-all-tests

ti-srebot · 2020-07-24T08:39:24Z

@eurekaka please accept the invitation so that you can push to the cherry-pick pull requests.
https://github.com/ti-srebot/tidb/invitations

eurekaka · 2020-07-24T08:57:23Z

release-2.1 has too many conflicts with master / 4.0, so I am closing this PR.

cherry pick pingcap#15503 to release-2.1

57bddbc

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

ti-srebot mentioned this pull request Jul 24, 2020

statistics: ease the impact of stats feedback on cluster #15503

Merged

ti-srebot added the sig/execution, component/statistics, epic/query-feedback-GA, type/2.1-cherry-pick, and type/enhancement labels Jul 24, 2020

ti-srebot requested review from lzmhhh123, qw4990, winoros and zz-jason July 24, 2020 08:39

ti-srebot assigned eurekaka Jul 24, 2020

github-actions bot added the component/config label Jul 24, 2020

eurekaka closed this Jul 24, 2020

eurekaka deleted the release-2.1-a99fdc098cb3 branch July 24, 2020 08:57
