pkg(ticdc): add a faster raw string rendering version of GenUpdateSQL #8069

Merged 6 commits on Jan 18, 2023

Contributor

@amyangfei amyangfei commented Jan 12, 2023

What problem does this PR solve?

Issue Number: ref #8057
ref #8084

Task-2

What is changed and how it works?

  • Use raw string rendering to generate the UPDATE SQL.
  • This PR does not replace GenUpdateSQL with GenUpdateSQLFast; we will switch to GenUpdateSQLFast after more correctness checks and tests.
  • Benchmark results follow. With GenUpdateSQLFast, CPU time drops to roughly 10%-16% of GenUpdateSQL's, and memory allocations (allocs/op) drop to roughly 5%-11%. (In the charts, the horizontal axis denotes the batch size of RowChange.)
[Charts: CPU time; memory allocation]
go test -run='^$' -benchmem -bench '^(BenchmarkGenUpdate)$' github.com/pingcap/tiflow/pkg/sqlmodel
goos: darwin
goarch: arm64
pkg: github.com/pingcap/tiflow/pkg/sqlmodel
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch1-8              162788              7273 ns/op            8145 B/op        167 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch2-8              114633             10409 ns/op           10034 B/op        227 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch4-8               76413             15544 ns/op           13747 B/op        346 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch8-8               46393             25777 ns/op           20724 B/op        583 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch16-8              25616             46324 ns/op           33784 B/op       1056 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch32-8              13632             87381 ns/op           62977 B/op       2002 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch64-8               6699            169962 ns/op          128919 B/op       3893 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQL-Batch128-8              3602            331583 ns/op          246086 B/op       7671 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch1-8          914386              1224 ns/op            1712 B/op         20 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch2-8          789146              1451 ns/op            2168 B/op         23 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch4-8          615506              1810 ns/op            3160 B/op         29 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch8-8          439963              2655 ns/op            5056 B/op         41 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch16-8         229112              5016 ns/op           12570 B/op         67 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch32-8         123585              9502 ns/op           27069 B/op        117 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch64-8          61982             19472 ns/op           63620 B/op        216 allocs/op
BenchmarkGenUpdate/OneColumnPK-GenUpdateSQLFast-Batch128-8         32689             36418 ns/op          122018 B/op        410 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch1-8           159885              7308 ns/op            8145 B/op        167 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch2-8           114904             10370 ns/op           10034 B/op        227 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch4-8            77086             15542 ns/op           13747 B/op        346 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch8-8            46633             25813 ns/op           20724 B/op        583 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch16-8           25890             46306 ns/op           33784 B/op       1056 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch32-8           13792             87409 ns/op           62976 B/op       2002 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch64-8            6789            169864 ns/op          128918 B/op       3893 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQL-Batch128-8           3518            340506 ns/op          246086 B/op       7671 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch1-8       958459              1223 ns/op            1712 B/op         20 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch2-8       781336              1457 ns/op            2168 B/op         23 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch4-8       653800              1812 ns/op            3160 B/op         29 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch8-8       447306              2660 ns/op            5056 B/op         41 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch16-8      236116              5015 ns/op           12570 B/op         67 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch32-8      125460              9503 ns/op           27069 B/op        117 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch64-8       62330             19395 ns/op           63620 B/op        216 allocs/op
BenchmarkGenUpdate/MultiColumnsPK-GenUpdateSQLFast-Batch128-8      32689             36641 ns/op          122018 B/op        410 allocs/op
PASS
ok      github.com/pingcap/tiflow/pkg/sqlmodel  44.062s
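
The raw-string approach mentioned in the PR description can be illustrated with a minimal sketch. This is not the code from this PR: the `row` type, the `buildBatchUpdate` helper, and the exact `CASE WHEN ... IN (...)` statement shape are assumptions made for illustration only. The point is that the statement is appended piece by piece into a `strings.Builder`, with arguments collected alongside, which avoids building intermediate objects per row.

```go
package main

import (
	"fmt"
	"strings"
)

// row is a hypothetical stand-in for RowChange: one key value plus the new
// value of a single updated column.
type row struct {
	key    interface{}
	newVal interface{}
}

// buildBatchUpdate (hypothetical) renders one multi-row UPDATE statement by
// writing raw string fragments into a strings.Builder. The identifiers are
// assumed to be already quoted by the caller.
func buildBatchUpdate(table, keyCol, valCol string, rows []row) (string, []interface{}) {
	var b strings.Builder
	args := make([]interface{}, 0, len(rows)*3)

	b.WriteString("UPDATE ")
	b.WriteString(table)
	b.WriteString(" SET ")
	b.WriteString(valCol)
	b.WriteString(" = CASE")
	for _, r := range rows {
		// One WHEN branch per row: match on the key, assign the new value.
		b.WriteString(" WHEN ")
		b.WriteString(keyCol)
		b.WriteString(" = ? THEN ?")
		args = append(args, r.key, r.newVal)
	}
	b.WriteString(" END WHERE ")
	b.WriteString(keyCol)
	b.WriteString(" IN (")
	for i, r := range rows {
		if i > 0 {
			b.WriteString(", ")
		}
		b.WriteString("?")
		args = append(args, r.key)
	}
	b.WriteString(")")
	return b.String(), args
}

func main() {
	sql, args := buildBatchUpdate("`db`.`tb`", "`id`", "`name`", []row{{1, "a"}, {2, "b"}})
	fmt.Println(sql)
	fmt.Println(args)
}
```

Because every row contributes a fixed number of placeholders, the argument slice can be pre-sized, which is one plausible source of the lower allocs/op figures reported above.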

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

None

@amyangfei amyangfei added the area/ticdc Issues or PRs related to TiCDC. label Jan 12, 2023
Member

ti-chi-bot commented Jan 12, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • asddongmen
  • hi-rustin

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 12, 2023
@asddongmen asddongmen self-requested a review January 13, 2023 02:28
@Rustin170506
Member

/cc

// Input `changes` should have same target table and same columns for WHERE
// (typically same PK/NOT NULL UK), otherwise the behaviour is undefined.
// It is a faster version compared with GenUpdateSQL.
func GenUpdateSQLFast(changes ...*RowChange) (string, []interface{}) {
Contributor

Should the batch size be limited?

Contributor Author

@amyangfei amyangfei Jan 16, 2023

I have submitted a related issue, #8084; the limit will be added in another PR.
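
One way a batch-size limit could work is to split the input into bounded chunks before generating SQL for each chunk. The sketch below is only an illustration of that idea: the `rowChange` type, the `splitBatches` helper, and its `maxBatch` parameter are hypothetical, and how the limit is actually implemented for #8084 is not shown in this conversation.

```go
package main

import "fmt"

// rowChange is a hypothetical placeholder for the real RowChange type.
type rowChange struct{ id int }

// splitBatches (hypothetical) cuts changes into consecutive sub-slices of at
// most maxBatch elements. A non-positive maxBatch is treated as "no limit".
// The returned batches share the backing array of the input slice.
func splitBatches(changes []*rowChange, maxBatch int) [][]*rowChange {
	if len(changes) == 0 {
		return nil
	}
	if maxBatch <= 0 || maxBatch >= len(changes) {
		return [][]*rowChange{changes}
	}
	batches := make([][]*rowChange, 0, (len(changes)+maxBatch-1)/maxBatch)
	for start := 0; start < len(changes); start += maxBatch {
		end := start + maxBatch
		if end > len(changes) {
			end = len(changes)
		}
		batches = append(batches, changes[start:end])
	}
	return batches
}

func main() {
	changes := make([]*rowChange, 0, 5)
	for i := 0; i < 5; i++ {
		changes = append(changes, &rowChange{id: i})
	}
	for i, batch := range splitBatches(changes, 2) {
		fmt.Printf("batch %d has %d changes\n", i, len(batch))
	}
}
```

Each resulting batch could then be passed to the SQL generator separately, which keeps the statement length and the number of placeholders bounded.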

@amyangfei amyangfei added the status/ptal Could you please take a look? label Jan 17, 2023
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 18, 2023
Member

@Rustin170506 Rustin170506 left a comment

rest LGTM. Thanks!

pkg/sqlmodel/multirow.go (outdated)
// It is a faster version compared with GenUpdateSQL.
func GenUpdateSQLFast(changes ...*RowChange) (string, []interface{}) {
	if len(changes) == 0 {
		log.L().DPanic("row changes is empty")
Member

It seems we always use log.Panic() in TiCDC. Why do we need to get the current logger here? I'm not quite familiar with this API.

Contributor Author

It keeps the same DPanic usage as GenUpdateSQL, since this function is also used in DM. If we don't panic here (when not in development mode), row changes will be lost.
@lance6716 Is it OK to change to log.Panic() here for DM?

Contributor

@lance6716 lance6716 Jan 18, 2023

I think it's OK to use log.Panic.

@hi-rustin In DM, when the log level is debug, we turn on the Development flag.

tiflow/dm/pkg/log/log.go

Lines 107 to 118 in 171e21b

inDev := strings.ToLower(cfg.Level) == "debug"
// init DM logger
logger, props, err := pclog.InitLogger(&pclog.Config{
	Level:  cfg.Level,
	Format: cfg.Format,
	File: pclog.FileLogConfig{
		Filename:   cfg.File,
		MaxSize:    cfg.FileMaxSize,
		MaxDays:    cfg.FileMaxDays,
		MaxBackups: cfg.FileMaxBackups,
	},
	Development: inDev,

DM integration tests always use the debug log level, so we can detect misuse in development and limit the blast radius if the bug actually occurs on the user's side. But it's not important.
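
The difference discussed in this thread can be sketched without the real logging stack. The `miniLogger` type below is a hypothetical stand-in that only mimics the relevant behaviour of zap's `DPanic`: the message is always logged, but the call panics only when the logger is in development mode. As in the DM excerpt above, development mode is assumed to be derived from a debug log level.

```go
package main

import (
	"fmt"
	"strings"
)

// miniLogger is a hypothetical stand-in for a zap-style logger; it is not
// the pingcap/log or zap API, only an imitation of the DPanic semantics.
type miniLogger struct{ development bool }

// newLogger enables development mode when the level is "debug",
// mirroring the inDev computation in the excerpt above.
func newLogger(level string) miniLogger {
	return miniLogger{development: strings.ToLower(level) == "debug"}
}

// DPanic always logs, but panics only in development mode.
func (l miniLogger) DPanic(msg string) {
	fmt.Println("DPANIC:", msg)
	if l.development {
		panic(msg)
	}
}

// panics reports whether calling f results in a panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	for _, level := range []string{"debug", "info"} {
		l := newLogger(level)
		fmt.Println(level, "panics:", panics(func() { l.DPanic("row changes is empty") }))
	}
}
```

With a non-debug level the call returns normally, so the caller has to handle the empty input itself; an unconditional panic, as the PR ended up using, fails in every mode.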

Contributor Author

Changed to log.Panic()

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jan 18, 2023
@amyangfei
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 89b7b23

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 18, 2023
@ti-chi-bot
Member

@amyangfei: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

trigger some heavy tests which will not run always when PR updated.

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@amyangfei
Contributor Author

/run-engine-integration-test

@ti-chi-bot ti-chi-bot merged commit 7dddcd0 into pingcap:master Jan 18, 2023
@amyangfei amyangfei deleted the optimize-gen-update-3 branch April 3, 2023 07:51
Labels
area/ticdc Issues or PRs related to TiCDC. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/ptal Could you please take a look?
6 participants