CDC resolvedTs progress slowly when redo logging is enabled #8074

Closed
fubinzh opened this issue Jan 13, 2023 · 5 comments · Fixed by #8075
Labels
affects-6.1 affects-6.5 area/ticdc Issues or PRs related to TiCDC. severity/major type/bug The issue is confirmed as a bug.

Comments

fubinzh commented Jan 13, 2023

What did you do?

  • Create a TiDB cluster in GCP
  • Create a changefeed with redo enabled, using GCS as storage:
    [consistent]
    level = "eventual"
    storage = "gcs://qa-redo-log/redo-apply-multiple-cf0g92adj5hatumgaddg"
  • Run the tpcc workload (15:12 - ~16:30)
  • Pause the changefeed for about 1 hour, during which redo apply is run 2 times
  • Stop the workload (~16:30)
  • Resume the changefeed (~16:30)

What did you expect to see?

Changefeed checkpoint should advance normally.

What did you see instead?

  • The changefeed does not advance for an hour after it is resumed (the puller is still pulling)
  • After that, the sink starts to flush, but throughput is low (about 1024 rows/s)
    image

image

image

Versions of the cluster

TiCDC version (execute cdc version):

bash-5.1# /cdc version
Release Version: v6.5.0-11-g3808d9732-dirty
Git Commit Hash: 3808d9732201b004147628abf4914fd2c5dd0711
Git Branch: release-6.5
UTC Build Time: 2023-01-10 06:19:02
Go Version: go version go1.19.1 linux/amd64
Failpoint Build: false
bash-5.1#

PD/TiDB/TiKV are v6.5.0

fubinzh added the area/ticdc and type/bug labels on Jan 13, 2023

fubinzh commented Jan 13, 2023

cdc log: ticdc.log

fubinzh commented Jan 13, 2023

After the problem occurred, I ran three more tests:

  1. Update the changefeed to use local redo storage, then resume it => sink flush rate is still about 1024 rows/s
    image

  2. Enable the pull-based sink, resume the changefeed from the current time, and run the workload => sink flush rate increases to about 2000 rows/s
    image

  3. Update the changefeed to disable redo, then resume it => sink flush rate increases to about 20K rows/s
    image
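
To put the three flush rates above side by side, here is a minimal sketch that computes each rate relative to the first test. The numbers are the approximate figures quoted in this comment, not exact measurements, and the labels and the `speedup` helper are made up for illustration. Note that test 2 was resumed from the current time with a live workload, so it is not strictly comparable with the other two.

```python
# Approximate sink flush rates (rows/s) as quoted in the three tests above.
# The labels are illustrative, not TiCDC terminology.
rates = {
    "redo on, local storage": 1024,
    "redo on, pull-based sink": 2000,
    "redo off": 20_000,
}

# Use the first test as the baseline for comparison.
baseline = rates["redo on, local storage"]


def speedup(rate, base=baseline):
    """Return how many times faster `rate` is than `base`."""
    return rate / base


for name, rate in rates.items():
    print(f"{name}: ~{rate} rows/s, {speedup(rate):.1f}x baseline")
```

With these figures, disabling redo gives roughly a 20x higher flush rate than either redo-enabled configuration, which points at the redo path rather than the storage backend.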

fubinzh commented Jan 13, 2023

CPU & Memory Usage:
image

fubinzh commented Jan 13, 2023

/severity Major

fubinzh commented Jan 14, 2023

The issue also reproduces when CDC is configured to use SSD.

image

image

image

ticdc0.log
ticdc1.log

monitor backup: https://snapshots.raintank.io/dashboard/snapshot/h74XS86BjNHyHTvu3u6FkYzO4dTN0ArJ

CharlesCheung96 changed the title from "CDC redo log sink advance is slow" to "CDC barrierTs progress slowly when redo logging is enabled" on Jan 14, 2023
CharlesCheung96 changed the title from "CDC barrierTs progress slowly when redo logging is enabled" to "CDC resolvedTs progress slowly when redo logging is enabled" on Jan 15, 2023