"write to tikv with no leader returned" if we meet NotLeader which is caused by on-going schedule #43055

Closed
D3Hunter opened this issue Apr 14, 2023 · 4 comments · Fixed by #43079
Labels
affects-6.1 affects-6.5 component/lightning This issue is related to Lightning of TiDB. severity/moderate type/bug The issue is confirmed as a bug.

Comments

D3Hunter (Contributor) commented Apr 14, 2023

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

This is a corner case that is not easy to reproduce in a normal run, even with a very large amount of data. The following timeline may trigger it:

  • Before we pause the scheduler by key range for a region, there is an ongoing schedule, which is not interrupted by the pause.
  • Lightning starts the import but gets NotLeader from TiKV on ingest. The ongoing schedule changes the leader and the peer list at the same time, but Lightning only updates the leader in this case.
  • On the retried write, we then fail with this error:
    if leaderID == region.Peers[i].GetId() {
        leaderPeerMetas = resp.Metas

    if len(leaderPeerMetas) == 0 {
        log.FromContext(ctx).Warn("write to tikv no leader",
            logutil.Region(region), logutil.Leader(j.region.Leader),
            zap.Uint64("leader_id", leaderID), logutil.SSTMeta(meta),
            zap.Int64("kv_pairs", totalCount), zap.Int64("total_bytes", totalSize))
        return errors.Errorf("write to tikv with no leader returned, region '%d', leader: %d",
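
The timeline above can be illustrated with a small, self-contained Go sketch. It is not Lightning's code: `Peer`, `Region`, `writeToPeers`, `updateLeaderOnly` and the IDs are hypothetical stand-ins, and refreshing the whole region is shown only as one possible remedy, not necessarily what the fix in #43079 does.

```go
package main

import (
	"errors"
	"fmt"
)

// Peer and Region are simplified, hypothetical stand-ins for the real
// region metadata types; they are not the types Lightning uses.
type Peer struct {
	ID      uint64
	StoreID uint64
}

type Region struct {
	ID     uint64
	Leader Peer
	Peers  []Peer
}

var errNoLeader = errors.New("write to tikv with no leader returned")

// writeToPeers mimics the check in the excerpt: the write only succeeds
// if the leader ID matches one of the peers in the cached peer list.
func writeToPeers(r Region) error {
	for _, p := range r.Peers {
		if p.ID == r.Leader.ID {
			return nil
		}
	}
	return fmt.Errorf("%w, region '%d', leader: %d", errNoLeader, r.ID, r.Leader.ID)
}

// updateLeaderOnly models the behaviour described in the issue: after
// NotLeader, only the leader is replaced and the peer list stays stale.
func updateLeaderOnly(cached Region, newLeader Peer) Region {
	cached.Leader = newLeader
	return cached
}

func main() {
	cached := Region{
		ID:     1,
		Leader: Peer{ID: 1, StoreID: 1},
		Peers:  []Peer{{ID: 1, StoreID: 1}, {ID: 2, StoreID: 2}, {ID: 3, StoreID: 3}},
	}
	// Assumed outcome of the ongoing schedule: peer 1 was replaced by
	// peer 4, and peer 4 became the leader.
	latest := Region{
		ID:     1,
		Leader: Peer{ID: 4, StoreID: 4},
		Peers:  []Peer{{ID: 2, StoreID: 2}, {ID: 3, StoreID: 3}, {ID: 4, StoreID: 4}},
	}

	stale := updateLeaderOnly(cached, latest.Leader)
	fmt.Println("leader-only update:", writeToPeers(stale))
	// Using the full, current region info (leader and peers) avoids the mismatch.
	fmt.Println("full refresh:", writeToPeers(latest))
}
```

In this sketch the new leader (peer 4) is absent from the stale peer list, so no peer matches the leader ID and the write fails, which is the situation the excerpt's `len(leaderPeerMetas) == 0` check reports.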

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

D3Hunter added the type/bug The issue is confirmed as a bug. label Apr 14, 2023
D3Hunter (Contributor, Author) commented

cc @lance6716

D3Hunter (Contributor, Author) commented

5.4 doesn't have this problem, since writeAndIngestByRange always retries on error, even on a non-retriable one.
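
The difference between the two retry policies can be sketched as follows. This is a hedged illustration only: `retryAlways`, `retryIfRetriable`, `isRetriable` and `flakyWrite` are hypothetical names, and the sketch assumes that a later attempt succeeds (for example because region info is refreshed between attempts), which the comment above does not state explicitly.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoLeader stands in for the non-retriable "no leader returned" error.
var errNoLeader = errors.New("write to tikv with no leader returned")

// isRetriable is a hypothetical classifier: errNoLeader is treated as
// non-retriable, everything else as retriable.
func isRetriable(err error) bool {
	return !errors.Is(err, errNoLeader)
}

// retryAlways retries on any error, as the comment describes for 5.4.
func retryAlways(maxAttempts int, op func(attempt int) error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = op(attempt); err == nil {
			return nil
		}
	}
	return err
}

// retryIfRetriable gives up as soon as it sees a non-retriable error.
func retryIfRetriable(maxAttempts int, op func(attempt int) error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = op(attempt); err == nil {
			return nil
		}
		if !isRetriable(err) {
			return err
		}
	}
	return err
}

// flakyWrite fails once with errNoLeader and then succeeds, modelling the
// assumption that the next attempt sees up-to-date region info.
func flakyWrite(attempt int) error {
	if attempt == 0 {
		return errNoLeader
	}
	return nil
}

func main() {
	fmt.Println("always retry:", retryAlways(3, flakyWrite))
	fmt.Println("retry only if retriable:", retryIfRetriable(3, flakyWrite))
}
```

Under that assumption, the always-retry loop masks the stale-peer error, while the stricter loop surfaces it on the first attempt.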

lance6716 (Contributor) commented

"but got NotLeader from tikv"

Does Lightning get this error on write or on ingest?

D3Hunter (Contributor, Author) commented

On ingest. I've updated the description.

lance6716 self-assigned this Apr 14, 2023
niubell added the component/lightning This issue is related to Lightning of TiDB. label Apr 24, 2023