TiDB Down Occurs When TiDB Sets tiflash Replica During Upgrade #57863
Comments
/severity major
/label affects-8.5
Stuck after Found Also found Lines 3717 to 3738 in 0e8ccbc
Found that the job was submitted by tidb-1 (the problem server) before the restart and was then executed by the upgraded tidb-1. Note that test server has
/assign @tangenta
Deadlock goroutines:
During the upgrade, the TiDB will start When
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
2. What did you expect to see? (Required)
The upgrade succeeds.
3. What did you see instead (Required)
[2024/12/02 10:14:20.198 +08:00] [ERROR] [advancer.go:400] ["listen task meet error, would reopen."] [error=EOF]
[2024/12/02 10:14:20.323 +08:00] [ERROR] [tso_client.go:366] ["[tso] update connection contexts failed"] [dc=global] [error="rpc error: code = Canceled desc = context canceled"]
[2024/12/02 10:19:37.115 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.217 +08:00] [ERROR] [controller.go:623] ["[resource group controller] token bucket rpc error"] [error="rpc error: code = Unavailable desc = not leader"]
[2024/12/02 10:19:37.220 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.271 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.324 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.375 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.427 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.531 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.573 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:37.634 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.738 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.842 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:37.970 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:19:38.037 +08:00] [ERROR] [client.go:252] ["[pd] request failed with a non-200 status"] [caller-id=pd-http-client] [name=GetMinResolvedTSByStoresIDs] [uri=/pd/api/v1/min-resolved-ts] [method=GET] [target-url=] [source=tikv-driver] [url=http://pd-1-peer:2379/pd/api/v1/min-resolved-ts] [status="500 Internal Server Error"] [body="[PD:apiutil:ErrRedirectToNotLeader]redirect to not leader"]
[2024/12/02 10:19:38.074 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]
[2024/12/02 10:19:38.074 +08:00] [ERROR] [tso_dispatcher.go:438] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-url=http://pd-1-peer:2379] [error="[PD:client:ErrClientGetTSO]get TSO failed, rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"]
[2024/12/02 10:21:09.936 +08:00] [ERROR] [http_status.go:531] ["start status/rpc server error"] [error="accept tcp [::]:10080: use of closed network connection"]
[2024/12/02 10:21:09.936 +08:00] [ERROR] [http_status.go:526] ["http server error"] [error="http: Server closed"]
[2024/12/02 10:21:09.936 +08:00] [ERROR] [http_status.go:521] ["grpc server error"] [error="mux: server closed"]
[2024/12/02 10:21:09.947 +08:00] [ERROR] [advancer.go:400] ["listen task meet error, would reopen."] [error=EOF]
[2024/12/02 10:21:21.774 +08:00] [ERROR] [session.go:3935] ["set system variable max_allowed_packet error"] [error="[variable:1621]SESSION variable 'max_allowed_packet' is read-only. Use SET GLOBAL to assign the value"]
[2024/12/02 10:21:25.985 +08:00] [ERROR] [ddl_tiflash_api.go:538] ["updating TiFlash replica status err"] [category=ddl] [error="context canceled"] [tableID=131] [isPartition=true]
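The log above is dominated by repeated PD client errors, which makes the single TiFlash-related entry at the end easy to miss. As a rough triage aid, the sketch below counts ERROR entries per source file and message. It is not part of TiDB or its tooling: the `LOG_LINE` pattern and the `summarize` helper are hypothetical names, and the pattern assumes only the leading `[time] [LEVEL] [file:line] ["message"]` layout visible in the lines above. The sample lines are copied from this report.

```python
import re
from collections import Counter

# Assumed layout of the leading fields, taken from the lines in this report:
# [timestamp] [LEVEL] [file.go:line] ["message"] ...
LOG_LINE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] '
    r'\[(?P<src>[^\]:]+):\d+\] \["(?P<msg>[^"]*)"\]'
)


def summarize(lines):
    """Count ERROR entries per (source file, message) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[(match.group("src"), match.group("msg"))] += 1
    return counts


# Three lines copied verbatim from the log in this report.
sample = [
    '[2024/12/02 10:19:37.220 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]',
    '[2024/12/02 10:19:37.324 +08:00] [ERROR] [pd_service_discovery.go:560] ["[pd] failed to update member"] [urls="[http://pd-1-peer:2379,http://pd-2-peer:2379,http://pd-3-peer:2379]"] [error="[PD:client:ErrClientGetMember]get member failed"]',
    '[2024/12/02 10:21:25.985 +08:00] [ERROR] [ddl_tiflash_api.go:538] ["updating TiFlash replica status err"] [category=ddl] [error="context canceled"] [tableID=131] [isPartition=true]',
]

# Print the most frequent (file, message) pairs first.
for (src, msg), n in summarize(sample).most_common():
    print(f"{n:4d}  {src}  {msg}")
```

To run it over a full log file, pass an open file handle instead of `sample`, for example `summarize(open("tidb.log"))`, where `tidb.log` is a placeholder path.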
4. What is your TiDB version? (Required)
v8.4.0