storage: Replica gets GC'ed immediately after preemptive snapshot gets applied #17198

Closed
a-robinson opened this issue Jul 24, 2017 · 4 comments
@a-robinson (Contributor)

On node 4 (the leaseholder):

I170724 20:12:25.899977 383 storage/replica_raftstorage.go:511  [replicate,n4,s4,r24/4:/Table/53{-/1/-4621…}] generated preemptive snapshot 2a6aed34 at index 1063935
...
I170724 20:12:44.209108 383 storage/store.go:3506  [replicate,n4,s4,r24/4:/Table/53{-/1/-4621…}] streamed snapshot to (n3,s3):?: kv pairs: 383945, log entries: 3946, rate-limit: 2.0 MiB/sec, 18309ms
...
E170724 20:13:25.900274 383 storage/queue.go:658  [replicate,n4,s4,r24/4:/Table/53{-/1/-4621…}] change replicas of r24 failed: failed to send RPC: sending to all 3 replicas failed; last error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
W170724 20:13:25.900444 5086530 storage/replica.go:2029  [n4,s4,r24/4:/Table/53{-/1/-4621…}] context deadline exceeded while in command queue: [txn: c806e266], Get [/Local/Range/"\xbd"/RangeDescriptor,/Min)

On node 3 (where the preemptive snapshot was being sent):

I170724 20:12:44.301466 910435 storage/replica_raftstorage.go:720  [n3,s3,r24/?:{-}] applying preemptive snapshot at index 1063935 (id=2a6aed34, encoded size=40589438, 147 rocksdb batches, 3946 log entries)
W170724 20:12:45.602906 910435 storage/engine/rocksdb.go:1490  batch [387893/40749548/4] commit took 506.740038ms (>500ms):
...
I170724 20:12:46.044889 910435 storage/replica_raftstorage.go:728  [n3,s3,r24/?:/Table/53{-/1/-4621…}] applied preemptive snapshot in 1743ms [clear=411ms batch=366ms entries=17ms commit=948ms]
I170724 20:12:46.683553 349 storage/store.go:2201  [replicaGC,n3,s3,r24/?:/Table/53{-/1/-4621…}] removing replica
I170724 20:12:46.703259 349 storage/replica.go:749  [replicaGC,n3,s3,r24/?:/Table/53{-/1/-4621…}] removed 383935 (383921+14) keys in 19ms [clear=3ms commit=17ms]

This happened shortly after starting up kv on indigo, where the round-trip latency between data centers is roughly 30-40ms. It's concerning both that the replica was GC'ed immediately and that, once the snapshot had been successfully applied, the leaseholder apparently can't figure out that the snapshot needs to be resent if the remote replica gets GC'ed.
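For context, a minimal sketch of why a freshly applied preemptive snapshot can look removable. This is hypothetical illustration, not CockroachDB's actual code (the `rangeDescriptor` type and `looksGCable` helper are made up): the snapshot is applied before the replica-change transaction commits, so the recipient store isn't yet in the range descriptor, and a GC check based on descriptor membership alone can't tell 'about to be added' apart from 'already removed'.

```go
// Hypothetical illustration of the race (not CockroachDB's actual code):
// a preemptive snapshot is applied before the replica-change transaction
// commits, so the recipient store is not yet in the range descriptor and
// a pure membership check makes the new replica look like leftover state.
package main

import "fmt"

// rangeDescriptor lists the store IDs that are members of a range.
type rangeDescriptor struct {
	rangeID  int
	replicas []int
}

// looksGCable reports whether a GC scan that only consults descriptor
// membership would consider the local replica removable.
func looksGCable(desc rangeDescriptor, localStoreID int) bool {
	for _, storeID := range desc.replicas {
		if storeID == localStoreID {
			return false // still a member: keep it
		}
	}
	return true // not a member: indistinguishable from an abandoned replica
}

func main() {
	// Descriptor as of the snapshot: s3 holds the data but isn't added yet.
	desc := rangeDescriptor{rangeID: 24, replicas: []int{1, 2, 4}}
	fmt.Println(looksGCable(desc, 3)) // true -> r24/? on s3 gets GC'ed
}
```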

@a-robinson a-robinson added this to the 1.1 milestone Jul 24, 2017
@a-robinson a-robinson self-assigned this Jul 24, 2017
@bdarnell (Contributor)

Is this repeatable? It's a known issue that this is possible, but it should be rare, and we're supposed to recover by sending a raft-initiated snapshot.

@a-robinson (Contributor, Author)

I've seen it twice while messing around with indigo this afternoon, so it's probably pretty repeatable. I haven't yet tried to experiment with it again though.

In at least the second case, the range got stuck afterwards, so there's something here that needs to be fixed.

@nvanbenschoten (Member)

Interestingly, when denim became wedged last week, we saw this 8 times within the same minute that progress halted (all on n6). I don't have any evidence that it's related, but it's an interesting correlation that's worth further investigation.

The ranges affected by this issue were:

r337481
r404546
r404543
r405110
r405063
r405014
r404929
r405154

@petermattis petermattis modified the milestones: 1.1, 2.0 Jan 19, 2018
@petermattis petermattis modified the milestones: 2.0, 2.1 Apr 5, 2018
@m-schneider (Contributor)

I was able to reproduce this by running a cluster built from a July commit, running KV along with a cron job that changed zone configs every 10 minutes to ensure rebalancing was constantly happening (a sketch of such a driver is below). It wasn't reproducible on master. I then tested commits from before and after #19353, and it looks like that PR fixed this issue as well.
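For reference, a rough sketch of a repro driver along these lines. This is hypothetical, not the actual cron job used above: it assumes an insecure local cluster, a `kv.kv` table, and the 1.x-era `cockroach zone set` CLI reading YAML from stdin via `-f -`. It alternates the replication factor so the replicate queue keeps adding and removing replicas.

```go
// Hypothetical repro driver (assumptions: insecure cluster, kv.kv table,
// 1.x-era `cockroach zone set` CLI): flip the zone config every 10 minutes
// so rebalancing never settles.
package main

import (
	"log"
	"os/exec"
	"strings"
	"time"
)

func main() {
	configs := []string{"num_replicas: 3", "num_replicas: 5"}
	for i := 0; ; i++ {
		cfg := configs[i%len(configs)]
		cmd := exec.Command("cockroach", "zone", "set", "kv.kv", "--insecure", "-f", "-")
		cmd.Stdin = strings.NewReader(cfg)
		if out, err := cmd.CombinedOutput(); err != nil {
			log.Printf("zone set failed: %v: %s", err, out)
		} else {
			log.Printf("applied %q", cfg)
		}
		time.Sleep(10 * time.Minute)
	}
}
```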
