Crash when a replica removes itself from raft group #1650

Closed
yananzhi opened this issue Jul 8, 2015 · 3 comments

yananzhi commented Jul 8, 2015

// TestReadEncounterGroupDeleteError verifies that a read command returns a
// RangeNotFoundError rather than an ErrGroupDeleted error.
func TestReadEncounterGroupDeleteError(t *testing.T) {
    defer leaktest.AfterTest(t)

    mtc := startMultiTestContext(t, 2)
    defer mtc.Stop()
    raftID := proto.RaftID(1)
    mtc.replicateRange(raftID, 0, 1)
    // Remove the replica from first store.
    mtc.unreplicateRange(raftID, 0, 0)
    getArgs, getResp := getArgs([]byte("a"), raftID, mtc.stores[0].StoreID())

    // Force the read command to request a new lease.
    clock := mtc.clocks[0]
    getArgs.Header().Timestamp = clock.Update(clock.Now().Add(int64(storage.DefaultLeaderLeaseDuration), 0))

    // Expect a RangeNotFoundError.
    err := mtc.stores[0].ExecuteCmd(context.Background(), proto.Call{Args: getArgs, Reply: getResp})
    if _, ok := err.(*proto.RangeNotFoundError); !ok {
        t.Fatalf("expected RangeNotFoundError, got %v", err)
    }
    }
}
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0xbb0a8c]

goroutine 11 [running]:
github.com/coreos/etcd/raft.(*Progress).maybeUpdate(0x0, 0x17, 0x100000001)
    /home/zyn/gopath/src/github.com/coreos/etcd/raft/progress.go:104 +0xc
github.com/coreos/etcd/raft.(*raft).appendEntry(0xc2080ae540, 0xc20812a410, 0x1, 0x1)
    /home/zyn/gopath/src/github.com/coreos/etcd/raft/raft.go:358 +0x111
github.com/coreos/etcd/raft.stepLeader(0xc2080ae540, 0x2, 0x0, 0x100000001, 0x0, 0x0, 0x0, 0xc20812a410, 0x1, 0x1, ...)
    /home/zyn/gopath/src/github.com/coreos/etcd/raft/raft.go:512 +0x2a6
github.com/coreos/etcd/raft.(*raft).Step(0xc2080ae540, 0x2, 0x0, 0x100000001, 0x0, 0x0, 0x0, 0xc20812a410, 0x1, 0x1, ...)
    /home/zyn/gopath/src/github.com/coreos/etcd/raft/raft.go:487 +0x259
github.com/coreos/etcd/raft.(*multiNode).run(0xc208053020)
    /home/zyn/gopath/src/github.com/coreos/etcd/raft/multinode.go:231 +0xbb9
created by github.com/coreos/etcd/raft.StartMultiNode
    /home/zyn/gopath/src/github.com/coreos/etcd/raft/multinode.go:56 +0xb6

A raft leader removes itself from the raft group. After the peer has been removed from raft, multiNode can still propose commands to the raft, which leads to the panic above.
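To make the failure mode concrete, here is a minimal, self-contained sketch of how proposing after self-removal can hit a nil pointer. It is not etcd's code and not the fix that was applied upstream: `progress`, `group`, `proposeUnchecked`, `proposeChecked`, and `errNotMember` are hypothetical names chosen for illustration. It only assumes what the trace suggests, namely that `maybeUpdate` was called on a nil `*Progress` receiver.

```go
package main

import (
	"errors"
	"fmt"
)

// progress is a hypothetical stand-in for a leader's per-peer replication state.
type progress struct {
	match uint64
}

// maybeUpdate advances match. It dereferences p, so a nil receiver panics.
func (p *progress) maybeUpdate(index uint64) {
	if p.match < index {
		p.match = index
	}
}

// errNotMember is a hypothetical error for proposals made after removal.
var errNotMember = errors.New("proposer is no longer a member of the group")

// group is a hypothetical, heavily simplified raft group keyed by peer ID.
type group struct {
	self  uint64
	peers map[uint64]*progress
}

func newGroup(self uint64, ids ...uint64) *group {
	g := &group{self: self, peers: make(map[uint64]*progress)}
	for _, id := range ids {
		g.peers[id] = &progress{}
	}
	return g
}

func (g *group) removePeer(id uint64) { delete(g.peers, id) }

// proposeUnchecked assumes the local peer is still tracked. After
// self-removal the map lookup yields nil and maybeUpdate panics.
func (g *group) proposeUnchecked(index uint64) {
	g.peers[g.self].maybeUpdate(index)
}

// proposeChecked returns an error instead of touching a missing entry.
func (g *group) proposeChecked(index uint64) error {
	p, ok := g.peers[g.self]
	if !ok {
		return errNotMember
	}
	p.maybeUpdate(index)
	return nil
}

// panics reports whether f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	g := newGroup(1, 1, 2)
	g.removePeer(1) // the local peer removes itself
	fmt.Println("unchecked panics:", panics(func() { g.proposeUnchecked(23) }))
	fmt.Println("checked error:", g.proposeChecked(23))
}
```

Whether the real guard belongs in the raft library or in the caller that routes proposals is a design question this sketch does not answer; it only shows why a membership check before the update avoids the dereference.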

@yananzhi yananzhi added the help wanted Help is requested / needed by the one who filed the issue to fix it. label Jul 8, 2015
es-chow commented Jul 14, 2015

This issue is also mentioned in #768.

@jess-edwards jess-edwards mentioned this issue Aug 18, 2015
78 tasks
tbg commented Oct 23, 2015

@bdarnell still current?

@bdarnell

I'm pretty sure this is fixed now, but I want to take a closer look at the test case @yananzhi posted and add it to client_raft_test.go if we don't already have something like it.

bdarnell added a commit to bdarnell/cockroach that referenced this issue Nov 17, 2015
sean- added a commit that referenced this issue Jul 21, 2023
038fc448 Release v5.4.2
95aa87f2 exitPotentialWriteReadDeadlock stops bgReader
e0c70201 Skip json format test on CockroachDB
2bf5a614 fix: Do not use infinite timers
1dd69f86 Enable failover efforts when pg_hba.conf disallows non-ssl connections
91cba90e Fix: RowScanner errors are fatal to Rows
74ab538d Release v5.4.1
7c386112 fix concurrency bug in pgtype.defaultMap (#1650)
9a5ead90 Add TxOptions.BeginQuery to allow overriding the default BEGIN query
e5db6a04 pgtype array: Fix encoding of vtab \v
bc8b1ca3 remove the single backing string optimization
2de94187 hstore: Make binary parsing 2X faster
d48d36dc pgtype/hstore: Make text parsing about 6X faster
461b9fa3 Release v5.4.0
90f9aad6 add singleton pgtype.Map for default type mappings
5d4f9018 failed to write startup message error should be normalized
482e56a7 Fix race condition when CopyFrom is cancelled.
3ea2f57d Deprecate CheckConn in favor of Ping
26c79eb2 Handle writes that could deadlock with reads from the server
85136a8e Restore pgx v4 style CopyFrom implementation
4410fc0a Remove nbconn
608f39f4 Ensure pgxpool.Pool.QueryRow.Scan releases connection on panic
4b9aa7c4 chore: update version of golang.org/x/crypto library from v0.6.0 to v0.9.0
0292edec pgx.Conn: Fix memory leak: Delete items from preparedStatements
7f2bb959 add BeforeClose to pgxpool.Pool
f72a147d skip cockroachdb
89475c4c use `atomic.Int32` instead of `int + atomic calls`
b2b4fbcf Set socket to non-blocking mode in `Read`, `Flush` and `BufferReadUntilBlock` operations
3db7d177 Set socket to non-blocking mode before `doneChan` is allocated to avoid that channel leaked in case when `SetBlockingMode` will return error
0dbb0a52 Fix `realNonblockingRead`, set `realNonblockingRead` call error to `nonblockReadErr`

Release note: none