Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix races from clientv3/integration tests #4876

Merged
merged 8 commits into from
Mar 29, 2016

Conversation

heyitsanthony
Copy link
Contributor

================
WARNING: DATA RACE
Write by goroutine 146:
  github.com/coreos/etcd/clientv3.(*kv).switchRemote()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/kv.go:194 +0x158

Previous read by goroutine 307:
  github.com/coreos/etcd/clientv3.(*kv).switchRemote()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/kv.go:186 +0x63
  github.com/coreos/etcd/clientv3.(*kv).Do()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/kv.go:179 +0x490
  github.com/coreos/etcd/clientv3.(*kv).Get()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/kv.go:103 +0xdb
  github.com/coreos/etcd/clientv3/integration.TestKVPutFailGetRetry.func1()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/kv_test.go:437 +0xa1

Goroutine 146 (running) created at:
  github.com/coreos/etcd/clientv3.(*kv).Do()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/kv.go:175 +0x41e
  github.com/coreos/etcd/clientv3.(*kv).Put()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/kv.go:98 +0xf5
  github.com/coreos/etcd/clientv3/integration.TestKVPutFailGetRetry()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/kv_test.go:429 +0x4c0
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:473 +0xdc

Goroutine 307 (running) created at:
  github.com/coreos/etcd/clientv3/integration.TestKVPutFailGetRetry()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/kv_test.go:445 +0x5a6
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:473 +0xdc
=================
==================
WARNING: DATA RACE
Write by goroutine 63:
  runtime.mapassign1()
      /usr/lib/go/src/runtime/hashmap.go:429 +0x0
  github.com/coreos/etcd/rafthttp.(*Transport).AddRemote()
      /home/anthony/go/src/github.com/coreos/etcd/rafthttp/transport.go:220 +0x525
  github.com/coreos/etcd/rafthttp.(*streamHandler).ServeHTTP()
      /home/anthony/go/src/github.com/coreos/etcd/rafthttp/http.go:283 +0xa46
  net/http.(*ServeMux).ServeHTTP()
      /usr/lib/go/src/net/http/server.go:1910 +0x212
  net/http.(*ServeMux).ServeHTTP()
      /usr/lib/go/src/net/http/server.go:1910 +0x212
  github.com/coreos/etcd/pkg/testutil.(*PauseableHandler).ServeHTTP()
      /home/anthony/go/src/github.com/coreos/etcd/pkg/testutil/pauseable_handler.go:33 +0xf8
  net/http.serverHandler.ServeHTTP()
      /usr/lib/go/src/net/http/server.go:2081 +0x206
  net/http.(*conn).serve()
      /usr/lib/go/src/net/http/server.go:1472 +0x1565

Previous read by goroutine 279:
  runtime.mapaccess2_fast64()
      /usr/lib/go/src/runtime/hashmap_fast.go:148 +0x0
  github.com/coreos/etcd/rafthttp.(*Transport).Send()
      /home/anthony/go/src/github.com/coreos/etcd/rafthttp/transport.go:179 +0x331
  github.com/coreos/etcd/etcdserver.(*EtcdServer).send()
      /home/anthony/go/src/github.com/coreos/etcd/etcdserver/server.go:954 +0x5b8
  github.com/coreos/etcd/etcdserver.(*raftNode).start.func1()
      /home/anthony/go/src/github.com/coreos/etcd/etcdserver/raft.go:224 +0xe7d

Goroutine 63 (running) created at:
  net/http.(*Server).Serve()
      /usr/lib/go/src/net/http/server.go:2137 +0x4d1
  net/http/httptest.(*Server).goServe.func1()
      /usr/lib/go/src/net/http/httptest/server.go:237 +0xac

Goroutine 279 (running) created at:
  github.com/coreos/etcd/etcdserver.(*raftNode).start()
      /home/anthony/go/src/github.com/coreos/etcd/etcdserver/raft.go:234 +0x329
  github.com/coreos/etcd/etcdserver.(*EtcdServer).run()
      /home/anthony/go/src/github.com/coreos/etcd/etcdserver/server.go:499 +0x1eb
==================
==================
WARNING: DATA RACE
Read by goroutine 759:
  github.com/coreos/etcd/clientv3.(*watcher).resumeWatchers()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:506 +0x31a
  github.com/coreos/etcd/clientv3.(*watcher).resume()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:470 +0x9e
  github.com/coreos/etcd/clientv3.(*watcher).newWatchClient()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:457 +0x4a
  github.com/coreos/etcd/clientv3.(*watcher).run()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:321 +0xb94

Previous write by goroutine 467:
  github.com/coreos/etcd/clientv3.(*watcher).serveStream()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:412 +0x7d1

Goroutine 759 (running) created at:
  github.com/coreos/etcd/clientv3.NewWatcher()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:145 +0x559
  github.com/coreos/etcd/clientv3/integration.runWatchTest()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:50 +0x11c
  github.com/coreos/etcd/clientv3/integration.TestWatchReconnRunning()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:231 +0x3a
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:473 +0xdc

Goroutine 467 (running) created at:
  github.com/coreos/etcd/clientv3.(*watcher).addStream()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:244 +0x496
  github.com/coreos/etcd/clientv3.(*watcher).run()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/watch.go:295 +0x802
==================
==================
WARNING: DATA RACE
Write by goroutine 953:
  github.com/coreos/etcd/clientv3/integration.testWatchWithProgressNotify()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:390 +0x1d7
  github.com/coreos/etcd/clientv3/integration.TestWatchWithProgressNotify()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:376 +0x33
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:473 +0xdc

Previous read by goroutine 699:
  [failed to restore the stack]

Goroutine 953 (running) created at:
  testing.RunTests()
      /usr/lib/go/src/testing/testing.go:582 +0xae2
  testing.(*M).Run()
      /usr/lib/go/src/testing/testing.go:515 +0x11d
  github.com/coreos/etcd/clientv3/integration.TestMain()
      /home/anthony/go/src/github.com/coreos/etcd/clientv3/integration/main_test.go:15 +0x2e
  main.main()
      github.com/coreos/etcd/clientv3/integration/_test/_testmain.go:114 +0x20a

Goroutine 699 (finished) created at:
  github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch()
      /home/anthony/go/src/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:93 +0x2a4
  github.com/coreos/etcd/etcdserver/etcdserverpb._Watch_Watch_Handler()
      /home/anthony/go/src/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:1834 +0xfe
  github.com/coreos/etcd/vendor/google.golang.org/grpc.(*Server).processStreamingRPC()
      /home/anthony/go/src/github.com/coreos/etcd/vendor/google.golang.org/grpc/server.go:586 +0x647
  github.com/coreos/etcd/vendor/google.golang.org/grpc.(*Server).handleStream()
      /home/anthony/go/src/github.com/coreos/etcd/vendor/google.golang.org/grpc/server.go:661 +0x150a
  github.com/coreos/etcd/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1()
      /home/anthony/go/src/github.com/coreos/etcd/vendor/google.golang.org/grpc/server.go:323 +0xad
==================

@heyitsanthony heyitsanthony force-pushed the integration-races branch 2 times, most recently from b08eb73 to 2a2b4e9 Compare March 27, 2016 10:40
@@ -165,10 +165,11 @@ func (t *Transport) Send(msgs []raftpb.Message) {
to := types.ID(m.To)

t.mu.RLock()
p, ok := t.peers[to]
p, pok := t.peers[to]
g, rok := t.remotes[to]
t.mu.RUnlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Would like to test this code to see if we still get #4855.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, I still see failures even with this. Will investigate more.

@gyuho
Copy link
Contributor

gyuho commented Mar 28, 2016

lgtm. defer to @xiang90

@@ -183,14 +183,14 @@ func (kv *kv) Do(ctx context.Context, op Op) (OpResponse, error) {
}

func (kv *kv) switchRemote(prevErr error) error {
kv.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not a good idea to hold the lock while doing network io i think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's OK here since switchRemote only happens on connection failure; IO would have to wait for a new connection anyway and canceling the client context can break the lock if it stalls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. add a comment for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@xiang90
Copy link
Contributor

xiang90 commented Mar 28, 2016

lgtm

@heyitsanthony
Copy link
Contributor Author

merging on greenlight

@heyitsanthony
Copy link
Contributor Author

semaphore found a new one, will investigate

==================
WARNING: DATA RACE
Write by goroutine 386:
  github.com/coreos/etcd/clientv3.newClient()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/client.go:196 +0xcf6
  github.com/coreos/etcd/clientv3.New()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/client.go:90 +0x122
  github.com/coreos/etcd/integration.NewClientV3()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/integration/cluster.go:504 +0x33e
  github.com/coreos/etcd/integration.NewClusterV3()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/integration/cluster.go:724 +0x194
  github.com/coreos/etcd/clientv3/integration.runWatchTest()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:47 +0xc1
  github.com/coreos/etcd/clientv3/integration.TestWatchReconnInit()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:216 +0x3a
  testing.tRunner()
      /tmp/workdir/go/src/testing/testing.go:456 +0xdc

Previous read by goroutine 106:
  [failed to restore the stack]

Goroutine 386 (running) created at:
  testing.RunTests()
      /tmp/workdir/go/src/testing/testing.go:561 +0xaa3
  testing.(*M).Run()
      /tmp/workdir/go/src/testing/testing.go:494 +0xe4
  github.com/coreos/etcd/clientv3/integration.TestMain()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/main_test.go:15 +0x2e
  main.main()
      github.com/coreos/etcd/clientv3/integration/_test/_testmain.go:114 +0x209

Goroutine 106 (finished) created at:
  github.com/coreos/etcd/vendor/google.golang.org/grpc.(*Server).Serve()
      /home/runner/workspace/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/vendor/google.golang.org/grpc/server.go:261 +0x437
==================

@heyitsanthony
Copy link
Contributor Author

New race is from #4875. I think the call to SetLogger is broken-- the grpc docs say "Call only from init() functions." /cc @xiang90 @gyuho

@gyuho
Copy link
Contributor

gyuho commented Mar 28, 2016

That means we cannot configure in client side. Maybe we should just disable it in init function?

@gyuho
Copy link
Contributor

gyuho commented Mar 28, 2016

Unless it's being done in etcdctl, then think we can make it configurable in init function?

@heyitsanthony
Copy link
Contributor Author

@gyuho I think we can configure it in the client but we'd have to slip in a custom logger on init() that would let us dynamically change the base logger.

@heyitsanthony
Copy link
Contributor Author

Reworked the client logger code ptal /cc @gyuho @xiang90

func init() {
logger = settableLogger{Logger: log.New(ioutil.Discard, "", 0)}
// disable client side grpc by default
grpclog.SetLogger(&logger)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. defer to @xiang90

@heyitsanthony heyitsanthony force-pushed the integration-races branch 3 times, most recently from b98aa67 to 0b44726 Compare March 29, 2016 06:19
grpc only permits SetLogger on init()
@gyuho
Copy link
Contributor

gyuho commented Mar 29, 2016

lgtm.

@xiang90
Copy link
Contributor

xiang90 commented Mar 29, 2016

lgtm

@heyitsanthony heyitsanthony merged commit d533c14 into etcd-io:master Mar 29, 2016
@heyitsanthony heyitsanthony deleted the integration-races branch March 29, 2016 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants