-
Notifications
You must be signed in to change notification settings - Fork 453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[coordinator] Enable running M3 coordinator without Etcd #3814
Conversation
@@ -978,6 +987,14 @@ func (cfg Configuration) newAggregator(o DownsamplerOptions) (agg, error) { | |||
}, nil | |||
} | |||
|
|||
func initStoreNamespaces(store kv.Store, nsKey string) error { | |||
_, err := store.SetIfNotExists(nsKey, &rulepb.Namespaces{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be a subject to a race condition "set if not exist" when multiple writes might be happening to the store? It's probably a corner case and might be an issue only when Etcd is in uninitialized state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll get back a kv.ErrAlreadyExists
if there is a race. Etcd guarantees consistency since it each transaction uses raft consensus, so only one will win the race and the others will get back kv.ErrAlreadyExists
.
if !mem.IsMem(kvStore) { | ||
return agg{}, errors.New("running in process downsampler with other store " + | ||
"then in memory can yield unexpected side effects") | ||
} | ||
localKVStore := kvStore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good protection, great 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once tests are passing
Codecov Report
@@ Coverage Diff @@
## master #3814 +/- ##
========================================
- Coverage 56.9% 56.8% -0.1%
========================================
Files 552 552
Lines 63079 63079
========================================
- Hits 35916 35854 -62
- Misses 23970 24020 +50
- Partials 3193 3205 +12
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
ctrl := gomock.NewController(t) | ||
defer ctrl.Finish() | ||
store := kv.NewMockStore(ctrl) | ||
store.EXPECT().SetIfNotExists(gomock.Eq(matcher.NewOptions().NamespacesKey()), gomock.Any()).Return(0, nil).Times(1) | ||
store.EXPECT().Set(gomock.Any(), gomock.Any()).Times(0) | ||
store.EXPECT().CheckAndSet(gomock.Any(), gomock.Any(), gomock.Any()).Times(0) | ||
store.EXPECT().Delete(gomock.Any()).Times(0) | ||
_ = newTestDownsampler(t, testDownsamplerOptions{ | ||
remoteClientMock: nil, | ||
kvStore: store, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: slightly easier to read test code when you separate controller setup / mock setup / invocation with empty lines.
ctrl := gomock.NewController(t) | ||
defer ctrl.Finish() | ||
store := kv.NewMockStore(ctrl) | ||
store.EXPECT().SetIfNotExists(gomock.Eq(matcher.NewOptions().NamespacesKey()), gomock.Any()).Return(0, nil).Times(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for Times(1)
here, isn't it the default behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, I will remove.
store.EXPECT().Set(gomock.Any(), gomock.Any()).Times(0) | ||
store.EXPECT().CheckAndSet(gomock.Any(), gomock.Any(), gomock.Any()).Times(0) | ||
store.EXPECT().Delete(gomock.Any()).Times(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Times(0)
- isn't it the same as not defining those method calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By having them here explicitly I am trying to communicate that I am trying to ensure no mutation methods are invoked. And if in the future test fails with unexpected invocation someone will think twice before adding it here. Maybe I need to add comment for that? Or can you suggest some other approach?
func TestDownsamplerNamespacesEtcdInit(t *testing.T) { | ||
t.Run("does not reset namespaces key", func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be cleaner to have separate test methods instead of all those t.Run
? Or was your goal to have nice test descriptions? I wish testing.T
simply had some kind of SetName
for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to group those namespace init related tests. This test file is 3k lines long. So different test methods can just get lost.
assert.Len(t, ns.Namespaces, 0) | ||
}) | ||
|
||
t.Run("do not initialize namespaces when RequireNamespaceWatchOnInit is true", func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit for consistency:
t.Run("do not initialize namespaces when RequireNamespaceWatchOnInit is true", func(t *testing.T) { | |
t.Run("does not initialize namespaces when RequireNamespaceWatchOnInit is true", func(t *testing.T) { |
err = initStoreNamespaces(kvStore, matcherOpts.NamespacesKey()) | ||
if err != nil { | ||
return agg{}, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
err = initStoreNamespaces(kvStore, matcherOpts.NamespacesKey()) | |
if err != nil { | |
return agg{}, err | |
} | |
if err := initStoreNamespaces(kvStore, matcherOpts.NamespacesKey()); err != nil { | |
return agg{}, err | |
} |
@@ -538,14 +537,15 @@ func runServer(t *testing.T, opts runServerOpts) (string, closeFn) { | |||
doneCh = make(chan struct{}) | |||
listenerCh = make(chan net.Listener, 1) | |||
clusterClient = clusterclient.NewMockClient(opts.ctrl) | |||
clusterClientCh = make(chan clusterclient.Client, 1) | |||
clusterClientCh chan clusterclient.Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this channel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a away to pass in a mocked instance for cases when Etcd is required. Probably not a best thing to do but thats how it was before. With my change I just added case when no Etcd is needed I pass this channel as nil
and expect server startup to suceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, didn't notice it was there before the change, thought you had introduced it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What this PR does / why we need it:
This enables running M3 Coordinator with in process M3 Aggregator without external Etcd.
Special notes for your reviewer:
Situation until now:
noop-etcd
actually makes sense with external Etcd since this is a so called "admin mode"gRPC
backend would not setup downsampler and Etcd is not requiredpromRemote
Etcd is required since downsampler is initialized. And by default it's an in process one.With this change it's now possible to run in process aggregator (downsampler) without external Etcd.
Another fix is to initialize
/namespaces
key if it's not set. Otherwise startup takes 5s more because we are waiting for watcher to timeout.Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: