Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coordinator] Enable running M3 coordinator without Etcd #3814

Merged
merged 16 commits into from
Oct 6, 2021

Conversation

Antanukas
Copy link
Collaborator

@Antanukas Antanukas commented Oct 5, 2021

What this PR does / why we need it:

This enables running M3 Coordinator with in process M3 Aggregator without external Etcd.

Special notes for your reviewer:

Situation until now:

  • When running with M3DB backend embedded Etcd would be provided by DB node.
  • Running noop-etcd actually makes sense with external Etcd since this is a so called "admin mode"
  • Running gRPC backend would not setup downsampler and Etcd is not required
  • When running promRemote Etcd is required since downsampler is initialized. And by default it's an in process one.

With this change it's now possible to run in process aggregator (downsampler) without external Etcd.

Another fix is to initialize /namespaces key if it's not set. Otherwise startup takes 5s more because we are waiting for watcher to timeout.

Does this PR introduce a user-facing and/or backwards incompatible change?:

None

Does this PR require updating code package or user-facing documentation?:

None

@Antanukas Antanukas changed the title [coordinator] enable running M3 coordinator without Etcd [coordinator] Enable running M3 coordinator without Etcd Oct 5, 2021
@@ -978,6 +987,14 @@ func (cfg Configuration) newAggregator(o DownsamplerOptions) (agg, error) {
}, nil
}

func initStoreNamespaces(store kv.Store, nsKey string) error {
_, err := store.SetIfNotExists(nsKey, &rulepb.Namespaces{})
Copy link
Collaborator Author

@Antanukas Antanukas Oct 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be a subject to a race condition "set if not exist" when multiple writes might be happening to the store? It's probably a corner case and might be an issue only when Etcd is in uninitialized state.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll get back a kv.ErrAlreadyExists if there is a race. Etcd guarantees consistency since it each transaction uses raft consensus, so only one will win the race and the others will get back kv.ErrAlreadyExists.

Comment on lines 857 to 861
if !mem.IsMem(kvStore) {
return agg{}, errors.New("running in process downsampler with other store " +
"then in memory can yield unexpected side effects")
}
localKVStore := kvStore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good protection, great 👍

Copy link
Collaborator

@robskillington robskillington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once tests are passing

@codecov
Copy link

codecov bot commented Oct 5, 2021

Codecov Report

Merging #3814 (39ccce7) into master (17418a2) will decrease coverage by 0.0%.
The diff coverage is n/a.

❗ Current head 39ccce7 differs from pull request most recent head 5bfd9fa. Consider uploading reports for the commit 5bfd9fa to get more accurate results

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #3814     +/-   ##
========================================
- Coverage    56.9%   56.8%   -0.1%     
========================================
  Files         552     552             
  Lines       63079   63079             
========================================
- Hits        35916   35854     -62     
- Misses      23970   24020     +50     
- Partials     3193    3205     +12     
Flag Coverage Δ
aggregator 63.3% <ø> (ø)
cluster ∅ <ø> (∅)
collector 58.4% <ø> (ø)
dbnode 60.4% <ø> (-0.2%) ⬇️
m3em 46.4% <ø> (ø)
metrics 19.7% <ø> (ø)
msg 74.4% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 17418a2...5bfd9fa. Read the comment docs.

@Antanukas Antanukas marked this pull request as ready for review October 5, 2021 16:34
Comment on lines 2846 to 2856
ctrl := gomock.NewController(t)
defer ctrl.Finish()
store := kv.NewMockStore(ctrl)
store.EXPECT().SetIfNotExists(gomock.Eq(matcher.NewOptions().NamespacesKey()), gomock.Any()).Return(0, nil).Times(1)
store.EXPECT().Set(gomock.Any(), gomock.Any()).Times(0)
store.EXPECT().CheckAndSet(gomock.Any(), gomock.Any(), gomock.Any()).Times(0)
store.EXPECT().Delete(gomock.Any()).Times(0)
_ = newTestDownsampler(t, testDownsamplerOptions{
remoteClientMock: nil,
kvStore: store,
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: slightly easier to read test code when you separate controller setup / mock setup / invocation with empty lines.

ctrl := gomock.NewController(t)
defer ctrl.Finish()
store := kv.NewMockStore(ctrl)
store.EXPECT().SetIfNotExists(gomock.Eq(matcher.NewOptions().NamespacesKey()), gomock.Any()).Return(0, nil).Times(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for Times(1) here, isn't it the default behavior?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, I will remove.

Comment on lines +2850 to +2852
store.EXPECT().Set(gomock.Any(), gomock.Any()).Times(0)
store.EXPECT().CheckAndSet(gomock.Any(), gomock.Any(), gomock.Any()).Times(0)
store.EXPECT().Delete(gomock.Any()).Times(0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Times(0) - isn't it the same as not defining those method calls?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By having them here explicitly I am trying to communicate that I am trying to ensure no mutation methods are invoked. And if in the future test fails with unexpected invocation someone will think twice before adding it here. Maybe I need to add comment for that? Or can you suggest some other approach?

Comment on lines +2859 to +2860
func TestDownsamplerNamespacesEtcdInit(t *testing.T) {
t.Run("does not reset namespaces key", func(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be cleaner to have separate test methods instead of all those t.Run? Or was your goal to have nice test descriptions? I wish testing.T simply had some kind of SetName for this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to group those namespace init related tests. This test file is 3k lines long. So different test methods can just get lost.

assert.Len(t, ns.Namespaces, 0)
})

t.Run("do not initialize namespaces when RequireNamespaceWatchOnInit is true", func(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit for consistency:

Suggested change
t.Run("do not initialize namespaces when RequireNamespaceWatchOnInit is true", func(t *testing.T) {
t.Run("does not initialize namespaces when RequireNamespaceWatchOnInit is true", func(t *testing.T) {

Comment on lines 817 to 820
err = initStoreNamespaces(kvStore, matcherOpts.NamespacesKey())
if err != nil {
return agg{}, err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
err = initStoreNamespaces(kvStore, matcherOpts.NamespacesKey())
if err != nil {
return agg{}, err
}
if err := initStoreNamespaces(kvStore, matcherOpts.NamespacesKey()); err != nil {
return agg{}, err
}

@@ -538,14 +537,15 @@ func runServer(t *testing.T, opts runServerOpts) (string, closeFn) {
doneCh = make(chan struct{})
listenerCh = make(chan net.Listener, 1)
clusterClient = clusterclient.NewMockClient(opts.ctrl)
clusterClientCh = make(chan clusterclient.Client, 1)
clusterClientCh chan clusterclient.Client
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this channel?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a away to pass in a mocked instance for cases when Etcd is required. Probably not a best thing to do but thats how it was before. With my change I just added case when no Etcd is needed I pass this channel as nil and expect server startup to suceed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, didn't notice it was there before the change, thought you had introduced it.

Copy link
Collaborator

@linasm linasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Antanukas Antanukas enabled auto-merge (squash) October 6, 2021 08:47
@Antanukas Antanukas merged commit 8dd5622 into master Oct 6, 2021
@Antanukas Antanukas deleted the antanas/run-coord-without-etcd branch October 6, 2021 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants