Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cluster support #221

Merged
merged 5 commits into from
Aug 31, 2021
Merged

Cluster support #221

merged 5 commits into from
Aug 31, 2021

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Aug 25, 2021

Fixes #129

We would like to support ovn raft clusters. This means we need:

  • a model for the special _Server database
  • the ability to monitor tables in that database
  • the ability to reject endpoints that are not the leader
  • to drop and reconnect if our endpoint loses leadership

@squeed squeed force-pushed the feature/cluster-support branch from 8707553 to 2ea7818 Compare August 25, 2021 14:26
@squeed
Copy link
Contributor Author

squeed commented Aug 25, 2021

@dave-tucker PTAL. I still need to implement the critical bits, but this is the first start.

client/client.go Outdated
db.monitorsMutex.Lock()
defer db.monitorsMutex.Unlock()
for id, request := range db.monitors {
// TODO: should err here just be treated as failure to connect?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've not encountered a situation where a monitor rpc failed, but it's plausible it could if we were disconnected due to a network hiccup and ovsdb-server hadn't yet closed the monitor session completely - the error would be that there is already a monitor with the same id. in which case, trying again once the state has been torn down would be the best way to fix it I think....

tl;dr i think it's ok to treat this error the same

Copy link
Collaborator

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far @squeed. Just a couple of comments so far

client/client.go Outdated Show resolved Hide resolved
client/client.go Outdated Show resolved Hide resolved
client/client.go Outdated Show resolved Hide resolved
client/client.go Outdated Show resolved Hide resolved
@squeed squeed force-pushed the feature/cluster-support branch from 152beaa to 375a0f1 Compare August 25, 2021 19:42
@squeed squeed changed the title [WIP] Cluster support Cluster support Aug 25, 2021
@squeed
Copy link
Contributor Author

squeed commented Aug 25, 2021

@dave-tucker thanks for the review; I'll work on the wording changes.

I've implemented the _Server watch, so that should be good to go now.

client/client.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have a typo to fix s/betwen/between/g and the server_model package should be renamed serverdb and then the lint should pass 🤞

client/client.go Outdated Show resolved Hide resolved
client/client.go Outdated Show resolved Hide resolved
client/client.go Show resolved Hide resolved
Copy link
Collaborator

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had another look through. As db.schema gets updated on re-connection any read access to it needs to hold an RLock or the race detector gets upset.

@dcbw
Copy link
Contributor

dcbw commented Aug 29, 2021

FWIW, trying to do this downstream in go-ovn via openshift/ovn-kubernetes#695

@squeed squeed force-pushed the feature/cluster-support branch 2 times, most recently from 454041d to 3ad9102 Compare August 30, 2021 11:27
@squeed
Copy link
Contributor Author

squeed commented Aug 30, 2021

FYI, logger support is currently stuck until klog takes a new release, since it moves from logr v0.4 to v1.0. So that will have to wait.

otherwise, @dave-tucker, this is ready to go.

squeed added 5 commits August 30, 2021 14:43
As a part of ovn-org#129, we need to monitor the special "_Server" database so
we can disconnect if a server loses leader.

That means we need (internal) support for multple databases / schemas /
models on the same endpoint.

This implements that, but doesn't expose it to the end-user.

Signed-off-by: Casey Callendrello <cdc@redhat.com>
This table is used by ovsdb itself to report internal status.

Signed-off-by: Casey Callendrello <cdc@redhat.com>
This adds an additional check when testing a potential endpoint: if the
endpoint reports itself as not a leader, then reject it and continue.

This needs to be explicitly requested via a specific Option.

Note that this does not monitor for leadership changes.

Signed-off-by: Casey Callendrello <cdc@redhat.com>
This sets up a monitor on the _Server/Database meta-table, and
disconnects if leadership is lost.

It also adds some useful locking.

Signed-off-by: Casey Callendrello <cdc@redhat.com>
args is an arbitrary json cookie, so we need to handle that rather than
assuming it's a string.

Signed-off-by: Casey Callendrello <cdc@redhat.com>
@squeed squeed force-pushed the feature/cluster-support branch from 1c9cd01 to 3d13c37 Compare August 30, 2021 12:44
Copy link
Collaborator

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @squeed

@dave-tucker dave-tucker merged commit b98d061 into ovn-org:main Aug 31, 2021
@coveralls
Copy link

Pull Request Test Coverage Report for Build 1182187084

  • 213 of 314 (67.83%) changed or added relevant lines in 5 files are covered.
  • 7 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.4%) to 75.269%

Changes Missing Coverage Covered Lines Changed/Added Lines %
ovsdb/serverdb/model.go 5 11 45.45%
client/client.go 201 296 67.91%
Files with Coverage Reduction New Missed Lines %
client/client.go 7 70.65%
Totals Coverage Status
Change from base Build 1163233557: -0.4%
Covered Lines: 3424
Relevant Lines: 4549

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Support connect to an OVN cluster
4 participants