More SDN code reorg #11137
var err error
var subnet *osapi.HostSubnet
// Try every retryInterval and bail-out if it exceeds max retries
for i := 0; i < retries; i++ {
We could use the existing wait.ExponentialBackoff() here.
Yeah, clayton pushed back on that for the CNI stuff too. See https://github.com/openshift/origin/pull/9981/files#diff-6357e2d44bec4f49542401c788e87f51R426 for an example.
	return err
}

err = node.SubnetStartNode()
if err != nil {
	return err
}
https://github.com/danwinship/origin/blob/9df71e32e6c65fb5eed4177bfa36d14b8b84890e/pkg/sdn/plugin/node.go#L125
What happens to the pods when UpdatePod() fails after a network change? Networking for these pods will be broken. Maybe we can retry a couple of times and log an error if it doesn't succeed?
I don't think retrying is likely to help; if we can't update the pod networking, then something is just broken (e.g., OVS isn't running). But I think if things are broken enough that UpdatePod() fails, then startup is going to fail for other reasons anyway.
ids: make(map[string]uint32),
namespaces: make(map[uint32]sets.String),
}
return &nodeVNIDMap{}
Is this just saving memory or something?
No, it's part of a set of changes to get rid of plugin.go:getVNID(); the fields get initialized by VnidStartNode() now, which only gets called for multitenant, so if GetVNID() sees that they haven't been initialized later, it can just return 0 for the VNID, rather than plugin.go having to make that assumption itself.
No, this enables the nil check (https://github.com/openshift/origin/pull/11137/files#diff-eabef4ae7ed24f933a3d3ac531af3814R65), so GetVNID (https://github.com/openshift/origin/pull/11137/files#diff-5f233cf7267651424a495cc6b900f8d6R230) will return the correct value for both the subnet and multitenant plugins.
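The two replies above describe the same idea: leave the map fields nil until the multitenant start-up path initializes them, and let the lookup treat "never initialized" as VNID 0. A minimal sketch of that pattern follows, assuming a simplified stand-in type; `vnidMap`, `start`, `setVNID`, and `getVNID` are hypothetical names, not the PR's actual `nodeVNIDMap` methods.

```go
package main

import "sync"

// vnidMap is a hypothetical, simplified stand-in for a namespace-to-VNID map.
type vnidMap struct {
	lock sync.Mutex
	ids  map[string]uint32 // stays nil until start() is called
}

// newVNIDMap returns a map with uninitialized fields on purpose.
func newVNIDMap() *vnidMap {
	return &vnidMap{}
}

// start initializes the map. In this sketch only a multitenant plugin would call it.
func (m *vnidMap) start() {
	m.lock.Lock()
	defer m.lock.Unlock()
	m.ids = make(map[string]uint32)
}

// setVNID records a VNID; it is a no-op if start() was never called,
// because writing to a nil map would panic.
func (m *vnidMap) setVNID(namespace string, id uint32) {
	m.lock.Lock()
	defer m.lock.Unlock()
	if m.ids == nil {
		return
	}
	m.ids[namespace] = id
}

// getVNID returns (0, true) when the map was never initialized, so callers
// do not need to know which plugin is running. Otherwise it does a lookup.
func (m *vnidMap) getVNID(namespace string) (uint32, bool) {
	m.lock.Lock()
	defer m.lock.Unlock()
	if m.ids == nil {
		return 0, true
	}
	id, found := m.ids[namespace]
	return id, found
}
```

The design choice is that the "single-tenant means VNID 0" assumption lives next to the data it describes, rather than in each caller.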
func (node *OsdnNode) watchServices() {
	services := make(map[string]*kapi.Service)
	RunEventQueue(node.kClient, Services, func(delta cache.Delta) error {
Should probably update this to use eventqueue.NewEventQueueForStore() and make 'services' a cache.Store, but that could be done after.
	return nil
}

func (node *OsdnNode) updatePodNetwork(namespace string, oldNetID, netID uint32) error {
updatePodNetwork() is only used by watchNetNamespaces() in vnids_node.go.
I was expecting MultitenantStartNode() to watch both NetNamespaces and Services and patch the VNID when needed. I didn't understand the split between vnids_node.go and multitenant.go.
Hm... did I explain that in the commit message? Maybe not... The idea is that there's going to be an openshift-ovs-networkpolicy network plugin soon as well, and it will use the VNID-tracking code, but it won't react to it in the same way. So vnids_node.go = NetNamespace watching, and multitenant.go = openshift-ovs-multitenant-specific policy, and there will later be networkpolicy.go as well.
The fact that updatePodNetwork() gets called directly from vnids_node.go is wrong, yeah, and that's going to change later. (Initially I was planning to just move code around in these commits, and not refactor anything, which is why this is like that. Although then I did end up making some code changes too, so maybe I should have fixed this...)
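One way to picture the separation described above (VNID tracking in one place, plugin-specific reaction in another) is a tracker that reports changes through an interface. This is only a sketch under assumptions: `vnidHandler`, `vnidTracker`, and `multitenantPolicy` are hypothetical names and do not reflect how the PR or the later networkpolicy work is actually structured.

```go
package main

// vnidHandler is a hypothetical hook: each plugin decides how to react
// when a namespace's VNID changes.
type vnidHandler interface {
	vnidChanged(namespace string, oldID, newID uint32)
}

// vnidTracker only records namespace-to-VNID assignments and delegates
// any reaction to its handler.
type vnidTracker struct {
	ids     map[string]uint32
	handler vnidHandler
}

func newVNIDTracker(h vnidHandler) *vnidTracker {
	return &vnidTracker{ids: make(map[string]uint32), handler: h}
}

// update stores the new VNID and notifies the handler only when an
// already-known namespace moves to a different VNID.
func (t *vnidTracker) update(namespace string, id uint32) {
	old, found := t.ids[namespace]
	t.ids[namespace] = id
	if found && old != id && t.handler != nil {
		t.handler.vnidChanged(namespace, old, id)
	}
}

// multitenantPolicy is a stand-in handler. A real plugin would update pod
// networking here; this sketch just records which namespaces changed.
type multitenantPolicy struct {
	updates []string
}

func (p *multitenantPolicy) vnidChanged(namespace string, oldID, newID uint32) {
	p.updates = append(p.updates, namespace)
}
```

With this shape the tracker never calls a plugin-specific function directly, which is the property the comment above says the current code lacks.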
e6593e9 to 3a2ec09
OK, repushed without the vnids_node/multitenant split (I'll rework that a bit and do it with the rest of the networkpolicy branch), but with a new patch to use utilwait.ExponentialBackoff() where we should.
3a2ec09 to 239a236
LGTM
@pravisankar PTAL?
[merge]
[Test]ing while waiting on the merge queue
LGTM
2ea77a3 to 6cd347a
flake is #11240 [merge]
The SDN initialization was being called from subnets.go for historical reasons; have node.go call it directly instead. Also, don't bother passing data to SetupSDN() that it could get from the OsdnNode structure itself (another historical artifact).
6cd347a to 54b856a
flake is #11240, [test]
Evaluated for origin test up to 54b856a
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9761/)
Evaluated for origin merge up to 54b856a
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9787/) (Image: devenv-rhel7_5157)
@openshift/networking PTAL