Refactor syncInternalImpl and toZoneNetworkEndpointMap #2044

Merged: 6 commits, Apr 6, 2023

@sawsa307 sawsa307 commented Mar 28, 2023

Refactor syncInternalImpl, toZoneNetworkEndpointMap, and toZoneNetworkEndpointMapDegradedMode.

  1. Refactor syncInternalImpl.
  • Create convertUntypedToEPS for converting endpoint slices, and computeEPSStaleness for computing EPS staleness metrics.
  • Create getEndpointsCalculation() to handle endpoint calculation for the error state and enableDegradedMode.
  2. Refactor toZoneNetworkEndpointMap.
  • Reduce the number of return values by wrapping them in a struct, ZoneNetworkEndpointMapResult.
  • Create helper functions getEndpointPod() and getEndpointZone() to simplify endpoint node, pod, and zone checking.
  • Refactor endpoint node, pod, and zone checking.
  3. Refactor toZoneNetworkEndpointMapDegradedMode.
  • Reduce the number of return values by wrapping them in a struct, ZoneNetworkEndpointMapResult.
  • Refactor endpoint node, pod, and zone checking.
  • Refactor and remove validateAndAddEndpoints, since most of the validation code (https://github.com/kubernetes/ingress-gce/blob/master/pkg/neg/syncers/utils.go#L337-L357) does not need to be in the for loop.
  • Update the unit tests for validateAndAddEndpoints: since this function no longer exists, move its checks to TestToZoneNetworkEndpointMapDegradedMode.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 28, 2023
@k8s-ci-robot
Contributor

Hi @sawsa307. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 28, 2023
@sawsa307
Contributor Author

/assign @bowei

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 28, 2023
ResultInProgress = "InProgress"
ResultSuccess = "Success"
ResultNegNotFound = Result("NegNotFound")
ResultCurrentEPNotFound = Result("CurrentEPNotFound")
Member

You need to document the Results that aren't obvious.

What is the difference between CurrentEPNotFound and EPSNotFound?

Contributor Author

CurrentEPNotFound means we cannot get the existing endpoints from the NEG, while EPSNotFound means we failed to get the EPS for this service.
For documenting the results, do you mean expanding what's in Result("...") or adding the documentation in a separate place?
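
One way to read the request above is a doc comment on each constant, next to its declaration. The following is a minimal sketch under that assumption: the Result type is redeclared here only so the example compiles on its own, the describe helper is hypothetical, and the comment wording is taken from the explanation given in this thread rather than from the merged code.

```go
package main

import "fmt"

// Result is a stand-in for the sync result type shown in the diff.
type Result string

const (
	// ResultCurrentEPNotFound means the existing endpoints could not be
	// retrieved from the NEG.
	ResultCurrentEPNotFound = Result("CurrentEPNotFound")
	// ResultEPSNotFound means the EndpointSlices (EPS) for the service
	// could not be retrieved.
	ResultEPSNotFound = Result("EPSNotFound")
)

// describe is a hypothetical helper that returns a human-readable
// explanation for a result; it exists only to exercise the constants.
func describe(r Result) string {
	switch r {
	case ResultCurrentEPNotFound:
		return "existing endpoints could not be read from the NEG"
	case ResultEPSNotFound:
		return "EndpointSlices for the service could not be fetched"
	default:
		return "undocumented result: " + string(r)
	}
}

func main() {
	fmt.Println(describe(ResultCurrentEPNotFound))
	fmt.Println(describe(ResultEPSNotFound))
}
```

Keeping the explanation in a doc comment means it shows up in godoc and in editor hover text, which a longer string inside Result("...") would not.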

ResultEPSEndpointCountZero = Result("EPSEndpointCountZero")
ResultEPCalculationCountZero = Result("EPCalculationCountZero")
ResultInvalidAPIResponse = Result("InvalidAPIResponse")
ResultInvalidEPAttach = Result("InvalidEPAttach")
Member

Invalid or error?

Invalid means we sent an invalid argument to the API, error means we got an error, which could include invalid.

Contributor Author

@sawsa307 sawsa307 Mar 28, 2023

This should be invalid. Endpoint update API calls would fail if the endpoint list we send contains endpoints with invalid fields.
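
The invalid-versus-error distinction above can be sketched as a small classifier. This is an illustration only: errInvalidArgument, ResultOtherError, and classifyAttachError are hypothetical names, and how the real syncer detects an invalid-argument response from the API is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a stand-in for the sync result type shown in the diff.
type Result string

const (
	ResultSuccess         = Result("Success")
	ResultInvalidEPAttach = Result("InvalidEPAttach")
	// ResultOtherError is a hypothetical catch-all used only in this sketch.
	ResultOtherError = Result("OtherError")
)

// errInvalidArgument is a hypothetical sentinel standing in for "the API
// rejected the request because a field was invalid".
var errInvalidArgument = errors.New("invalid argument")

// classifyAttachError maps an error from an endpoint attach call to a Result.
// Only errors that wrap errInvalidArgument count as "invalid"; every other
// non-nil error is a plain error.
func classifyAttachError(err error) Result {
	switch {
	case err == nil:
		return ResultSuccess
	case errors.Is(err, errInvalidArgument):
		return ResultInvalidEPAttach
	default:
		return ResultOtherError
	}
}

func main() {
	wrapped := fmt.Errorf("attach endpoints: %w", errInvalidArgument)
	fmt.Println(classifyAttachError(wrapped))
	fmt.Println(classifyAttachError(errors.New("connection reset")))
	fmt.Println(classifyAttachError(nil))
}
```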

zoneNetworkEndpointMap := map[string]negtypes.NetworkEndpointSet{}
networkEndpointPodMap := negtypes.EndpointPodMap{}
dupCount := 0
if eds == nil {
klog.Errorf("Endpoint object is nil")
return zoneNetworkEndpointMap, networkEndpointPodMap, dupCount, nil
return negtypes.ZoneNetworkEndpointMapResult{
Member

@bowei bowei Mar 28, 2023

You shouldn't put all the types into the types package. This is an antipattern.

Just keep the return type struct next to the function that returns the struct.

Contributor Author

I'll update this. Thanks!

@@ -246,24 +257,44 @@ func toZoneNetworkEndpointMap(eds []negtypes.EndpointsData, zoneGetter negtypes.

for _, endpointAddress := range ed.Addresses {
if endpointAddress.AddressType != discovery.AddressTypeIPv4 {
klog.Infof("Skipping non IPv4 address: %q, in endpoint slice %s/%s", endpointAddress.Addresses, ed.Meta.Namespace, ed.Meta.Name)
klog.Infof("Skipping non IPv4 address: %q, in endpoint slice %s/%s",
Member

Remove extra newline here to make your diff smaller.

Contributor Author

Understood. I'll update this.

NetworkEndpointSet: zoneNetworkEndpointMap,
EndpointPodMap: networkEndpointPodMap,
DupCount: dupCount,
Err: nil,
Member

You should probably leave err out of the struct. That way we use the common pattern:

result, err := foo()
if err != nil { 
  // Do something, we expect result to be nil or an empty value.
}

Member

Please try to keep your code using standard, well understood patterns.

Contributor Author

Understood. I'll update this.
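
The suggestion in this thread (group the data return values in a struct, but return the error separately) might look roughly like the sketch below. It is not the merged code: endpointMapResult is a simplified, hypothetical stand-in for ZoneNetworkEndpointMapResult, endpoints are plain strings, and the empty-address check is an invented error condition used only to show the error path.

```go
package main

import (
	"errors"
	"fmt"
)

// endpointMapResult is a hypothetical, simplified stand-in for the result
// struct discussed in this thread. It deliberately has no Err field.
type endpointMapResult struct {
	EndpointsByZone map[string][]string
	DupCount        int
}

// errEmptyAddress is an invented error condition for this sketch.
var errEmptyAddress = errors.New("endpoint has an empty address")

// buildEndpointMap groups addresses by zone and counts duplicates.
// The error is returned next to the result, not inside it.
func buildEndpointMap(input map[string][]string) (endpointMapResult, error) {
	result := endpointMapResult{EndpointsByZone: map[string][]string{}}
	if input == nil {
		// Mirrors the diff above: nil input yields an empty result and no error.
		return result, nil
	}
	seen := map[string]bool{}
	for zone, addrs := range input {
		for _, addr := range addrs {
			if addr == "" {
				return endpointMapResult{}, fmt.Errorf("zone %q: %w", zone, errEmptyAddress)
			}
			if seen[addr] {
				result.DupCount++
				continue
			}
			seen[addr] = true
			result.EndpointsByZone[zone] = append(result.EndpointsByZone[zone], addr)
		}
	}
	return result, nil
}

func main() {
	result, err := buildEndpointMap(map[string][]string{
		"zone-a": {"10.0.0.1", "10.0.0.2"},
		"zone-b": {"10.0.0.1"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("duplicates:", result.DupCount)
}
```

With the error outside the struct, callers cannot forget to check it before reading the other fields, and the zero value of the struct stays a valid "nothing computed" result.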

@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch 3 times, most recently from 776bb3f to e01bdb7 Compare March 28, 2023 23:27
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 28, 2023
@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch 6 times, most recently from 6ce18fb to 18b97bb Compare March 29, 2023 23:36
@sawsa307
Contributor Author

/assign @swetharepakula

@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch 2 times, most recently from 516181f to 397f4f2 Compare March 30, 2023 20:09
@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch from 1e89da1 to db38c4d Compare April 4, 2023 22:47
@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch 4 times, most recently from 3eee300 to b79ead5 Compare April 5, 2023 18:53
@@ -176,16 +188,18 @@ type L7EndpointsCalculator struct {
zoneGetter types.ZoneGetter
servicePortName string
podLister cache.Indexer
nodeLister cache.Indexer
Member

Instead of the indexer, use the lister, as in the L4 case.

Not in this PR, but we should also change podLister to be a lister instead of an indexer.

Contributor Author

Got it. I'll create a separate PR for that

@@ -440,17 +440,17 @@ func validatePod(pod *apiv1.Pod, nodeLister cache.Indexer) bool {
// Terminal Pod means a pod is in PodFailed or PodSucceeded phase
phase := pod.Status.Phase
if phase == apiv1.PodFailed || phase == apiv1.PodSucceeded {
klog.V(2).Info("Pod %s/%s is a terminal pod with status %v, skipping", pod.ObjectMeta.Namespace, pod.ObjectMeta.Name, phase)
klog.V(2).Infof("Pod %s/%s is a terminal pod with status %v, skipping", pod.ObjectMeta.Namespace, pod.ObjectMeta.Name, phase)
Member

These log lines still do not make sense inside this utility function. We can leave this for a follow-up PR, but the function should probably return an error, or just the bool, and let the caller decide whether to skip.

Contributor Author

@sawsa307 sawsa307 Apr 6, 2023

Agreed. The decision about skipping should be left to the caller. I'll do it in a separate PR.
I think we already define some errors, such as ErrEPNodeNotFound (the endpoint corresponds to a non-existent node), but here we are checking the node through the Pod.
If I want to consolidate these and use one error for both endpoints and pods, how should I phrase the error message?
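
A rough sketch of the follow-up discussed here: the validation helper returns an error and does no logging, and the caller decides whether to skip. Everything in it is hypothetical (the pod struct, checkPod, the sentinel errors and their wording, and the map used in place of a node lister); it does not use the real apiv1.Pod or cache types.

```go
package main

import (
	"errors"
	"fmt"
)

type podPhase string

const (
	podRunning   podPhase = "Running"
	podFailed    podPhase = "Failed"
	podSucceeded podPhase = "Succeeded"
)

// pod is a hypothetical, trimmed-down stand-in for apiv1.Pod.
type pod struct {
	Namespace, Name, NodeName string
	Phase                     podPhase
}

// Sentinel errors with illustrative wording; the real names may differ.
var (
	errPodTerminal  = errors.New("pod is in a terminal phase")
	errNodeNotFound = errors.New("node does not exist")
)

// checkPod reports why a pod is unusable, or nil if it is usable.
// It does not log; the caller chooses what to do with the error.
func checkPod(p pod, nodes map[string]bool) error {
	if p.Phase == podFailed || p.Phase == podSucceeded {
		return fmt.Errorf("%w: %s/%s has phase %s", errPodTerminal, p.Namespace, p.Name, p.Phase)
	}
	if !nodes[p.NodeName] {
		return fmt.Errorf("%w: %s/%s is on node %q", errNodeNotFound, p.Namespace, p.Name, p.NodeName)
	}
	return nil
}

func main() {
	nodes := map[string]bool{"node-a": true}
	pods := []pod{
		{Namespace: "default", Name: "web-0", NodeName: "node-a", Phase: podRunning},
		{Namespace: "default", Name: "job-1", NodeName: "node-a", Phase: podSucceeded},
		{Namespace: "default", Name: "web-1", NodeName: "node-b", Phase: podRunning},
	}
	for _, p := range pods {
		if err := checkPod(p, nodes); err != nil {
			// The caller, not the helper, decides to log and skip.
			fmt.Printf("skipping pod: %v\n", err)
			continue
		}
		fmt.Printf("using pod %s/%s\n", p.Namespace, p.Name)
	}
}
```

Wrapping a shared sentinel such as errNodeNotFound with %w is one possible way to reuse a single error for both the endpoint and the pod paths, with the wrapping message naming which object was being checked.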

sawsa307 added 3 commits April 6, 2023 10:08
consolidate service name and namespace since we would include
addPodsToLister() in tests for syncInternalImpl, and addPodsToLister
uses testServiceNamespace instead of TestNamespace, and change
testService to testServiceName to match the behavior in
getDefaultEndpointSlices().
@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch 2 times, most recently from 8b3ca95 to 59f53dc Compare April 6, 2023 21:44
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 6, 2023
@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch 4 times, most recently from 2f57f03 to c5c9ee8 Compare April 6, 2023 21:51
@sawsa307 sawsa307 force-pushed the refactor-syncInternalImpl branch from c5c9ee8 to 6450e61 Compare April 6, 2023 21:58
@sawsa307 sawsa307 requested a review from swetharepakula April 6, 2023 23:01
@swetharepakula
Member

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 6, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sawsa307, swetharepakula

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit 4c7c629 into kubernetes:master Apr 6, 2023
@sawsa307 sawsa307 deleted the refactor-syncInternalImpl branch September 2, 2023 20:43
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
4 participants