
Support for ExternalTrafficPolicy #451

Closed
akutz opened this issue Nov 8, 2020 · 11 comments
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@akutz

akutz commented Nov 8, 2020

What would you like to be added:

It would be nice if the Service API spec supported some version of the externalTrafficPolicy field from the existing services spec:

    // externalTrafficPolicy denotes if this Service desires to route external
    // traffic to node-local or cluster-wide endpoints. "Local" preserves the
    // client source IP and avoids a second hop for LoadBalancer and Nodeport
    // type services, but risks potentially imbalanced traffic spreading.
    // "Cluster" obscures the client source IP and may cause a second hop to
    // another node, but should have good overall load-spreading.
    // +optional
    ExternalTrafficPolicy ServiceExternalTrafficPolicyType `json:"externalTrafficPolicy,omitempty" protobuf:"bytes,11,opt,name=externalTrafficPolicy"`
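
For illustration only, here is a hypothetical sketch of how such a field might surface on a Service APIs Gateway; no such field exists today, and the apiVersion and field placement are only assumptions for the example:

    apiVersion: networking.x-k8s.io/v1alpha1
    kind: Gateway
    metadata:
      name: example-gateway
    spec:
      gatewayClassName: example-lb
      listeners:
      - protocol: TCP
        port: 80
      # hypothetical equivalent of the Service field quoted above
      externalTrafficPolicy: Local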

Why is this needed:

Because the existing Services resource supports it, and it makes sense to implement this in the Service APIs as well. The presence of a health check port (#97) alone would allow a back-end load balancer platform to determine whether or not an endpoint is healthy, i.e. whether a pod is running on a given node in the Kubernetes cluster. This would allow:

  • Load balancer algorithms to be more accurate, as only the nodes on which a pod is running would be considered valid endpoints
  • The preservation of the client's source IP, rather than having a node in the cluster forward traffic to another node where a pod is running

To understand how the load balancer would interact with this health check, here is an example:

  1. Create a Kind configuration file that specifies a cluster with one control plane node and two worker nodes:

    cat <<EOF >kind.conf
    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    nodes:
    - role: control-plane
    - role: worker
    - role: worker
    EOF
  2. Create a new Kind cluster:

    kind create cluster --config kind.conf --kubeconfig kube.conf
  3. Persist the IP address of each node in the cluster to a file:

    kubectl --kubeconfig=kube.conf get nodes -ojsonpath='{range .items[*]}{.status.addresses[0].address}{"\n"}{end}' >"node-addrs.txt"
  4. Deploy Nginx to the new cluster and provide a service for the application:

    cat <<EOF | kubectl --kubeconfig=kube.conf apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      namespace: default
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      selector:
        app: nginx
      ports:
      - port: 80
      externalTrafficPolicy: Local
      type: LoadBalancer
    EOF
  5. Get the port used to perform a health check on the service:

    while [ -z "${PORT}" ]; do \
      export PORT="$(kubectl --kubeconfig=kube.conf get service nginx -ojsonpath='{.spec.healthCheckNodePort}')"; \
    done
  6. Query the service's health port to determine the node on which the pod is running:

    for ip in $(cat "node-addrs.txt"); do \
      echo "checking node ${ip}" && \
      docker run --rm photon curl -sS "http://${ip}:${PORT}" && \
      printf '\n\n'; \
    done

    This will result in output similar to the following:

    checking node 172.17.0.4
    {
    	"service": {
    		"namespace": "default",
    		"name": "nginx"
    	},
    	"localEndpoints": 0
    }
    
    checking node 172.17.0.2
    {
    	"service": {
    		"namespace": "default",
    		"name": "nginx"
    	},
    	"localEndpoints": 1
    }
    
    checking node 172.17.0.3
    {
    	"service": {
    		"namespace": "default",
    		"name": "nginx"
    	},
    	"localEndpoints": 0
    }

In the above example, the load balancer platform would be able to set up an L7 health check and determine that only node 172.17.0.2 is valid, as that is where the pod is currently running.
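
As a rough sketch of how a load balancer controller or an operator could consume that health check to select valid endpoints, assuming jq is available on the host and reusing the node-addrs.txt file and PORT variable from the steps above:

    # Keep only nodes whose kube-proxy reports at least one local endpoint;
    # only those nodes are valid load balancer targets for the service.
    for ip in $(cat "node-addrs.txt"); do
      n="$(docker run --rm photon curl -sS "http://${ip}:${PORT}" | jq -r '.localEndpoints')"
      if [ "${n:-0}" -gt 0 ]; then
        echo "valid endpoint: ${ip}"
      fi
    done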

There are a few alternatives to supporting this:

  1. Dynamically update route endpoints -- this is not desired because there may be a delay between a pod moving between nodes and a controller updating the routes associated with a Gateway resource's listener, not to mention updating the backend load balancer platform to point to the new endpoints for a listener.

  2. Add all cluster node IP addresses -- this would work, but it's not nearly as efficient as supporting a distinct health check port, since it takes time for the load balancer to determine whether or not an endpoint is healthy based on dropped traffic, whereas a proper health check detects this right away.

Related to #97

@akutz akutz added the kind/feature label Nov 8, 2020
@maplain

maplain commented Nov 18, 2020

@akutz see #196 for service level configuration related discussion.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Feb 16, 2021
@robscott
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale label Feb 16, 2021
@robscott robscott added this to the v0.3.0 milestone Mar 24, 2021
@robscott robscott added the priority/important-longterm label Apr 14, 2021
@robscott robscott removed this from the v0.3.0 milestone Apr 23, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Jul 22, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Aug 21, 2021
@hbagdi
Contributor

hbagdi commented Aug 23, 2021

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Aug 23, 2021
@shaneutt
Member

shaneutt commented Aug 4, 2022

While grooming we saw that this one was open for a long period of time without anyone with a strong use case to champion it. We're going to close this as we don't expect anyone's ready to drive it forward, but if you still want this feature and have a strong use case we will be happy to reconsider this and re-open.

@shaneutt shaneutt closed this as not planned Aug 4, 2022
@jorhett

jorhett commented Sep 4, 2023

Please re-open this. Not acting on this forces us to patch the service after every deployment, which is an anti-pattern, as documented here: https://istio.io/latest/docs/tasks/security/authorization/authz-ingress/#network
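
For reference, the workaround that page describes amounts to an imperative patch along these lines (the service name and namespace below are the usual Istio ingress gateway defaults; adjust them for your environment):

    # Re-apply the traffic policy after every deployment of the gateway service.
    kubectl patch svc istio-ingressgateway -n istio-system \
      -p '{"spec":{"externalTrafficPolicy":"Local"}}'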

The Kubernetes model is to be declarative. Hack-patching imperatively after every apply is definitely not. Even using an admission controller to do it is WTF hacking. I don't see why this change is so objectionable.

#196 mentions this issue as being unsatisfactory, but it was closed without solving it.

@youngnick
Contributor

@jorhett, this issue is super old and we're not handling these concepts in the same way at all any more. Because of that, this issue won't be reopened as it stands.

However, if you would like us to handle a use case, I suggest opening an issue describing the outcome you want, and we can talk more there.

@jorhett

jorhett commented Sep 6, 2023

Can you explain what you're looking for? Because "allow me to set externalTrafficPolicy: local" is pretty clear and unambiguous. We require the original client IP, and this is the only way to get it.

In fact, every word in this issue description is still valid AFAICT, because it describes the exact same problem. I don't see how whatever has changed since then makes this request invalid, as it's still not possible now.

@youngnick
Contributor

Okay, your requirement is not externalTrafficPolicy: local then, it's "I want to be able to get the original client IP", presumably for layer 4 connections, since that's already possible for HTTP connections using X-Forwarded-For or similar mechanisms.

I agree that this is not currently possible, but trying to reopen a three-year-old issue that proposes a solution, not a problem, is not a productive way of engaging with the community.

The API is no longer called the "Service APIs" for a reason: we decided to make the constructs more general than Kubernetes Services alone. It's called the Gateway API now because we intended there to be implementations that don't use a Service to describe how Layer 4 traffic gets into a cluster. That's what Gateways and Listeners are for.

Now, is there space for a GEP (a Gateway Enhancement Proposal) that describes how a Gateway or Listener can request behavior similar to externalTrafficPolicy: Local on a LoadBalancer Service? Absolutely.
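
Purely as an illustration of the kind of knob such a GEP might explore, here is a hypothetical sketch; the trafficPolicy listener field below does not exist anywhere in the spec and is invented for this example:

    apiVersion: gateway.networking.k8s.io/v1beta1
    kind: Gateway
    metadata:
      name: example-gateway
    spec:
      gatewayClassName: example-lb
      listeners:
      - name: tcp
        protocol: TCP
        port: 80
        # hypothetical per-listener knob asking the implementation for
        # node-local, client-IP-preserving routing, akin to
        # externalTrafficPolicy: Local on a LoadBalancer Service
        trafficPolicy: Local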

But the way that the spec will be defined won't be what's listed in the initial comment here, because the constructs have totally changed in the three years since this issue was logged.

As I said earlier, if you really need this feature, logging a new issue is the way to go.

@kubernetes-sigs kubernetes-sigs locked as resolved and limited conversation to collaborators Sep 7, 2023