
Ingress Healthcheck Configuration #42

Closed · bowei opened this issue Oct 11, 2017 · 69 comments

Labels:
good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@bowei (Member) commented Oct 11, 2017

From @freehan on May 15, 2017 21:25

On GCE, the ingress controller sets up a default healthcheck for backends. The healthcheck points to the NodePort of the backend Service on every node. Currently, there is no way to describe a detailed healthcheck configuration in an Ingress. On the other hand, each application may want to handle healthchecks differently. To work around this limitation, on Ingress creation the ingress controller scans all backend pods and picks the first ReadinessProbe it encounters, then configures the healthcheck accordingly. However, the healthcheck is not updated if the ReadinessProbe is updated later. (Refer: kubernetes/ingress-nginx#582)
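
For illustration, this is the shape of ReadinessProbe the controller mirrors into the GCP health check on Ingress creation (the path and port below are example values only, not anything the controller requires):

        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080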

I see three options for going forward with healthchecks:

  1. Expand the Ingress or Service spec to include more healthcheck configuration. It should cover the capabilities provided by the major cloud providers (GCP, AWS, ...).

  2. Keep using the readiness probe for healthcheck configuration:
    a) Keep today's behavior and communicate the expectation clearly. However, this still breaks the abstraction and declarative nature of k8s.
    b) Let the ingress controller watch the backend pods for any updates to the ReadinessProbe. This seems expensive and complicated.

  3. Only set up a default healthcheck for Ingresses. The ingress controller would periodically ensure that the healthcheck exists, but would not care about its detailed configuration. Users can configure it directly through the cloud provider.

I am in favor of option 3. There are always more bells and whistles on different cloud providers, and the higher up the stack we go, the more features there are to expose. For an L7 LB there is no clean, simple way to describe every intention, and the same goes for healthchecks. To ensure a smooth experience, k8s still sets up the basics; for advanced use cases, users will have to configure it through the cloud provider.

Thoughts? @kubernetes/sig-network-misc

Copied from original issue: kubernetes/ingress-nginx#720

@bowei (Member Author) commented Oct 11, 2017

From @k8s-ci-robot on May 15, 2017 21:25

@freehan: These labels do not exist in this repository: sig/network.


@bowei (Member Author) commented Oct 11, 2017

From @tonglil on August 14, 2017 20:8

For option 3, is the "default healthcheck" hitting the "default-backend"?

@edevil commented Oct 18, 2017

I'm in favor of option 1.

I'm sure there is a minimum subset of features, common to all cloud providers, that would make sense to include in the Ingress spec, and it would improve the current situation a lot.

@tobsch commented Dec 20, 2017

+1

@tonglil (Contributor) commented Jan 10, 2018

For those in favor of option 1, please read this conversation: #28.

@hdave commented Jan 10, 2018

Reading that conversation -- I think configuring healthcheck via annotations would be great.

@jeremywadsack commented

@bowei I think that kubernetes-retired/contrib#325 is related to this?

@epsniff commented Feb 10, 2018

I ran into this as well and found this post: kubernetes/kubernetes#20555 (comment)

@hdave commented Mar 22, 2018

I would be in favor of option #1 if the ingress controller could configure an HTTPS health check to a backend service that uses a cert. If not, I would go with option #3 and just let our devops team manually tweak the health check without the ingress controller caring and resetting it.

@matti commented Jun 6, 2018

"on Ingress creation, ingress controller will scan all backend pods and pick the first ReadinessProbe it encounters and configure healthcheck accordingly"

I'm not seeing this; the health check always points to the default path "/", even with:

        readinessProbe:
          httpGet:
            path: /health
            port: 8080

@Gogoro commented Jun 7, 2018

I have the same issue as @matti. When I create an Ingress pointing to a Service, which in turn points to a pod, the health check just keeps hitting / instead of the path I defined in the readinessProbe and livenessProbe. I can see in the logs that the pod's own probes pass, but the load balancer health check keeps hammering /. :(

I've been trying for a while to find information on this topic, but I feel like it's not very well explained or documented. If anyone finds a solution, it would be much appreciated!

@matti commented Jun 7, 2018

@Gogoro thanks; I opened a new issue, since this issue is for the semantic discussion.

@nicksardo nicksardo added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 16, 2018
@nicksardo (Contributor) commented

Healthcheck configuration should be provided via BackendConfig CRD and the readiness probe approach should be deprecated and eventually removed.
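
A sketch of what such a BackendConfig could look like (illustrative only; the field names here follow the HealthCheckConfig API that eventually shipped and is shown in later comments, and the name, path and port are assumptions):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backend-config   # hypothetical name
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz   # example path
    port: 8080              # example port
    checkIntervalSec: 15
    timeoutSec: 5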

@fejta-bot commented

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 14, 2018
@justinsb (Member) commented

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 23, 2018
@rramkumar1 (Contributor) commented Oct 31, 2018

/help-wanted
/good-first-issue

@k8s-ci-robot (Contributor) commented

@rramkumar1:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

/help-wanted
/good-first-issue
/kind feature


taylorsilva added a commit to concourse/hush-house that referenced this issue Mar 14, 2020
This is an nginx server that does a redirect to a zoom room used by the
team for meetings like: standup, retro, everyone is wfh because covid-19

You may wonder: why do the redirect with an html page? You can do HTTP
301 redirect with just nginx redirect rules.

and you would be correct! But then other people's opinions get pushed
onto you and you have to do hacky things to get around it.

tldr: the ingress comes with a health checker that is not smart. See
kubernetes/ingress-gce#42 for more details.

Longer answer: The ingress used by GKE sets up a health check that ONLY
hits '/' on your pod. Because my initial nginx.conf just did 301
redirects, this health check was forever failing because it would never
get an HTTP 200. I tried to get around this by having nginx listen on
another port for health checking and always return HTTP 200 on that
port. I then had to expose both ports to the ingress controller. The
ingress controller decided to healthcheck BOTH ports that were exposed
on the container and send all traffic to port 9000 where it was getting
the HTTP 200's. At this point I threw my hands up and wrote some HTML to
do the redirect for me.

Signed-off-by: Taylor Silva <tsilva@pivotal.io>
@soichisumi commented Apr 25, 2020

Can HealthCheckConfig (added in this PR) configure the L7 LB to pass health checks for gRPC applications?

@bowei (Member Author) commented Apr 25, 2020

Hi everyone, please take a look at #1029, which implements the healthcheck overrides on the BackendConfig for a Service.

@mofirouz commented

@bowei thanks a lot for your PR. I think this should work well with gRPC applications, since we can use a custom path for healthchecks.

Can you tell us which versions of GKE this addition will be available in?

@naseemkullah commented

Hi everyone, please take a look at #1029 which implements the healthchecking overrides on the backendconfig for a service.

Thanks @bowei! But are GCE-Ingress health check overrides available today? I'm not seeing any related docs in https://cloud.google.com/kubernetes-engine/docs/concepts/backendconfig

How can we keep track of whether it is available in a given version of GKE (e.g. the stable channel)?

@dustinmoris commented

Can someone post here what the recommended fix is? What does a BackendConfig with a health check look like? What is the min Kubernetes version where this feature is supported?

@bowei (Member Author) commented May 5, 2020

the docs will be updated very soon -- @spencerhance

@jbg commented May 28, 2020

An example BackendConfig using this feature:

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backend-config
spec:
  healthCheck:
    checkIntervalSec: 20
    timeoutSec: 1
    healthyThreshold: 1
    unhealthyThreshold: 3
    type: TCP
    # defaults to serving port
    # port:
    # only for HTTP/HTTPS type
    # path:

ref: https://godoc.org/k8s.io/ingress-gce/pkg/apis/backendconfig/v1#HealthCheckConfig

It doesn't appear to work though (health check is still created as type HTTP and path "/" regardless of what I configure in the BackendConfig). Is there any GKE release that supports this yet?

@mofirouz commented

I've tried the following setup, sadly with no success:

My app (nakama) exposes two ports - 7110 for gRPC (HTTP/2) and 7111 for gRPC-gateway (HTTP 1.1 with / used for LB healthchecks).

This is running on a GKE instance version 1.16.8-gke.10.

kind: Service
apiVersion: v1
metadata:
  name: nakama3
  namespace: heroiclabs
  labels:
    project: heroiclabs
  annotations:
    cloud.google.com/app-protocols: '{"nakama-api":"HTTP","nakama-grpc-api":"HTTP2"}'
    beta.cloud.google.com/backend-config: >-
      {"ports":{"nakama-api":"backendconfig","nakama-grpc-api":"backendconfig"}, "default": "backendconfig"}
    cloud.google.com/neg: '{"ingress": true}'
spec:
  ports:
    - name: nakama-grpc-api
      protocol: TCP
      port: 7110
      targetPort: 7110
    - name: nakama-api
      protocol: TCP
      port: 7111
      targetPort: 7111
  selector:
    app: nakama
  type: NodePort

backendconfig.yml

kind: BackendConfig
apiVersion: cloud.google.com/v1
metadata:
  labels:
    project: heroiclabs
  name: backendconfig
  namespace: heroiclabs
spec:
  connectionDraining:
    drainingTimeoutSec: 5
  logging:
    sampleRate: 0.0
  timeoutSec: 86400
  healthCheck:
    port: 7111
    checkIntervalSec: 10

ingress.yml

kind: Ingress
apiVersion: extensions/v1beta1
metadata:
  name: gundam3
  namespace: heroiclabs
  labels:
    project: heroiclabs
  annotations:
    ingress.gcp.kubernetes.io/pre-shared-cert: heroiclabs
    kubernetes.io/ingress.allow-http: 'false'
spec:
  backend:
    serviceName: gundam3
    servicePort: 7110

The result is the following:

[screenshots: lb1, lb2]


Apologies for tagging you individually guys, but @bowei or @rramkumar1 can you shed some light on what might be going wrong here?

@bowei (Member Author) commented Jun 17, 2020

cc: @spencerhance

@spencerhance (Contributor) commented

This feature has not rolled out yet, but will be available in the next release (1.17.6-gke.7) next week, with the exception of port configuration. That will require a bug fix that should roll out a few weeks after. Additionally, this feature won't be available in 1.16 clusters until about a month after it has been released in 1.17.

@mofirouz commented

Thanks @spencerhance and @bowei for the prompt response - I'll keep a look out for that.

@ahmetgeymen commented Jun 18, 2020

...
readinessProbe:
  httpGet:
    path: /actuator/health
    port: 9090
  initialDelaySeconds: 5
...

I can get a healthy ingress for around 10 minutes by manually setting a custom health check (pointing at the node port that corresponds to the readinessProbe port) on the backend service that is automatically created for the load balancer when the Ingress is created. After about 10 minutes, the health check reverts to the default node port and I get 502s again on the external IP.

(Referring to the same config field shown in @mofirouz's screenshot above.)

https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#ingress_for_external_and_internal_traffic

Unfortunately they warn against updating the load balancer config manually. But without doing anything manually, I cannot reach my service at the load balancer's assigned IP. By the way, I haven't tried updating the actuator's health endpoint to / yet.


Edit

As mentioned here, I tried giving the readinessProbe exactly the same port as the container port:

...
ports:
- containerPort: 8080
  name: actuator
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: actuator
...

Then creating the Ingress again makes the load balancer's health check look for the new path instead of /. It is important that the readinessProbe uses the same port as the container port in order to keep the ingress healthy. Maybe the upcoming healthcheck feature of Ingress will handle additional port configuration.
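
For completeness, a sketch of how the corresponding Service might be wired up under these assumptions (the name, selector and port values below are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: my-app            # hypothetical name
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      protocol: TCP
      targetPort: actuator   # resolves to the same containerPort (8080) that the readinessProbe targets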

@bowei (Member Author) commented Jun 18, 2020

@ahmetgeymen -- hopefully once the healthcheck feature is available, the need to edit the healthcheck settings manually will go away. Let us know if anything remains that makes custom configuration necessary.

@Gatsby-Lee commented Jun 30, 2020

This seems to be available in beta. I am on 1.17.6-gke.11.

I didn't need any change to the Ingress.

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-default
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 3
    requestPath: /healthz
---
apiVersion: v1
kind: Service
metadata:
  name: test-healthz
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    beta.cloud.google.com/backend-config: '{"default": "config-default"}'
spec:
  ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    app: test-healthz
  type: NodePort

@gnarea commented Jun 30, 2020

I'm using v1.17.6-gke.7 but can't get this to work with gRPC. I basically want to use TCP (not HTTP/2) health checks, because HTTP/2 health checks don't work at all with gRPC.

Here are the resources I have:

$ kubectl get svc gw-test-relaynet-internet-gateway-cogrpc -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    beta.cloud.google.com/backend-config: '{"ports":{"grpc":"cogrpc"}, "default":
      "cogrpc"}'
    cloud.google.com/app-protocols: '{"grpc":"HTTP2"}'
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"8081":"k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572"},"zones":["europe-west2-a"]}'
    meta.helm.sh/release-name: gw-test
    meta.helm.sh/release-namespace: default
    service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2"}'
  creationTimestamp: "2020-06-29T14:21:27Z"
  labels:
    app.kubernetes.io/instance: gw-test
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
    app.kubernetes.io/version: 1.3.9
    helm.sh/chart: relaynet-internet-gateway-0.1.0
  name: gw-test-relaynet-internet-gateway-cogrpc
  namespace: default
  resourceVersion: "606508"
  selfLink: /api/v1/namespaces/default/services/gw-test-relaynet-internet-gateway-cogrpc
  uid: 9e28d795-f1ee-49ab-a307-1424b016d46a
spec:
  clusterIP: 10.16.5.230
  externalTrafficPolicy: Cluster
  ports:
  - name: grpc
    nodePort: 32112
    port: 8081
    protocol: TCP
    targetPort: grpc
  selector:
    app.kubernetes.io/instance: gw-test
    app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
  sessionAffinity: None
  type: NodePort

$ kubectl get backendconfig cogrpc -o yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  annotations:
  creationTimestamp: "2020-06-30T16:50:36Z"
  generation: 2
  labels:
    project: public-gw
  name: cogrpc
  namespace: default
  resourceVersion: "608964"
  selfLink: /apis/cloud.google.com/v1/namespaces/default/backendconfigs/cogrpc
  uid: 9387cc5d-6710-4c7e-99ed-dc78f124da5f
spec:
  healthCheck:
    checkIntervalSec: 20
    healthyThreshold: 1
    port: 8081
    timeoutSec: 1
    type: TCP
    unhealthyThreshold: 3

The old HTTP2 healthcheck is still used. In fact, I can't see the healthcheck that should've been created by the BackendConfig above (which, according to kubectl, is in the right cluster and namespace) -- I can see it in k8s but not GCP (I'm sure the project label is set to the right value).

Any idea what I'm doing wrong?

@Gatsby-Lee commented

@gnarea
I don't know much about gRPC; maybe you can get some ideas from here: https://cloud.google.com/compute/docs/reference/rest/v1/healthChecks

@gnarea commented Jul 1, 2020

Thanks @Gatsby-Lee! I think that link helps with the structure of the data in .spec, but I don't think that's the problem here. According to kubectl describe backendconfig cogrpc, the values I specified seem to have been accepted as valid. The problem is that the LB isn't using my custom health check. In fact, I can't find that health check on GCP, even though I can still see my BackendConfig resource in the cluster, which I guess is why the LB can't use it.

I also gave up on the TCP probe because I couldn't get it to work and, even if I eventually did, it would be far too unreliable for a health check. Instead, per the suggestion of a GCP support agent, I've now created a new container in the pod: an HTTP app with a single endpoint that in turn pings my gRPC service using the gRPC health check protocol. This approach is basically option 3 in https://kubernetes.io/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/ (except that I'm doing HTTP probes instead of exec probes to make the GCP LB health checks work).

To sum up, I want to configure the gRPC backend in the LB in such a way that the health check points to the HTTP proxy container but the actual traffic goes to the gRPC container. This is the kind of thing the fix in this issue would enable, right? If so, how can I configure the BackendConfig, Service and potentially the Ingress to achieve this?

Here's my current service, backendconfig and deployment in case that's useful:

$ kubectl get svc gw-test-relaynet-internet-gateway-cogrpc -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/app-protocols: '{"grpc":"HTTP2"}'
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/neg-status: '{"network_endpoint_groups":{"8081":"k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572"},"zones":["europe-west2-a"]}'
    meta.helm.sh/release-name: gw-test
    meta.helm.sh/release-namespace: default
    service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2"}'
  creationTimestamp: "2020-06-29T14:21:27Z"
  labels:
    app.kubernetes.io/instance: gw-test
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
    app.kubernetes.io/version: 1.3.10
    helm.sh/chart: relaynet-internet-gateway-0.1.0
  name: gw-test-relaynet-internet-gateway-cogrpc
  namespace: default
  resourceVersion: "989264"
  selfLink: /api/v1/namespaces/default/services/gw-test-relaynet-internet-gateway-cogrpc
  uid: 9e28d795-f1ee-49ab-a307-1424b016d46a
spec:
  clusterIP: 10.16.5.230
  externalTrafficPolicy: Cluster
  ports:
  - name: grpc
    nodePort: 32112
    port: 8081
    protocol: TCP
    targetPort: grpc
  - name: health-check
    nodePort: 32758
    port: 8082
    protocol: TCP
    targetPort: health-check
  selector:
    app.kubernetes.io/instance: gw-test
    app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

$ kubectl get backendconfig cogrpc -o yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  annotations:
  creationTimestamp: "2020-07-01T10:41:21Z"
  generation: 1
  labels:
    project: public-gw
  name: cogrpc
  namespace: default
  resourceVersion: "1026811"
  selfLink: /apis/cloud.google.com/v1/namespaces/default/backendconfigs/cogrpc
  uid: 57c6219f-91ee-4e45-9576-a817c958dc3c
spec:
  healthCheck:
    checkIntervalSec: 20
    healthyThreshold: 1
    port: 8082
    timeoutSec: 1
    type: HTTP
    unhealthyThreshold: 3

$ kubectl get deploy gw-test-relaynet-internet-gateway-cogrpc -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "15"
    meta.helm.sh/release-name: gw-test
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2020-06-29T14:21:27Z"
  generation: 15
  labels:
    app.kubernetes.io/instance: gw-test
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
    app.kubernetes.io/version: 1.3.10
    helm.sh/chart: relaynet-internet-gateway-0.1.0
  name: gw-test-relaynet-internet-gateway-cogrpc
  namespace: default
  resourceVersion: "1012358"
  selfLink: /apis/apps/v1/namespaces/default/deployments/gw-test-relaynet-internet-gateway-cogrpc
  uid: f8c67784-e2ed-4f56-89cc-7763f1adf059
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: gw-test
      app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: gw-test
        app.kubernetes.io/name: relaynet-internet-gateway-cogrpc
    spec:
      containers:
      - command:
        - node
        - build/main/bin/cogrpc-server.js
        env:
        - name: COGRPC_ADDRESS
          value: https://cogrpc-test.relaycorp.tech
        envFrom:
        - configMapRef:
            name: gw-test-relaynet-internet-gateway
        - configMapRef:
            name: gw-test-relaynet-internet-gateway-cogrpc
        - secretRef:
            name: gw-test-relaynet-internet-gateway
        image: quay.io/relaycorp/relaynet-internet-gateway:v1.3.10
        imagePullPolicy: IfNotPresent
        name: cogrpc
        ports:
        - containerPort: 8080
          name: grpc
          protocol: TCP
        resources: {}
        securityContext: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - command:
        - /bin/grpc_health_proxy
        - -http-listen-addr
        - 0.0.0.0:8081
        - -grpcaddr
        - 127.0.0.1:8080
        - -service-name
        - CargoRelay
        - -v
        - "10"
        image: salrashid123/grpc_health_proxy:1.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: health-check
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: cogrpc-health-check
        ports:
        - containerPort: 8081
          name: health-check
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: health-check
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: gw-test-relaynet-internet-gateway-cogrpc
      serviceAccountName: gw-test-relaynet-internet-gateway-cogrpc
      shareProcessNamespace: true
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2020-07-01T09:48:12Z"
    lastUpdateTime: "2020-07-01T09:48:12Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2020-07-01T09:46:21Z"
    lastUpdateTime: "2020-07-01T10:04:47Z"
    message: ReplicaSet "gw-test-relaynet-internet-gateway-cogrpc-5c4f6d9cf6" has
      successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 15
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

All pods are running properly as you can see in the deployment status.

And as you'll see below, the health check for this backend is connecting to the gRPC service (port 32112) over HTTP2 instead of the HTTP proxy (node port 32758 according to kubectl describe svc gw-test-relaynet-internet-gateway-cogrpc) over HTTP:

$ gcloud compute backend-services describe --global --project public-gw k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572 
affinityCookieTtlSec: 0
backends:
- balancingMode: RATE
  capacityScaler: 1.0
  group: https://www.googleapis.com/compute/v1/projects/public-gw/zones/europe-west2-a/networkEndpointGroups/k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572
  maxRatePerEndpoint: 1.0
connectionDraining:
  drainingTimeoutSec: 0
creationTimestamp: '2020-06-29T08:43:45.264-07:00'
description: '{"kubernetes.io/service-name":"default/gw-test-relaynet-internet-gateway-cogrpc","kubernetes.io/service-port":"grpc","x-features":["HTTP2","NEG"]}'
enableCDN: false
fingerprint: tVJFuu8kHHM=
healthChecks:
- https://www.googleapis.com/compute/v1/projects/public-gw/global/healthChecks/k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572
id: '9026318342467072734'
kind: compute#backendService
loadBalancingScheme: EXTERNAL
logConfig:
  enable: true
name: k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572
port: 32112
portName: port32112
protocol: HTTP2
selfLink: https://www.googleapis.com/compute/v1/projects/public-gw/global/backendServices/k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572
sessionAffinity: NONE
timeoutSec: 30

$ gcloud compute health-checks describe --global --project public-gw k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572 
checkIntervalSec: 15
creationTimestamp: '2020-06-29T08:43:42.958-07:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
http2HealthCheck:
  portSpecification: USE_SERVING_PORT
  proxyHeader: NONE
id: '327529895999025857'
kind: compute#healthCheck
name: k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572
selfLink: https://www.googleapis.com/compute/v1/projects/public-gw/global/healthChecks/k8s1-4ddb902c-defaul-gw-test-relaynet-internet-gate-80-01fab572
timeoutSec: 15
type: HTTP2
unhealthyThreshold: 2

@Gatsby-Lee commented

I was wrong. I don't think setting the healthcheck through BackendConfig works.
The Ingress can use a custom healthcheck only if the Service exists before the Ingress knows about the Service.

And even if the custom healthcheck works when the Ingress is brought up after the Service, the custom healthcheck stops working once a new Pod is deployed.

This is what @nicksardo explained before in a different thread.

BTW, the Ingress doesn't have to be removed to use the custom healthcheck: if the Service info is removed from the Ingress and added back again, the Ingress uses the custom healthcheck again. (But we can't do this in prod.) lol

@dpkirchner commented

This feature has not rolled out yet, but will be available in the next release (1.17.6-gke.7) next week, with the exception of port configuration. That will require a bug fix that should roll out a few weeks after. Additionally, this feature won't be available in 1.16 clusters until about a month after it has been released in 1.17.

May I suggest/request adding this to the Feature Comparison table at https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-features? I see custom health checks are marked as being in beta (for Internal Ingress [1]), but there's no version number, so it's not clear how to "activate" the beta feature.

[1] I assume this includes Ingresses used for IAP, given that the healthCheck attribute has no effect in BackendConfigs used there.

@christopherdbull commented

Is there any reason why we can't set the Host header with the BackendConfig CRD?

@ok-ikaros commented

Hey guys,

Currently I have a service that publishes port 12345 as the service port. I want this to be the port that the Ingress routes traffic to, because it's the port that listens for websocket messages. I'm making a game, by the way, if that adds some color.

The Ingress spec looks like this:

spec:
  defaultBackend:
    service:
      name: other-backend-service
      port:
        number: 7350
  rules:
  - http:
      paths:
      - backend:
          service:
            name: my-server-service
            port:
              number: 12345
        path: /my-server
        pathType: ImplementationSpecific

The service listens for websocket connections on port 12345, but I've also spun up a small Python server that runs in the same pod, experimentally, for the sole purpose of passing the health check. It listens for HTTP requests on / on port 80 and returns 200 OK. I've gotten it to pass the health check, but only when I publish that port as the service port. When I do this, traffic fails to route to the other port listening for websocket traffic, since they are on the same path.

Is there a way for me to configure the health check so that it checks against port 80 without having to publish it as the service port on the same path as the port I want to route traffic to?

I really want to stay in the GCE ingress ecosystem so I would love to get this to work if possible.
Thank you so much

@swetharepakula (Member) commented

@lelandhwu, can you please create a new issue with the same content as #42 (comment)? This seems like a separate issue from healthcheck configuration.

The remaining ask on this issue is to add a Host header to the BackendConfig. Please open a different issue if this FR is still desired.

Since this issue is specific to custom healthchecks, we are closing it out. Custom healthchecks are configurable through the BackendConfig CRD: #42 (comment).
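
For reference, a minimal shape of that configuration, mirroring the working examples earlier in this thread (names, ports and values are illustrative):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-default
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 3
    requestPath: /healthz
---
apiVersion: v1
kind: Service
metadata:
  name: my-service              # hypothetical name
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    beta.cloud.google.com/backend-config: '{"default": "config-default"}'
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080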
