Cluster Local KService generates ExternalIP Ingress #7233

Closed
yanniszark opened this issue Mar 12, 2020 · 16 comments
Labels: area/networking, kind/bug

Comments

@yanniszark

In what area(s)?

/area networking

What version of Knative?

0.11.x

More specifically: https://github.com/kubeflow/manifests/tree/v1.0-branch/knative/knative-serving-install/base

Expected Behavior

I have a cluster-local-gateway in the istio-system namespace.

I have also edited config-istio to look like this:

data:
  local-gateway.knative-serving.cluster-local-gateway: cluster-local-gateway.istio-system.svc.cluster.local
  reconcileExternalGateway: "false"

And config-domain:

data:
  svc.cluster.local: ""

Then, I create the following Knative Service:

kind: Service 
metadata: 
  annotations: 
    serving.knative.dev/creator: system:serviceaccount:kubeflow:default 
    serving.knative.dev/lastModifier: system:serviceaccount:kubeflow:default 
  creationTimestamp: "2020-03-12T19:04:42Z" 
  generation: 1 
  name: flowers-sample-pvc-predictor-default 
  namespace: default 
  ownerReferences: 
  - apiVersion: serving.kubeflow.org/v1alpha2 
    blockOwnerDeletion: true 
    controller: true 
    kind: InferenceService 
    name: flowers-sample-pvc 
    uid: 54825d51-6494-11ea-9209-42010a80002d 
  resourceVersion: "1938917" 
  selfLink: /apis/serving.knative.dev/v1/namespaces/default/services/flowers-sample-pvc-predictor-default 
  uid: 548cd184-6494-11ea-9209-42010a80002d 
spec: 
  template: 
    metadata: 
      annotations: 
        autoscaling.knative.dev/class: kpa.autoscaling.knative.dev 
        autoscaling.knative.dev/target: "1" 
        internal.serving.kubeflow.org/storage-initializer-sourceuri: pvc://kfserving-pvc/flowers 
        queue.sidecar.serving.knative.dev/resourcePercentage: "0.2" 
      creationTimestamp: null 
      labels: 
        serving.kubeflow.org/inferenceservice: flowers-sample-pvc 
    spec: 
      containerConcurrency: 0 
      containers: 
      - args: 
        - --port=9000 
        - --rest_api_port=8080 
        - --model_name=flowers-sample-pvc 
        - --model_base_path=/mnt/models 
        command: 
        - /usr/bin/tensorflow_model_server 
        image: tensorflow/serving:1.14.0 
        name: kfserving-container 
        readinessProbe: 
          successThreshold: 1 
          tcpSocket: 
            port: 0 
        resources: 
          limits: 
            cpu: "1" 
            memory: 2Gi 
          requests: 
            cpu: "1" 
            memory: 2Gi 
      timeoutSeconds: 60 
  traffic: 
  - latestRevision: true 
    percent: 100 

I expect the Knative Service to be exposed only via the cluster-local-gateway.

Actual Behavior

I get IngressNotConfigured:

$ kubectl get services.serving.knative.dev

NAME                                   URL                                                                     LATESTCREATED                                LATESTREADY                                  READY     REASON
flowers-sample-pvc-predictor-default   http://flowers-sample-pvc-predictor-default.default.svc.cluster.local   flowers-sample-pvc-predictor-default-dz2vl   flowers-sample-pvc-predictor-default-dz2vl   Unknown   IngressNotConfigured

The logs of the networking-istio deployment say:

{"level":"info","ts":"2020-03-12T19:26:51.368Z","logger":"istiocontroller.ingress-controller.event-broadcaster","caller":"record/event.go:258","msg":"Event(v1.ObjectReference{Kind:\"Ingress\", Namespace:\"default\", Name:\"flowers-sample-pvc-predictor-default\", UID:\"5f369b51-6494-11ea-9209-42010a80002d\", APIVersion:\"networking.internal.knative.dev/v1alpha1\", ResourceVersion:\"1938894\", FieldPath:\"\"}): type: 'Warning' reason: 'InternalError' failed to probe Ingress default/flowers-sample-pvc-predictor-default: failed to get Gateway \"knative-serving/knative-ingress-gateway\": gateway.networking.istio.io \"knative-ingress-gateway\" not found","commit":"6b0e5c6","knative.dev/controller":"ingress-controller"} 

In addition, I see VirtualServices created that point to a non-existent Gateway:

$ kubectl get vs

NAME                                        GATEWAYS                                                                          HOSTS                                                                                                                                                            AGE
flowers-sample-pvc-predictor-default        [knative-serving/cluster-local-gateway knative-serving/knative-ingress-gateway]   [flowers-sample-pvc-predictor-default.default flowers-sample-pvc-predictor-default.default.svc flowers-sample-pvc-predictor-default.default.svc.cluster.local]   36m
flowers-sample-pvc-predictor-default-mesh   [mesh]                                                                            [flowers-sample-pvc-predictor-default.default flowers-sample-pvc-predictor-default.default.svc flowers-sample-pvc-predictor-default.default.svc.cluster.local]   36m

I would expect the Knative Service to be exposed only on the cluster-local-gateway.

Steps to Reproduce the Problem

Start a Knative Service with the configuration (cluster-local-gateway, config-istio, config-domain) provided above.

@yanniszark added the kind/bug label Mar 12, 2020
@Uvindu96

Hi, could you please let me know which Istio version you are using? I don't have a cluster-local-gateway service on Knative v0.13.

@yanniszark
Author

Hi @Uvindu96!
I am using Istio 1.3.1 with SDS enabled.
More specifically, this kustomization: https://github.com/kubeflow/manifests/tree/v1.0-branch/istio-1-3-1/istio-install-1-3-1/base

@yanniszark
Author

Summarizing @tcnghia 's triaging and debugging in this Slack thread: https://knative.slack.com/archives/CA9RHBGJX/p1584101161136200

Here is the complete YAML of the created VirtualServices:

apiVersion: v1
items:
- apiVersion: networking.istio.io/v1alpha3
  kind: VirtualService
  metadata:
    annotations:
      networking.knative.dev/ingress.class: istio.ingress.networking.knative.dev
      serving.knative.dev/creator: system:serviceaccount:kubeflow:default
      serving.knative.dev/lastModifier: system:serviceaccount:kubeflow:default
    creationTimestamp: "2020-03-12T19:05:00Z"
    generation: 1
    labels:
      serving.knative.dev/route: flowers-sample-pvc-predictor-default
      serving.knative.dev/routeNamespace: default
    name: flowers-sample-pvc-predictor-default
    namespace: default
    ownerReferences:
    - apiVersion: networking.internal.knative.dev/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: Ingress
      name: flowers-sample-pvc-predictor-default
      uid: 5f369b51-6494-11ea-9209-42010a80002d
    resourceVersion: "1938889"
    selfLink: /apis/networking.istio.io/v1alpha3/namespaces/default/virtualservices/flowers-sample-pvc-predictor-default
    uid: 5f3ac797-6494-11ea-9209-42010a80002d
  spec:
    gateways:
    - knative-serving/cluster-local-gateway
    - knative-serving/knative-ingress-gateway
    hosts:
    - flowers-sample-pvc-predictor-default.default
    - flowers-sample-pvc-predictor-default.default.svc
    - flowers-sample-pvc-predictor-default.default.svc.cluster.local
    http:
    - headers:
        request:
          add:
            K-Network-Hash: cd6e79ef2641adf5255f319d214db248
      match:
      - authority:
          prefix: flowers-sample-pvc-predictor-default.default
        gateways:
        - knative-serving/cluster-local-gateway
      retries:
        attempts: 3
        perTryTimeout: 600s
      route:
      - destination:
          host: flowers-sample-pvc-predictor-default-dz2vl.default.svc.cluster.local
          port:
            number: 80
        headers:
          request:
            add:
              Knative-Serving-Namespace: default
              Knative-Serving-Revision: flowers-sample-pvc-predictor-default-dz2vl
        weight: 100
      timeout: 600s
      websocketUpgrade: true
- apiVersion: networking.istio.io/v1alpha3
  kind: VirtualService
  metadata:
    annotations:
      networking.knative.dev/ingress.class: istio.ingress.networking.knative.dev
      serving.knative.dev/creator: system:serviceaccount:kubeflow:default
      serving.knative.dev/lastModifier: system:serviceaccount:kubeflow:default
    creationTimestamp: "2020-03-12T19:05:00Z"
    generation: 1
    labels:
      serving.knative.dev/route: flowers-sample-pvc-predictor-default
      serving.knative.dev/routeNamespace: default
    name: flowers-sample-pvc-predictor-default-mesh
    namespace: default
    ownerReferences:
    - apiVersion: networking.internal.knative.dev/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: Ingress
      name: flowers-sample-pvc-predictor-default
      uid: 5f369b51-6494-11ea-9209-42010a80002d
    resourceVersion: "1938887"
    selfLink: /apis/networking.istio.io/v1alpha3/namespaces/default/virtualservices/flowers-sample-pvc-predictor-default-mesh
    uid: 5f3899a9-6494-11ea-9209-42010a80002d
  spec:
    gateways:
    - mesh
    hosts:
    - flowers-sample-pvc-predictor-default.default
    - flowers-sample-pvc-predictor-default.default.svc
    - flowers-sample-pvc-predictor-default.default.svc.cluster.local
    http:
    - headers:
        request:
          add:
            K-Network-Hash: cd6e79ef2641adf5255f319d214db248
      match:
      - authority:
          prefix: flowers-sample-pvc-predictor-default.default
        gateways:
        - mesh
      retries:
        attempts: 3
        perTryTimeout: 600s
      route:
      - destination:
          host: flowers-sample-pvc-predictor-default-dz2vl.default.svc.cluster.local
          port:
            number: 80
        headers:
          request:
            add:
              Knative-Serving-Namespace: default
              Knative-Serving-Revision: flowers-sample-pvc-predictor-default-dz2vl
        weight: 100
      timeout: 600s
      websocketUpgrade: true
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

As you can see from the first VirtualService, spec.gateways lists the knative-serving/knative-ingress-gateway gateway, but the only HTTP route then excludes it, because its match is restricted to a single gateway:

http:
- match:
  - gateways:
    - knative-serving/cluster-local-gateway

So no traffic passes through the knative-serving/knative-ingress-gateway gateway after all.
However, the reference to it still causes an error in the networking-istio controller, which tries to probe a Gateway that does not exist.
From its logs, we see:

{"level":"info","ts":"2020-03-13T15:25:19.150Z","logger":"istiocontroller.ingress-controller.event-broadcaster","caller":"record/event.go:258","msg":"Event(v1.ObjectReference{Kind:\"Ingress\", Namespace:\"default\", Name:\"flowers-sample-pvc-predictor-default\", UID:\"5f369b51-6494-11ea-9209-42010a80002d\", APIVersion:\"networking.internal.knative.dev/v1alpha1\", ResourceVersion:\"1938894\", FieldPath:\"\"}): type: 'Warning' reason: 'InternalError' failed to probe Ingress default/flowers-sample-pvc-predictor-default: failed to get Gateway \"knative-serving/knative-ingress-gateway\": gateway.networking.istio.io \"knative-ingress-gateway\" not found","commit":"6b0e5c6","knative.dev/controller":"ingress-controller"} 

A fix for this issue should make sure that the public gateway is not mentioned when handling a cluster-local service.
Thanks to @tcnghia for the triaging and debugging of this.
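
The reasoning above (a gateway listed in spec.gateways but never named by any route match) can be checked mechanically. The following is a minimal sketch, not part of Knative or Istio: the helper names `effective_gateways` and `unused_gateways` are made up for illustration, and the input is a trimmed copy of the first VirtualService as a plain Python dict. It assumes the Istio rule that a match's own `gateways` list overrides the top-level one, and that a match without `gateways` applies to all declared gateways.

```python
def effective_gateways(virtual_service):
    """Return the gateways that at least one HTTP route applies to."""
    spec = virtual_service["spec"]
    # With no top-level list, Istio falls back to the reserved "mesh" gateway.
    declared = set(spec.get("gateways") or ["mesh"])
    used = set()
    for route in spec.get("http", []):
        # A route without a match block applies to every declared gateway.
        matches = route.get("match") or [{}]
        for match in matches:
            # A match without its own gateways uses the declared ones.
            used |= set(match.get("gateways") or declared)
    return used


def unused_gateways(virtual_service):
    """Return declared gateways that no HTTP route applies to."""
    declared = set(virtual_service["spec"].get("gateways") or ["mesh"])
    return declared - effective_gateways(virtual_service)


# Trimmed copy of the first VirtualService from the YAML above.
vs = {
    "spec": {
        "gateways": [
            "knative-serving/cluster-local-gateway",
            "knative-serving/knative-ingress-gateway",
        ],
        "http": [
            {"match": [{"gateways": ["knative-serving/cluster-local-gateway"]}]}
        ],
    }
}

print(sorted(unused_gateways(vs)))
# prints ['knative-serving/knative-ingress-gateway']
```

The public gateway shows up as declared but unused, which is exactly the reference the controller later fails to resolve.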

In the meantime, one can create a dummy knative-serving/knative-ingress-gateway Gateway (the one the controller reports as not found) to work around this issue.
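
A rough sketch of what such a placeholder could look like, generated as JSON (which `kubectl apply -f -` accepts) for the Gateway that the controller log reports as missing, knative-serving/knative-ingress-gateway. The helper name `placeholder_gateway` is hypothetical. The selector and server values are assumptions: the thread does not say which values make the workaround succeed, and this has not been tested against a cluster.

```python
import json


def placeholder_gateway(name, namespace):
    """Build an Istio Gateway manifest that exists only so lookups by name succeed."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "Gateway",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ASSUMPTION: the label commonly found on Istio ingress gateway
            # pods; adjust it to whatever the cluster actually uses.
            "selector": {"istio": "ingressgateway"},
            # ASSUMPTION: a single plain-HTTP server; these values are
            # placeholders, not taken from the thread.
            "servers": [
                {
                    "port": {"number": 80, "name": "http", "protocol": "HTTP"},
                    "hosts": ["*"],
                }
            ],
        },
    }


manifest = placeholder_gateway("knative-ingress-gateway", "knative-serving")
print(json.dumps(manifest, indent=2))
```

The output could then be piped into `kubectl apply -f -`. Since the route match in the generated VirtualService only names the cluster-local gateway, no KService traffic should be routed through this placeholder.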

@yanniszark
Author

While looking at the Ingress created for the given KService, I noticed something weird:

  rules: 
  - hosts: 
    - flowers-sample-pvc-predictor-default.default.svc.cluster.local 
    http: 
      paths: 
      - retries: 
          attempts: 3 
          perTryTimeout: 10m0s 
        splits: 
        - appendHeaders: 
            Knative-Serving-Namespace: default 
            Knative-Serving-Revision: flowers-sample-pvc-predictor-default-zhdws 
          percent: 100 
          serviceName: flowers-sample-pvc-predictor-default-zhdws 
          serviceNamespace: default 
          servicePort: 80 
        timeout: 10m0s 
    visibility: ExternalIP 
  visibility: ExternalIP 

Visibility is ExternalIP while it should be ClusterLocal.
This is a separate bug from the one mentioned above.
All in all, there are two bugs:

  1. The public ingress gateway is added for cluster-local services; it should not be.
  2. Cluster-local service detection: the Ingress is generated with ExternalIP visibility instead of ClusterLocal.
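
To spot the second bug on other services, the visibility fields of an Ingress can be compared against the expected value. This is a minimal sketch over a plain dict (for example, the parsed output of `kubectl get ... -o json`). The helper name `visibility_mismatches` is hypothetical, and it assumes the two visibility lines in the excerpt above are the per-rule field and the top-level spec field.

```python
def visibility_mismatches(ingress, expected="ClusterLocal"):
    """List the visibility fields of an Ingress that differ from `expected`."""
    spec = ingress.get("spec", {})
    mismatches = []
    for i, rule in enumerate(spec.get("rules", [])):
        # ASSUMPTION: a rule without the field is flagged as well.
        if rule.get("visibility") != expected:
            mismatches.append(f".spec.rules[{i}].visibility")
    # The top-level field may be absent; only flag it when it is present.
    if "visibility" in spec and spec["visibility"] != expected:
        mismatches.append(".spec.visibility")
    return mismatches


# Trimmed copy of the Ingress excerpt above.
ingress = {
    "spec": {
        "rules": [
            {
                "hosts": ["flowers-sample-pvc-predictor-default.default.svc.cluster.local"],
                "visibility": "ExternalIP",
            }
        ],
        "visibility": "ExternalIP",
    }
}

print(visibility_mismatches(ingress))
# prints ['.spec.rules[0].visibility', '.spec.visibility']
```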

@yanniszark changed the title from "Cluster Local Services also exposed on Public Gateway" to "Cluster Local KService generates ExternalIP Ingress" on Mar 13, 2020
@yanniszark
Author

I opened knative-extensions/net-istio#44 to track the VirtualService generation bug.
Let's keep this issue for the bug which makes a Cluster Local KService generate an ExternalIP Ingress, instead of a ClusterLocal one.

@zhanggbj

zhanggbj commented Mar 23, 2020

@yanniszark

After upgrading to Knative 0.13, this is fixed due to some changes in the Ingress hosts; see #7264. Hope it helps.

We ran into a similar situation on Knative 0.12 with both knative-ingress-gateway and cluster-local-gateway enabled: the services were failing with the error IngressNotConfigured.
This is because the Knative probe was trying the local service against the public gateway address.

@tcnghia
Contributor

tcnghia commented Mar 24, 2020

/assign @tcnghia

@tcnghia
Contributor

tcnghia commented Apr 13, 2020

/assign @richterdavid

@shreejad
Contributor

/assign @shreejad

@github-actions

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions bot added the lifecycle/stale label Sep 22, 2020
@shreejad
Contributor

shreejad commented Oct 13, 2020

Currently, creating a cluster local Kservice results in the following Kingress:

spec:
  rules:
  - hosts:
    - testbug.default.svc.cluster.local
    http:
      paths:
      - splits:
        - appendHeaders:
            Knative-Serving-Namespace: default
            Knative-Serving-Revision: testbug-h84gc
          percent: 100
          serviceName: testbug-h84gc
          serviceNamespace: default
          servicePort: 80
        timeout: 48h0m0s
    visibility: ClusterLocal
  visibility: ExternalIP

There are two "visibility" fields: ".spec.visibility" and ".spec.rules[0].visibility".

I think #6732 fixed the issue of ".spec.rules[0].visibility" being "ExternalIP". Now it is "ClusterLocal" as seen in the above YAML.

I'm not sure if the ".spec.visibility" field should be "ClusterLocal". In Kservices which are externally exposed, the ".spec.visibility" field does not exist.
@yanniszark, @ZhiminXiang : Do you know what the intended behavior for the ".spec.visibility" field is?

@shreejad
Contributor

/remove-lifecycle stale

@knative-prow-robot removed the lifecycle/stale label Oct 13, 2020
@shreejad
Contributor

According to knative/networking#129, the .spec.visibility field should be deprecated. I will investigate further why it is still visible in cluster-local services.

@shreejad
Contributor

The ".spec.visibility" field in KIngress has been removed in Knative version 0.18. The cluster I previously tested on was 0.17, which is why the field was still showing up. Closing this bug as it is no longer present in the latest Knative version.

@shreejad
Contributor

/close

@knative-prow-robot
Contributor

@shreejad: Closing this issue.

In response to this:

/close
