
TCP connection keep-alive timeout does not work as expected. #7590

Closed
gabrielpagu opened this issue Sep 3, 2021 · 10 comments
Assignees
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@gabrielpagu

gabrielpagu commented Sep 3, 2021

NGINX Ingress controller version:
v1.0.0

Kubernetes version (use kubectl version):
1.21

Environment:
all

  • Cloud provider or hardware configuration:
    Microsoft - AKS

  • OS (e.g. from /etc/os-release):

NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.11.6
PRETTY_NAME="Alpine Linux v3.11"

  • Kernel (e.g. uname -a):

Linux main-ingress-nginx-controller-85b94b568f-6k275 5.4.0-1049-azure #51~18.04.1-Ubuntu SMP Fri Jun 4 15:21:28 UTC 2021 x86_64 Linu

  • Install tools:
    Azure Kubernetes Service deployed through ARM templates

  • Basic cluster related info:

    • kubectl version

D:\WORK\nginx>kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.3", GitCommit:"ca643a4d1f7bfe34773c74f79527be4afd95bf39", GitTreeState:"clean", BuildDate:"2021-07-15T21:04:39Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"bd9d9ae2719e67a582f48c5dc2f81e87fe1deb8a", GitTreeState:"clean", BuildDate:"2021-08-16T22:02:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

  • kubectl get nodes -o wide

kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE
KERNEL-VERSION CONTAINER-RUNTIME
aks-llrga-35449606-vmss0000wy Ready agent 38d v1.19.11 10.192.8.215 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
aks-llrga-35449606-vmss0000x0 Ready agent 38d v1.19.11 10.192.11.200 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
aks-llrga-35449606-vmss0000z1 Ready agent 3d20h v1.19.11 10.192.12.195 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
aks-llrga-35449606-vmss0000zc Ready agent 2d4h v1.19.11 10.192.9.210 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
aks-llrga-35449606-vmss0000zr Ready agent 4h20m v1.19.11 10.192.10.205 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
aks-sys-35449606-vmss000000 Ready agent 40d v1.19.11 10.192.0.4 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
aks-sys-35449606-vmss000001 Ready agent 40d v1.19.11 10.192.0.255 Ubuntu 18.04.5 LTS 5.4.0-1049-azure containerd://1.4.4+azure
akswmeda0000il Ready agent 40d v1.19.11 10.192.5.41 Windows Server 2019 Datacenter 10.0.17763.2061 docker://20.10.6
akswmeda0000mp Ready agent 28d v1.19.11 10.192.1.250 Windows Server 2019 Datacenter 10.0.17763.2061 docker://20.10.6
akswmeda0000n4 Ready agent 22d v1.19.11 10.192.3.240 Windows Server 2019 Datacenter 10.0.17763.2061 docker://20.10.6
akswmeda0000q6 Ready agent 7d17h v1.19.11 10.192.2.245 Windows Server 2019 Datacenter 10.0.17763.2061 docker://20.10.6
akswmeda0000rv Ready agent 45h v1.19.11 10.192.6.225 Windows Server 2019 Datacenter 10.0.17763.2061 docker://20.10.6
akswmeda0000s2 Ready agent 5h2m v1.19.11 10.192.4.235 Windows Server 2019 Datacenter 10.0.17763.2061 docker://20.10.6

  • How was the ingress-nginx-controller installed:
    • If helm was used then please show output of helm ls -A
      helm ls -A
      NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
      main ingress 1 2020-10-01 11:00:18.8095492 +0300 EEST deployed ingress-nginx-3.3.0 0.35.0
      xtelpromoformulaapi multitenant-services-to 86 2021-08-19 10:54:54.128956205 +0000 UTC deployed xtel-promoformula-api-chart-1.0.122 latest

    • If helm was used then please show output of helm -n <ingresscontrollernamepspace> get values <helmreleasename>

helm -n ingress get values main
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    enabled: false
  metrics:
    enabled: true
  nodeSelector:
    x-node-size: system-small
  replicaCount: 2
  service:
    loadBalancerIP: 20.54.239.152
defaultBackend:
  enabled: true
  nodeSelector:
    x-node-size: system-small

  • Current State of the controller:
    • kubectl -n <ingresscontrollernamespace> get all -A -o wide

kubectl -n ingress get all -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/main-ingress-nginx-controller-85b94b568f-6k275 1/1 Running 0 41d 10.192.1.25 aks-sys-35449606-vmss000001
pod/main-ingress-nginx-controller-85b94b568f-mlwt4 1/1 Running 0 41d 10.192.0.119 aks-sys-35449606-vmss000000
pod/main-ingress-nginx-defaultbackend-7847b5d56b-bqksl 1/1 Running 0 41d 10.192.0.41 aks-sys-35449606-vmss000000

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/main-ingress-nginx-controller LoadBalancer 10.0.185.227 20.54.239.152 80:31840/TCP,443:30837/TCP 337d app.kubernetes.io/component=controller,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx
service/main-ingress-nginx-controller-metrics ClusterIP 10.0.67.93 9913/TCP 337d app.kubernetes.io/component=controller,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx
service/main-ingress-nginx-defaultbackend ClusterIP 10.0.246.108 80/TCP 337d app.kubernetes.io/component=default-backend,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx

NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/main-ingress-nginx-controller 2/2 2 2 337d controller k8s.gcr.io/ingress-nginx/controller:v0.35.0@sha256:fc4979d8b8443a831c9789b5155cded454cb7de737a8b727bc2ba0106d2eae8b app.kubernetes.io/component=controller,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx
deployment.apps/main-ingress-nginx-defaultbackend 1/1 1 1 337d ingress-nginx-default-backend k8s.gcr.io/defaultbackend-amd64:1.5 app.kubernetes.io/component=default-backend,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx

NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/main-ingress-nginx-controller-85b94b568f 2 2 2 337d controller k8s.gcr.io/ingress-nginx/controller:v0.35.0@sha256:fc4979d8b8443a831c9789b5155cded454cb7de737a8b727bc2ba0106d2eae8b app.kubernetes.io/component=controller,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx,pod-template-hash=85b94b568f
replicaset.apps/main-ingress-nginx-defaultbackend-7847b5d56b 1 1 1 337d ingress-nginx-default-backend k8s.gcr.io/defaultbackend-amd64:1.5 app.kubernetes.io/component=default-backend,app.kubernetes.io/instance=main,app.kubernetes.io/name=ingress-nginx,pod-template-hash=7847b5d56b

What happened:

TCP connection keep-alive timeout does not work as expected.
Configuring a keep-alive timeout of 0 completely disables connection keep-alive, which is expected.
Configuring a keep-alive timeout other than 0 results in a keep-alive timeout of 60 seconds regardless of the value used, which is unexpected.

What you expected to happen:

When a TCP connection keep-alive timeout other than 0 is configured, that value should be used.
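For context, the ConfigMap `keep-alive` key is rendered into NGINX's `keepalive_timeout` directive, so a value of 300 should show up in the generated nginx.conf roughly as follows (a sketch, assuming the default controller template; the exact formatting may differ by version):

```nginx
http {
    # the 300 here comes from the ConfigMap entry keep-alive: '300'
    keepalive_timeout 300s;
}
```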

How to reproduce it:

Apply the following ConfigMap:
apiVersion: v1
data:
  enable-underscores-in-headers: "true"
  keep-alive: '0'
kind: ConfigMap
metadata:
  name: main-ingress-nginx-controller
  namespace: ingress

Run the following script from a Linux client:
start=$(date +%s)
echo -e "GET / HTTP/1.1\r\nHost:kc-emea-aks-xtel-dev-02.salesperformanceplatform.com\r\n\r\n" | openssl s_client -connect kc-emea-aks-xtel-dev-02.salesperformanceplatform.com:443 -servername kc-emea-aks-xtel-dev-02.salesperformanceplatform.com -ign_eof
end=$(date +%s)
runtime=$((end-start))
echo "KeepAlive timeout is around $runtime seconds"
echo "done"

Result: KeepAlive timeout is around 0 seconds

Apply the following ConfigMap:
apiVersion: v1
data:
  enable-underscores-in-headers: "true"
  keep-alive: '300'
kind: ConfigMap
metadata:
  name: main-ingress-nginx-controller
  namespace: ingress

Run the following script from a Linux client:
start=$(date +%s)
echo -e "GET / HTTP/1.1\r\nHost:kc-emea-aks-xtel-dev-02.salesperformanceplatform.com\r\n\r\n" | openssl s_client -connect kc-emea-aks-xtel-dev-02.salesperformanceplatform.com:443 -servername kc-emea-aks-xtel-dev-02.salesperformanceplatform.com -ign_eof
end=$(date +%s)
runtime=$((end-start))
echo "KeepAlive timeout is around $runtime seconds"
echo "done"

Result: KeepAlive timeout is around 60 seconds

Inspecting the nginx.conf file inside the controller pods shows that the value changes to the one set in the ConfigMap, but the NGINX server seems to ignore any value other than 0.
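The openssl-based timing above can be mimicked locally with a small sketch: a toy server whose idle timeout stands in for the controller's keep-alive value, and a client that measures how long the idle connection survives. All names and the 2-second timeout here are illustrative, not part of ingress-nginx:

```python
import socket
import threading
import time

# Hypothetical stand-in for the ConfigMap keep-alive value (seconds).
KEEPALIVE_TIMEOUT = 2

def serve_one(sock):
    """Accept one connection and close it after KEEPALIVE_TIMEOUT of idleness."""
    conn, _ = sock.accept()
    conn.settimeout(KEEPALIVE_TIMEOUT)
    try:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            # Answer each request and keep the connection open.
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n"
                         b"Connection: keep-alive\r\n\r\n")
    except socket.timeout:
        pass  # idle past the keep-alive window: drop the connection
    finally:
        conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=serve_one, args=(srv,), daemon=True).start()

cli = socket.create_connection(srv.getsockname())
cli.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
cli.recv(1024)                # first response arrives immediately
start = time.time()
cli.recv(1024)                # blocks until the server closes the idle socket
elapsed = time.time() - start
print(f"connection closed after ~{elapsed:.1f}s")
```

With the reported behavior, the same measurement against the ingress endpoint would print roughly 60 seconds no matter what non-zero value the ConfigMap holds.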

/kind bug

@gabrielpagu gabrielpagu added the kind/bug Categorizes issue or PR as related to a bug. label Sep 3, 2021
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 3, 2021
@k8s-ci-robot
Contributor

@gabrielpagu: This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@longwuyuan
Contributor

longwuyuan commented Sep 16, 2021

/remove-kind bug
/kind support
/triage needs-information

This looks like an old release from this project:

ingress-nginx-3.3.0 0.35.0

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. and removed kind/bug Categorizes issue or PR as related to a bug. labels Sep 16, 2021
@gabrielpagu
Author

Hello,

I've also tried v1.0.0 and could reproduce the issue.

@longwuyuan
Contributor

longwuyuan commented Sep 20, 2021 via email

@carolize

/priority important-longterm
/remove-triage needs-information
/triage accepted

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority labels Oct 12, 2021
@k8s-ci-robot
Contributor

@carolize: The label triage/accepted cannot be applied. Only GitHub organization members can add the label.

In response to this:

/priority important-longterm
/remove-triage needs-information
/triage accepted

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Oct 12, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 10, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 9, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
