This repository has been archived by the owner on Nov 2, 2024. It is now read-only.

Broken cartservice - busybox nslookup misreturns 1 instead of 0 #995

Closed
palladius opened this issue Jan 10, 2023 · 6 comments

@palladius

Currently the latest version of https://cloud-ops-sandbox.dev/ is broken.

A fresh install fails on the cartservice.

I did a long investigation with @alml, documented in [1] and shared with leoy@.

A quick/cheap fix would be good enough.

[1] https://docs.google.com/document/d/1RTEKaDlP9PwoNfKvpAFxjbQYCZ0o5kA9Pj_V26kqu3Y/edit# [2]

@palladius
Author

From Leonid:

The investigation is still in progress. So far I can confirm that the cause of the problem is the init container in the cartservice pod, which fails. The workaround is to delete the cartservice deployment, ensure that the redis-cart deployment and service are in a ready state, delete the initContainers section from cartservice.yaml (in the kubernetes-manifests/ folder), and re-deploy the cartservice.
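The workaround above could be scripted roughly as follows. This is only a sketch under assumptions that are not from the thread: `kubectl` already points at the sandbox cluster, everything lives in the current (default) namespace, the function name `redeploy_cartservice` and the `KUBECTL` override are hypothetical, and the 120s timeout is arbitrary. The initContainers section still has to be removed from the manifest by hand before the last step.

```shell
#!/bin/sh
# KUBECTL can be overridden (e.g. KUBECTL="echo kubectl") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper that mirrors the workaround steps in order.
redeploy_cartservice() {
  # 1. Remove the deployment whose init container is stuck.
  $KUBECTL delete deployment cartservice || return 1
  # 2. Wait for the redis-cart rollout and check that its service exists.
  $KUBECTL rollout status deployment/redis-cart --timeout=120s || return 1
  $KUBECTL get service redis-cart || return 1
  # 3. Re-apply the manifest (with its initContainers section already removed).
  $KUBECTL apply -f kubernetes-manifests/cartservice.yaml
}
```

Calling `redeploy_cartservice` from the repository root runs the steps; with `KUBECTL="echo kubectl"` it only prints the commands it would run.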

@palladius
Author

The part that needs to be removed is:

      initContainers:
      - command:
        - bin/sh
        - -c
        - until nslookup redis-cart; do echo waiting for redis; sleep 2; done;
        image: busybox
        imagePullPolicy: Always
        name: init-redis-ready

@palladius
Author

Alex and I noticed that this command returns the correct exit status (0) in the main container but the wrong one in the init container:

Server:		10.28.0.10
Address:	10.28.0.10:53

Non-authoritative answer:
Name:	redis-cart.default.svc.cluster.local
Address: 10.28.2.181

** server can't find redis-cart.svc.cluster.local: NXDOMAIN

** server can't find redis-cart.cluster.local: NXDOMAIN

** server can't find redis-cart.cluster.local: NXDOMAIN

** server can't find redis-cart.svc.cluster.local: NXDOMAIN

** server can't find redis-cart.google.internal: NXDOMAIN

** server can't find redis-cart.google.internal: NXDOMAIN

** server can't find redis-cart.c.cloud-ops-sandbox-2646743255.internal: NXDOMAIN

** server can't find redis-cart.c.cloud-ops-sandbox-2646743255.internal: NXDOMAIN

/app # echo $?
0

In the init container it would incorrectly return 1 (there the SHELL env was slightly different; maybe a different version of busybox?). Leonid suggests it might be a bug in busybox, and I agree.

@palladius palladius changed the title from "Broken cartservice - nslookup misreturns 1 when it shoudl return 0" to "Broken cartservice - busybox nslookup misreturns 1 instead of 0" Jan 10, 2023
@palladius
Author

I can confirm this change works:

      initContainers:
        - name: init-redis-ready-riccardo
          # There is a bug in busybox that prevents us from returning 0 when redis is available and multiple addresses are in /etc/resolv.conf :/
          image: busybox
          command: ['bin/sh', '-c', 'until nslookup redis-cart|grep Address: ; do echo Waiting for redis BUG in busybox; sleep 2; done;']
          #command: ['bin/sh', '-c', 'echo OK Ric04 just ok']
      containers:
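A small sketch of why piping through grep changes the outcome: without `pipefail`, a shell pipeline reports the exit status of its last command, so the `until` loop sees grep's status rather than nslookup's. The `fake_nslookup` function below is a hypothetical stand-in that imitates the output pasted earlier and exits with 1; it is not the real busybox binary. Judging by that pasted output, the Server block also has an `Address:` line, so the grep may match even when the name is not resolved; matching the `Name:` line is a stricter variant (my assumption, not something tested in this thread).

```shell
#!/bin/sh
# Hypothetical stand-in for the misbehaving nslookup: prints a valid answer,
# then NXDOMAIN noise, and exits with status 1.
fake_nslookup() {
  printf 'Server:\t\t10.28.0.10\n'
  printf 'Address:\t10.28.0.10:53\n\n'
  printf 'Name:\tredis-cart.default.svc.cluster.local\n'
  printf 'Address: 10.28.2.181\n\n'
  printf "** server can't find redis-cart.svc.cluster.local: NXDOMAIN\n"
  return 1
}

# Plain form: an "until" loop would see status 1 and keep waiting forever.
fake_nslookup >/dev/null
plain_status=$?

# Piped form: the pipeline's status is grep's, so a match yields 0.
fake_nslookup | grep -q 'Address:'
piped_status=$?

# Stricter variant (assumption): match the Name: line from the answer section.
fake_nslookup | grep -q '^Name:.*redis-cart'
strict_status=$?

echo "plain=$plain_status piped=$piped_status strict=$strict_status"
```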

palladius added a commit to palladius/cloud-ops-sandbox that referenced this issue Jan 10, 2023
@palladius
Author

I'll now also try busybox 1.28, as suggested here: https://www.linkedin.com/pulse/busybox-nslookup-bug-gary-tay/

@palladius
Author

YES! The variant pinned to busybox:1.28, with the plain nslookup loop restored,

    - name: init-redis-ready-riccardo128
      # There is a bug in busybox that prevents us from returning 0 when redis is available and multiple addresses are in /etc/resolv.conf
      image: busybox:1.28
      #command: ['bin/sh', '-c', 'until nslookup redis-cart|grep Address: ; do echo Waiting for redis BUG in busybox; sleep 2; done;']
      command: ['bin/sh', '-c', 'until nslookup redis-cart ; do echo Waiting for redis BUG in busybox; sleep 2; done;']

also works.

@losalex losalex assigned minherz and unassigned daniel-sanche Jan 11, 2023
minherz added a commit that referenced this issue Jan 11, 2023
Fixes #995 by pinning busybox image to version 1.28
minherz added a commit that referenced this issue Jan 11, 2023
Fixes #995 by pinning busybox image to version 1.28