This repository has been archived by the owner on Feb 9, 2024. It is now read-only.

Implement check for resolving localhost #1046

Closed
aelkugia opened this issue Jan 25, 2020 · 4 comments
Labels
kind/enhancement New feature or request port/5.5 Requires port to version/5.5.x port/6.1 Requires port to version/6.1.x port/6.3 Requires port to version/6.3.x priority/1 Medium priority support-load Mark issues that increase support load

Comments

@aelkugia
Contributor

Describe the error

A customer reported that the watcher container in the influxdb pod fails constantly because it cannot connect to InfluxDB. The problem started after upgrading from 1.4.5 to 1.4.54: the pod kept restarting with status CrashLoopBackOff.

Cause of the error

The error was caused by localhost resolving to an invalid IP address instead of 127.0.0.1.

Expected behavior

A precheck should verify that localhost resolves correctly in the cluster. This ensures the user fixes DNS before continuing with the installation.
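
A minimal sketch of what such a precheck could look like, written in Go. The function names checkLoopback and checkLocalhost are hypothetical and are not taken from the project. The sketch also assumes that any loopback address (127.0.0.1 or ::1) counts as a correct result, which is an assumption rather than something stated in this issue.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// checkLoopback (hypothetical helper) returns an error if addrs is empty
// or if any entry is not a loopback IP address.
func checkLoopback(host string, addrs []string) error {
	if len(addrs) == 0 {
		return fmt.Errorf("%s did not resolve to any address", host)
	}
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil || !ip.IsLoopback() {
			return fmt.Errorf("%s resolves to %q, expected a loopback address such as 127.0.0.1", host, a)
		}
	}
	return nil
}

// checkLocalhost (hypothetical precheck) resolves "localhost" through the
// default resolver and validates every returned address.
func checkLocalhost(ctx context.Context) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	addrs, err := net.DefaultResolver.LookupHost(ctx, "localhost")
	if err != nil {
		return fmt.Errorf("failed to resolve localhost: %w", err)
	}
	return checkLoopback("localhost", addrs)
}

func main() {
	if err := checkLocalhost(context.Background()); err != nil {
		fmt.Println("precheck failed:", err)
		return
	}
	fmt.Println("precheck passed")
}
```

Keeping the address validation separate from the lookup makes the rule itself easy to test without touching the network.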

@aelkugia
Contributor Author

It was found that Go resolves names through the DNS server instead of /etc/hosts if it cannot find the /etc/nsswitch file in the container.

A fix is expected in the Go 1.15 release. Discussion can be found here: golang/go#35305
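
To see whether a container is exposed to this behavior, one option is to inspect the name service switch file directly. The sketch below is only an illustration: hostsSources is a hypothetical helper, the path /etc/nsswitch.conf is the usual glibc location, and the parser is deliberately simple (it returns every token on the "hosts:" line, including any bracketed action items).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hostsSources (hypothetical helper) returns the tokens that follow "hosts:"
// in nsswitch.conf-style content, in order. It returns nil when the content
// has no "hosts:" line.
func hostsSources(content string) []string {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		// Drop comments, which start with '#'.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "hosts:") {
			return strings.Fields(strings.TrimPrefix(line, "hosts:"))
		}
	}
	return nil
}

func main() {
	const path = "/etc/nsswitch.conf" // assumed location
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		fmt.Printf("%s is missing: Go versions affected by golang/go#35305 may query DNS before /etc/hosts\n", path)
		return
	}
	if err != nil {
		fmt.Println("cannot read", path+":", err)
		return
	}
	fmt.Println("hosts lookup order:", hostsSources(string(data)))
}
```

A result that lists files before dns means /etc/hosts is consulted first, which is the order the localhost lookup relies on.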

@terrywang

Just to correct: it's the /etc/nsswitch.conf file, provided by the glibc package on Fedora / EL and by libc-bin on Debian/Ubuntu.

NOTE: On Arch Linux and its variants (e.g. Manjaro), which I use as my workstation, the file is provided by the filesystem package.

@r0mant r0mant changed the title InfluxDB failed to start (CrashLoopBackOff) - improve pre-check Implement check for resolving localhost Feb 19, 2020
@r0mant r0mant added port/5.5 Requires port to version/5.5.x port/6.1 Requires port to version/6.1.x port/6.3 Requires port to version/6.3.x kind/enhancement New feature or request priority/1 Medium priority support-load Mark issues that increase support load labels Feb 19, 2020
@knisbet
Contributor

knisbet commented Feb 28, 2020

It might be possible to just hardcode localhost in the CoreDNS configuration. It depends a bit on whether the query is for localhost or for the fully qualified localhost. (with a trailing dot).
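
The distinction between the two forms can be sketched in Go as follows. This is only an illustration with hypothetical helper names (isAbsolute, isLocalhost); the name localhost.example.com. stands in for what a stub resolver might send after appending a search domain to a name written without a trailing dot.

```go
package main

import (
	"fmt"
	"strings"
)

// isAbsolute reports whether name is written with a trailing dot,
// i.e. as a fully qualified name.
func isAbsolute(name string) bool {
	return strings.HasSuffix(name, ".")
}

// isLocalhost reports whether name is exactly "localhost", with or
// without a trailing dot, ignoring case.
func isLocalhost(name string) bool {
	return strings.EqualFold(strings.TrimSuffix(name, "."), "localhost")
}

func main() {
	// The last entry is a hypothetical search-domain expansion.
	for _, name := range []string{"localhost", "localhost.", "localhost.example.com."} {
		fmt.Printf("%-24s absolute=%-5v localhost=%v\n", name, isAbsolute(name), isLocalhost(name))
	}
}
```

An entry hardcoded for the exact name would not cover the expanded form, which is why the form of the incoming query matters.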

@knisbet
Contributor

knisbet commented Mar 6, 2020

This is fixed in 5.5.x by #1214 / gravitational/monitoring-app#145, which fix the cause of the leaking localhost queries instead of implementing a check for a capturing DNS resolver, since a capturing resolver is a valid customer deployment option.


5 participants