Continue serving DNS even when cluster is offline #117

Closed · onedr0p opened this issue Jul 7, 2022 · 8 comments · May be fixed by #123

Comments

onedr0p commented Jul 7, 2022

Hi 👋🏼

I am running k8s_gateway on OPNsense with the config below, replacing the Unbound and dnsmasq services that OPNsense provides. That makes k8s_gateway the only DNS server on my network, so if my cluster is offline, k8s_gateway won't start and nothing on the network gets resolved.

I would hope it's possible to change this behavior, but maybe it already works and my configuration is wrong?

(common) {
  bind 127.0.0.1 ::1
  errors
  log
  reload
  loadbalance
  cache 300
  loop
  local
  prometheus 192.168.1.1:9153
}

. {
  import common
  k8s_gateway cluster-domain.com {
    resources Ingress
    ttl 1
    kubeconfig /usr/local/etc/coredns/kubeconfig
    fallthrough
  }
  forward . tls://1.1.1.1 tls://1.0.0.1 {
    tls_servername cloudflare-dns.com
  }
}

non-cluster-domain.com {
  import common
  k8s_gateway . {
    resources Ingress
    ttl 30
    kubeconfig /usr/local/etc/coredns/kubeconfig
  }
}
@networkop

hey @onedr0p,
have you considered CoreDNS's cache plugin with a high TTL value?
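
For reference, a minimal sketch of what that could look like (assuming a CoreDNS build recent enough to have the cache plugin's serve_stale option; zone names here are illustrative):

. {
  k8s_gateway cluster-domain.com
  # cache answers for up to an hour and keep serving expired
  # entries while the backend is unreachable
  cache 3600 {
    serve_stale 1h
  }
}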

onedr0p commented Jul 11, 2022

You can see in my config that I am already using the cache plugin. tl;dr: caching doesn't help, because if the cluster is offline, k8s_gateway fails to start at all. To reproduce, try starting k8s_gateway with a kubeconfig pointing at an IP:port that isn't serving a cluster.
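
For example, a minimal kubeconfig like this (hypothetical values; any address with nothing listening will do) reproduces the crash:

apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://192.168.1.2:6443   # nothing listening here
  name: offline
contexts:
- context:
    cluster: offline
  name: offline
current-context: offline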

@networkop

ah yeah, I think failing to start is kind of expected. If the plugin can't reach the API server, it can't discover k8s resources, so there's no point in coming up; that way kubelet will keep restarting it until connectivity to the API server is restored.
I understand you run it outside of kubelet, though. What's your expected mode of operation?

onedr0p commented Jul 11, 2022

I would hope it could start up with a warning that the cluster is unreachable, keep serving DNS for everything else in the config, and then begin resolving cluster records once the cluster comes back online, without needing a restart.

@networkop

I think it should be possible. What do you see happening now? Can you collect logs with the debug plugin enabled? https://coredns.io/plugins/debug/

onedr0p commented Jul 12, 2022

This is the error I get on k8s_gateway startup; it's very easy to replicate.

[INFO] plugin/k8s_gateway: Building k8s_gateway controller
panic: Get "https://192.168.1.2:6443/apis/gateway.networking.k8s.io/v1alpha2/gateways": dial tcp 192.168.1.2:6443: connect: connection refused

goroutine 1 [running]:
github.com/ori-edge/k8s_gateway.handleCRDCheckError({0x22b3ce0, 0xc0007800f0}, {0x1eb2d04, 0xa}, {0x1ecdfc5, 0x19})
	/home/runner/work/k8s_gateway/k8s_gateway/kubernetes.go:208 +0x2b3
github.com/ori-edge/k8s_gateway.existGatewayCRDs({0x22cb590, 0xc00019c0c8}, 0x1eaea37?)
	/home/runner/work/k8s_gateway/k8s_gateway/kubernetes.go:190 +0xac
github.com/ori-edge/k8s_gateway.newKubeController({0x22cb590, 0xc00019c0c8}, 0xc000454300, 0xc0000936c0, 0xc00019f8d8)
	/home/runner/work/k8s_gateway/k8s_gateway/kubernetes.go:57 +0x14d
github.com/ori-edge/k8s_gateway.(*Gateway).RunKubeController(0xc0000fcc00, {0x22cb590, 0xc00019c0c8})
	/home/runner/work/k8s_gateway/k8s_gateway/kubernetes.go:180 +0xa5
github.com/ori-edge/k8s_gateway.setup(0x1eaa956?)
	/home/runner/work/k8s_gateway/k8s_gateway/setup.go:29 +0x10d
github.com/coredns/caddy.executeDirectives(0xc0003c5200, {0x7fffffffed0e, 0x1f}, {0xc000508c00, 0x30, 0x203000?}, {0xc000471a00, 0x2, 0x8?}, 0x0)
	/home/runner/go/pkg/mod/github.com/coredns/caddy@v1.1.1/caddy.go:661 +0x5d6
github.com/coredns/caddy.ValidateAndExecuteDirectives({0x22caa68?, 0xc0004719c0}, 0x8?, 0x0)
	/home/runner/go/pkg/mod/github.com/coredns/caddy@v1.1.1/caddy.go:612 +0x3e5
github.com/coredns/caddy.startWithListenerFds({0x22caa68, 0xc0004719c0}, 0xc0003c5200, 0x0)
	/home/runner/go/pkg/mod/github.com/coredns/caddy@v1.1.1/caddy.go:515 +0x26f
github.com/coredns/caddy.Start({0x22caa68, 0xc0004719c0})
	/home/runner/go/pkg/mod/github.com/coredns/caddy@v1.1.1/caddy.go:472 +0xe5
github.com/coredns/coredns/coremain.Run()
	/home/runner/go/pkg/mod/github.com/coredns/coredns@v1.9.1/coremain/run.go:63 +0x1cd
main.main()
	/home/runner/work/k8s_gateway/k8s_gateway/cmd/coredns.go:44 +0x95

@networkop

thanks, this looks like a bug; I wasn't expecting this kind of behaviour. Anyhow, I'll try to cook something up over the weekend.
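
One possible shape for such a fix (a hypothetical sketch using client-go, not the actual change proposed in #123) would be to probe the API server in a background goroutine instead of panicking during setup:

package main

import (
	"context"
	"log"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// tryStartController keeps probing the API server instead of panicking,
// so CoreDNS can keep serving its other zones while the cluster is down.
// startInformers below is a hypothetical hook into the plugin's controller.
func tryStartController(ctx context.Context, kubeconfig string) {
	go func() {
		for {
			cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
			if err == nil {
				if client, err := kubernetes.NewForConfig(cfg); err == nil {
					if _, err := client.Discovery().ServerVersion(); err == nil {
						log.Println("[INFO] plugin/k8s_gateway: API server reachable, starting controller")
						// startInformers(ctx, client)
						return
					}
				}
			}
			log.Println("[WARNING] plugin/k8s_gateway: API server unreachable, retrying in 30s")
			select {
			case <-ctx.Done():
				return
			case <-time.After(30 * time.Second):
			}
		}
	}()
}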

onedr0p commented Oct 11, 2023

I don't know if this is still an issue, but in any case I no longer use k8s_gateway, so I'm closing this issue.

@onedr0p onedr0p closed this as completed Oct 11, 2023