Add coredns proposal #1100
Conversation
johnbelamaric commented Sep 19, 2017
@kubernetes/sig-network-proposals
I'm OK with this. Any net change in footprint - CPU or memory required, performance or scale characteristics?
Seems reasonable to me as well. I'd also want to see some perf diff info - tail latency on representative workloads.
Also, please add the security implications to the proposal (I see some interesting things in the current codebase), for @kubernetes/sig-auth-api-reviews.
On a 6.5k service / 10k pod cluster I was seeing about 18-35 MB in use, and on a 15k service / 18k pod cluster I was seeing slightly higher. No obvious allocations or slow paths on a range of queries. It seems to fare a bit better than most projects that integrate with kube :) Would be nice if it upgraded to protobuf.
* Adding an arbitrary entry inside the cluster domain (for example TXT entries [#38](https://github.com/kubernetes/dns/issues/38))
* Verified pod DNS entries (ensure pod exists in specified namespace)
* Experimental server-side search path to address latency issues [#33554](https://github.com/kubernetes/kubernetes/issues/33554)
* Limit PTR replies to the cluster CIDR [#125](https://github.com/kubernetes/dns/issues/125)
Are PTR records for services implemented? I saw coredns/coredns#1074
Yes, they are. You have to configure the reverse zone to make it work. That means knowing the service CIDR and configuring that ahead of time (would love to have kubernetes/kubernetes#25533 implemented).
Since reverse DNS zones are on classful boundaries, if you have a classless CIDR for your service CIDR (say, a /12), then you have to widen that to the containing classful network. That leaves a subset of that network open to the spoofing described in kubernetes/dns#125, and so the issue you reference is to fix that.
We still have that issue (PTR hijacking) with CoreDNS for IPs in the pod CIDRs, but we are in a position to fix it if the operator is willing to enable pods verified.
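For illustration, a minimal Corefile sketch serving the service reverse zone, assuming a cluster domain of cluster.local and a service CIDR of 10.96.0.0/12 (both placeholder values, not taken from the proposal):

```
# Assumed service CIDR: 10.96.0.0/12. Reverse zones sit on classful boundaries,
# so the /12 is widened to 10.in-addr.arpa; the remainder of 10.0.0.0/8 is what
# stays exposed to the PTR-hijacking issue in kubernetes/dns#125.
.:53 {
    kubernetes cluster.local 10.in-addr.arpa {
        pods verified        # answer pod queries only for pods that actually exist
    }
    proxy . /etc/resolv.conf # everything outside the cluster zones goes upstream
    cache 30
    errors
}
```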
We generally recommend production users disable pod IP DNS for this reason as well. I would prefer to let pod DNS be deprecated out, since it was intentionally a stopgap.
Thanks for the clarity on the reverse CIDR. One thing that would be good to include in this proposal is a sample Corefile that implements conformance with the Kube DNS spec. As someone new to the Corefile syntax but deeply familiar with the Kube DNS spec, I had to dig through the code to know what I had to set up. A sample Corefile would go a long way toward making the implications clear.
Ok, I have added that. Let me know if any more examples are needed (e.g., federation).
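As a rough sketch (not the exact file added to the proposal), a Kube-DNS-spec-conformant Corefile might look something like the following; the cluster domain, reverse zones, and upstream resolver path are placeholders:

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure                      # kube-dns-compatible pod records
        fallthrough in-addr.arpa ip6.arpa  # pass non-cluster reverse lookups onward
    }
    prometheus :9153         # expose metrics, analogous to the kube-dns sidecar
    proxy . /etc/resolv.conf # forward everything outside the cluster domain
    cache 30
    loadbalance              # randomize A/AAAA record order in answers
}
```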
By default, the user experience would be unchanged. For more advanced uses, existing users would need to modify the ConfigMap that contains the CoreDNS configuration file.
Discuss operational characteristics of coredns against existing kube-dns.
Have the existing e2e tests been run against a cluster configured with coredns?
Ok, I will add some to the proposal regarding the operational characteristics.
Existing e2e tests have not been run yet, see coredns/coredns#993 - we plan on this for sure.
@smarterclayton your comment "would be nice if it upgraded to protobuf" - can you explain further? We actually can answer gRPC queries, which means protobuf over HTTP/2 and TLS, but of course that requires the client to use that. Right now that is a simple
No, I mean that you should use a version of the Kubernetes client with protobuf serialization; that reduces apiserver load significantly. Not a requirement for this, but it's worth mentioning in the performance/scale section. I know some people run kube-dns on many, many nodes - OpenShift runs it on every node - so the impact on the cluster is key.
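As a sketch of what that suggestion means in practice (assuming client-go; the function name and kubeconfig path are illustrative, not from the proposal):

```go
package dnsclient

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// newProtobufClientset builds a clientset that asks the apiserver for
// protobuf-encoded responses instead of JSON, which reduces apiserver
// CPU and bandwidth for watch-heavy clients such as a DNS server.
func newProtobufClientset(kubeconfigPath string) (*kubernetes.Clientset, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return nil, err
	}
	// Prefer protobuf, but accept JSON for resources that only support JSON.
	config.AcceptContentTypes = "application/vnd.kubernetes.protobuf,application/json"
	config.ContentType = "application/vnd.kubernetes.protobuf"
	return kubernetes.NewForConfig(config)
}
```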
Before I review this, is it possible to get graphs of CPU/memory usage plus 50/90/95/98/99th-percentile response-time graphs compared to kube-dns in 100/1000/5000-node clusters?
cc @kubernetes/sig-scalability-feature-requests (we'd probably have to use kubemark)
Also, at what point is more than one replica needed when scaling up the cluster?
It would be really interesting to see that (just reiterating what has been said above)
@luxas Even with what we get as a CNCF project from Packet, I don't have the resources to do 1000+ node clusters. Are there some other resources available for this sort of test?
Oh, I see. Kubemark simulates clusters. That's new to me. We'll look into it.
Examples are great. I'll have someone poke at CoreDNS against our super dense clusters to get a rough idea of the outcome.
Kubemark generally sets up a cluster control plane (apiserver, etcd, controller manager, ...) plus a number of "fake nodes" (those nodes pretend to be real nodes, but they don't start real containers, mount volumes, etc. - they just fake those operations). From the apiserver's perspective, however, they behave like real nodes (e.g., they confirm that a pod is running or has been killed). So once you have a kubemark setup, you can run any test or workflow against it. In particular, we run some of our load tests against it: with a 5000-node kubemark cluster, our main test creates ~16k services with 150,000 pods (each of them being part of some service). But you can run any test you want against the kubemark control plane.
Thanks @wojtek-t, that's helpful. I'll try to catch up with @rajansandeep to see where he is with this.
Performance test for CoreDNS in Kubernetes

The performance test was done in GCE with the following components:

CoreDNS and the client are running out-of-cluster (due to it being a Kubemark cluster). The following is the summary of the performance of CoreDNS.

*We simulated service change load by creating and destroying 1% of services per minute.
FYI, latency is so low and the same across all, we are double-checking to make sure the test is valid.
For reference, what are these numbers for kube-dns? That might be a valuable comparison to perform, possibly encouraging the broader community to switch over sooner rather than later...
We're getting real responses with that low latency. Probably the two VMs (CoreDNS and client) are on the same physical host or something, although I imagine latency between GCE hosts is extremely low. Anyway, it can be taken to mean that:
As an aside, before we merged coredns/coredns#1149, the latency with 10k services was really high (like 1 second).
@luxas we can do that. The numbers above are uncached, so we would need to configure … For general …
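Since the numbers above are uncached, a minimal sketch of what enabling caching might look like, assuming the CoreDNS cache plugin is the piece being configured (the 30-second TTL cap is arbitrary):

```
.:53 {
    kubernetes cluster.local in-addr.arpa ip6.arpa
    cache 30                 # cache answers for up to 30 seconds
    proxy . /etc/resolv.conf
}
```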
Ok, I think this latest commit covers all the open questions, except running the equivalent kube-dns numbers (which I don't think necessarily belong here, but we can add them if others wish).
Performance test for Kubedns + dnsmasq in Kubernetes

The performance test was done in GCE with the following components:

- Kubedns + dnsmasq system with machine type: n1-standard-1 (1 CPU, 2.3 GHz Intel Xeon E5 v3 (Haswell))
- Client system with machine type: n1-standard-1 (1 CPU, 2.3 GHz Intel Xeon E5 v3 (Haswell))
- Kubemark cluster with 5000 nodes

Kubedns + dnsmasq and the client are running out-of-cluster (due to it being a Kubemark cluster). The following is the summary of the performance of Kubedns + dnsmasq. Cache was disabled.

| Services (with 1% change per minute*) | Max QPS** | Latency (Median) | Kubedns + dnsmasq memory (at max QPS) | Kubedns + dnsmasq CPU (at max QPS) |
|---|---|---|---|---|
| 1,000 | 8,000 | 0.2 ms | 45 MB | 85% |
| 5,000 | 7,000 | 0.2 ms | 97 MB | 89% |
| 10,000 | 6,000 | 0.2 ms | 191 MB | 81% |

*We simulated service change load by creating and destroying 1% of services per minute.
**Max QPS with < 1% packet loss.
Really encouraging numbers - I'm seeing similar numbers from large clusters that I am testing against.
/lgtm for alpha. For beta, we would want to have a migration story for existing users of kube-dns.
Automatic merge from submit-queue.
👍
Automatic merge from submit-queue. Add coredns proposal