Unify TCP and UDP DNS interception flows #5392
Any plan to make the meter rate configurable? A concern is that DNS response packets may be dropped by the meter if a burst of Pod traffic happens.
We should update #5358 when this is merged.
Note that based on some recent experiments I did, CPU usage increases very quickly when the rate is increased. There are potentially some improvements that can be made in the agent to process packets more efficiently.
I'm concerned this could affect normal traffic. A node could run 110 Pods; if all of them use FQDN rules, each Pod is allowed fewer than 1 DNS response per second. Even for in-cluster UDP DNS queries, it usually takes 3~5 rounds to get a valid result due to the search domains, and a TCP DNS query could also involve several responses. With the rate-limiting, I think it's very easy to hit the limit and cause applications to fail to resolve domains. In my experience, even the networkpolicy e2e tests can generate tens of DNS responses per second.
I wonder which part is more expensive: the packet-in itself, or the userspace code that processes the packet. If it's the latter, I wonder if we should go back to the previous approach that uses a channel to rate-limit the packets and resumes transmission of the ones that fail to be queued.
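For clarity, here is a minimal sketch of the channel-based approach I have in mind: a bounded queue with a non-blocking enqueue and a fallback that resumes packets when the queue is full, so applications still get their DNS responses even if we skip FQDN processing for that packet. All type and function names (`dnsInterceptor`, `resumePacket`, etc.) are hypothetical, not Antrea's actual code.

```go
package main

import (
	"fmt"
	"time"
)

type dnsPacket struct{ data []byte }

type dnsInterceptor struct {
	queue chan dnsPacket
}

func newDNSInterceptor(capacity int) *dnsInterceptor {
	d := &dnsInterceptor{queue: make(chan dnsPacket, capacity)}
	go d.worker()
	return d
}

// handlePacketIn tries a non-blocking enqueue; on overflow it falls back to
// resuming the packet so the application still receives its DNS response,
// just without FQDN policy processing for that packet.
func (d *dnsInterceptor) handlePacketIn(pkt dnsPacket) {
	select {
	case d.queue <- pkt:
	default:
		d.resumePacket(pkt) // queue full: forward the response immediately
	}
}

// worker drains the queue at its own pace, bounding userspace processing cost.
func (d *dnsInterceptor) worker() {
	for pkt := range d.queue {
		d.processDNSResponse(pkt)
	}
}

func (d *dnsInterceptor) processDNSResponse(pkt dnsPacket) { fmt.Println("processed", len(pkt.data), "bytes") }
func (d *dnsInterceptor) resumePacket(pkt dnsPacket)       { fmt.Println("resumed", len(pkt.data), "bytes") }

func main() {
	d := newDNSInterceptor(200)
	d.handlePacketIn(dnsPacket{data: []byte("example DNS response")})
	time.Sleep(10 * time.Millisecond) // give the worker a moment in this demo
}
```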
Sorry, I didn't understand the channel idea. I know we already have a rate-limit channel for packet-in. What channel do you propose?
I have a similar concern.
I feel like we need more investigation and experimentation. Just getting the packets to the antrea-agent handling code is quite expensive in terms of CPU time, so I am hoping we can improve that.
We could use a larger rate. However, I am pretty sure that this patch has no effect on the maximum rate of DNS packets we can handle. The reason is that we also have a rate-limited queue in the antrea-agent process, and each PacketIn category is rate-limited to 100 pps:
antrea/pkg/agent/openflow/packetin.go, lines 113 to 118 in e04c95c
DNS responses use the PacketInCategoryDNS category. The OVS meter should prevent high CPU usage in ovs-vswitchd and antrea-agent, but there should be no change in the effective rate of DNS responses we can process.
I suggest that after merging it, we revisit these parameters. We could use a higher OVS rate for DNS packets, or at least a higher tolerance for bursts. At the moment I think the OVS meter has a burst size of 200 packets, and the software queue also has a capacity of 200.
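For reference, the per-category software rate limiting discussed here can be illustrated with a small token-bucket sketch using golang.org/x/time/rate, at roughly 100 packets per second with a burst of 200 to match the numbers in this thread. This is only an illustration of the pattern, not the code in packetin.go.

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/time/rate"
)

func main() {
	// Token bucket: refill at 100 tokens/second, hold at most 200 (burst).
	dnsLimiter := rate.NewLimiter(rate.Limit(100), 200)

	ctx := context.Background()
	for i := 0; i < 5; i++ {
		// Wait blocks until a token is available, throttling how fast
		// DNS packet-in messages are handed to the handler.
		if err := dnsLimiter.Wait(ctx); err != nil {
			fmt.Println("rate limiter error:", err)
			return
		}
		fmt.Println("dispatching DNS packet-in", i)
	}
}
```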
It seems 100 or 200 pps is far from enough. I collected some statistics when analyzing a pcap captured from a production cluster; even a single Pod's DNS responses could exceed 100 pps.
That pcap has only UDP DNS traffic, and the fact that the maximum value is around 100 makes me wonder whether it was already limited by the userspace rate limiting.
Given that the userspace rate-limiting was always there, let's merge this PR and work on increasing the limit as a follow-up. @GraysonWu could you also work on this? I think setting it to a default of 500 pps for DNS packets and making the rate configurable would be a good place to start. I think you should also run some experiments on a cluster to see how much CPU is consumed at 500 pps. Based on some previous experiments I did, 500 pps can consume one full vCPU (ovs-vswitchd + antrea-agent). Maybe there is some room to improve the implementation. Of course, in the long term, we could always consider a different implementation than packet-in (e.g. similar to what we do with Suricata for IDS).
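As a rough sketch of what the follow-up could look like, the configurable rate might be exposed as an agent option with a 500 pps default. The field name `DNSPacketInRate` and the defaulting helper below are assumptions for illustration, not the actual Antrea configuration.

```go
package config

// AgentOptions is a hypothetical subset of agent configuration options.
type AgentOptions struct {
	// DNSPacketInRate is the maximum number of DNS response packets per
	// second forwarded to the agent for FQDN policy processing.
	// Assumed default: 500 pps when left unset.
	DNSPacketInRate int `yaml:"dnsPacketInRate,omitempty"`
}

// setDNSPacketInRateDefault applies the assumed default when unset.
func setDNSPacketInRateDefault(o *AgentOptions) {
	if o.DNSPacketInRate == 0 {
		o.DNSPacketInRate = 500
	}
}
```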
Sure, I will work on that and open a PR ASAP.
I did an experiment with 500 pps; it takes around 70% of 1 vCPU in my environment.