Document the limitations of Audit Logging for policy rules #6225
lgtm, just three minor questions.
> would drastically restrict the number of possible DNS requests in the cluster,
> which in turn would cause a lot of errors in applications which rely on DNS:

Would it be more helpful to add a link to the ACNP with log settings example here for comparison? And maybe change that previous example to one that is more suitable for logging (`Drop`), as opposed to this example?
Sorry, I am having a hard time understanding what you are suggesting. What difference are we trying to highlight by comparing with the `acnp-with-log-setting` example? And which example would you want to update to use the `Drop` action (and why)?
I assumed that the `acnp-with-log-setting` example is less likely to disrupt application workloads than this `allow-dns` example, so I was referring to this difference.
I was wondering if we need to change `acnp-with-log-setting` to `Drop`, since the text below says "especially when the policy rule uses the `Allow` action". (Quick question: is it no longer the case that only the first packet of an `Allow` connection will be logged?)
> I assumed that the `acnp-with-log-setting` example is less likely to disrupt application workloads than this `allow-dns` example, so I was referring to this difference.

This policy applies to traffic from the application frontend to the DB layer. If it is a large-scale application with a high number of connections, it could suffer from the same issue. In practice, for this specific case, the number of connections between the frontend and the DB is likely to stay "small", as the application is likely to use a connection pool instead of creating a new connection for each user session. But that really depends on the type of application.
The difference between these policies is more about the workloads to which they apply. I don't think there is a difference in how they are implemented. So they can both suffer from this issue in the same way.
In practice, users are more likely to experience this issue with CoreDNS because: 1) it seems common to enable logging for DNS requests, 2) even medium clusters usually have a large volume of DNS requests.
So I may still be missing your point.
> I was wondering if we need to change `acnp-with-log-setting` to `Drop`, since the text below says "especially when the policy rule uses the `Allow` action". (Quick question: is it no longer the case that only the first packet of an `Allow` connection will be logged?)

With recent versions of Antrea (since v1.13), logging is fine with the `Allow` action, so why should we change the example? The text says "especially when the policy rule uses the `Allow` action" because the side effect of enabling logging with older Antrea versions is that flows are dropped after a certain rate limit. If the action is `Allow`, this is clearly a bigger deal than if the action is `Drop` anyway.
We only log the first packet of an `Allow` connection. That has not changed.
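To make the point about connection volume concrete, here is a minimal Python sketch of first-packet-only logging for an `Allow` rule. It is an illustration only, not Antrea's implementation: the `Packet` key, the `AllowRuleLogger` class, and the traffic numbers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, simplified connection key (not Antrea's data model).
    src: str
    dst: str
    dport: int


@dataclass
class AllowRuleLogger:
    """Records one log entry per connection, on its first packet only."""
    seen: set = field(default_factory=set)
    entries: list = field(default_factory=list)

    def handle(self, pkt: Packet) -> None:
        if pkt not in self.seen:
            self.seen.add(pkt)
            self.entries.append(f"Allow {pkt.src} -> {pkt.dst}:{pkt.dport}")


logger = AllowRuleLogger()

# One long-lived pooled connection from the frontend to the DB:
# many packets, but only the first one produces a log entry.
pooled = Packet("frontend-1", "db", 5432)
for _ in range(1000):
    logger.handle(pooled)
pooled_entries = len(logger.entries)

# DNS-like traffic: every request is a new connection, so every
# request produces a log entry.
for i in range(1000):
    logger.handle(Packet(f"pod-{i}", "coredns", 53))
dns_entries = len(logger.entries) - pooled_entries

print(pooled_entries, dns_entries)
```

In this toy model the pooled connection contributes a single entry while the DNS-like traffic contributes one entry per request, which is why the logging rate is driven by the number of new connections rather than by the number of packets.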
Thanks for the detailed explanations! Now I understand: there is no need to highlight a difference, and "If the action is `Allow`, this is clearly a bigger deal than if the action is `Drop` anyway" makes total sense.
> prior to v1.13**, especially when the policy rule uses the `Allow` action.
>
> Note that v1.12 patch versions starting with v1.12.2 also do not suffer from
> this issue, as we backported the fix to the v1.12 release.

Just a personal feeling, do we usually include this backport detail in a readme?
I think it is helpful as users are more likely to find this information here, rather than look at the changelogs. I would be comfortable removing it after a few releases, but some users are running Antrea minor versions for a while, even after we stop maintaining them here.
Understood, thanks!
LGTM, Qiyue's comments also make sense to me.
Starting with Antrea v1.13, logging is best-effort, which means that logging cannot typically be used for compliance purposes. For older Antrea versions, traffic for which logging was enabled would be dropped after a certain rate was reached, creating issues for production workloads.
Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
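The two behaviors described in this commit message can be contrasted with a small Python sketch. It assumes a single rate-limit window with a fixed logging budget; the `process` function, the `Result` type, and the numbers are hypothetical and only model the observable effect, not how Antrea implements rate limiting.

```python
from dataclasses import dataclass


@dataclass
class Result:
    forwarded: int = 0
    dropped: int = 0
    logged: int = 0


def process(new_connections: int, log_budget: int, best_effort: bool) -> Result:
    """Handle new Allow connections arriving within one rate-limit window."""
    result = Result()
    for _ in range(new_connections):
        if result.logged < log_budget:
            # Within the budget: the connection is logged and forwarded.
            result.logged += 1
            result.forwarded += 1
        elif best_effort:
            # Best-effort logging: traffic still passes, the log entry is skipped.
            result.forwarded += 1
        else:
            # Older behavior: traffic above the rate limit is dropped.
            result.dropped += 1
    return result


# Hypothetical load: 1000 new connections against a budget of 100 log entries.
legacy = process(1000, 100, best_effort=False)
current = process(1000, 100, best_effort=True)
print(legacy)
print(current)
```

Both modes produce the same number of log entries, so neither gives a complete audit trail under load. The difference is that the older behavior also loses the traffic itself, which matters most when the rule action is `Allow`.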
b332856 to 4bed1c1
4bed1c1 to 51364ad
🎉
/skip-all