Control node traffic through network policies #4213

Closed
ColonelBundy opened this issue Sep 7, 2022 · 12 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@ColonelBundy

Describe the problem/challenge you have
When implementing a deny-all rule, you usually want the same policy enforced at the node level as well as for pods, so that no unnecessary traffic is allowed. Today a developer needs to write a network policy to allow a particular IP or FQDN, and then repeat the same work at the node level on every node.

Describe the solution you'd like
I would like to be able to create network policies for nodes using ClusterNetworkPolicy, with a nodeSelector in appliedTo that also affects the node interface. To preserve backward compatibility, a flag would probably be needed to indicate whether the policy should apply to the node interface as well.

Example:

apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: some-example
spec:
  priority: 1
  appliedTo:
    - nodeSelector:
        applyToNode: true # <--- Flag
        matchLabels:
          node-role.kubernetes.io/control-plane: ""
    - nodeSelector:
        applyToNode: false # <--- Flag
        matchLabels:
          node-role.kubernetes.io/worker: ""
  egress:
    - action: Drop
      to:
        - ipBlock:
            cidr: 1.1.1.0/24

The above example would drop egress traffic for all pods and the control-plane nodes but not the worker nodes.

Anything else you would like to add?
This feature would be a huge time saver, and it's something many other CNIs support.

@ColonelBundy ColonelBundy added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 7, 2022
@tnqn
Member

tnqn commented Sep 8, 2022

@ColonelBundy thanks for the proposal.

At the moment nodeSelector cannot be used in appliedTo; it can only be used in the from or to of a rule's peer. I think that even once nodeSelector can be used in appliedTo, it should literally mean that the policy applies to the selected Nodes' interfaces, without the need for an applyToNode: true flag.

From an implementation perspective, applying a policy to Nodes' interfaces means Antrea needs to take over the Node's own traffic, especially when FQDN policy is also wanted (which requires inspecting the Node's own traffic). Antrea doesn't do this today in normal mode. Antrea does support applying policies to Nodes' interfaces with a new feature called ExternalNode, but that's for non-Kubernetes Nodes. The challenge of doing it for Kubernetes Nodes is that, if we extend the current network policy implementation to Nodes, their network interfaces will need to be bridged to OVS, which may affect the user experience a little (in the worst case the Node may become unreachable if OVS exits unexpectedly).

I'm not sure what others think about the feature, but to understand the use case and help evaluate the effort, could you let us know whether each Node typically has multiple network interfaces that you want to apply NetworkPolicy to, or only a single interface?
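
To illustrate the restriction described above, here is a minimal sketch of a policy that uses nodeSelector in the one place it is accepted at this point in the thread: as a peer in a rule's to field, while appliedTo still selects Pods. The policy name, the priority value, and the app: web Pod label are hypothetical; the API version is the one used in the proposal.

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: drop-egress-to-control-plane   # hypothetical name
spec:
  priority: 5                          # hypothetical priority
  appliedTo:
    - podSelector:                     # appliedTo selects Pods, not Nodes
        matchLabels:
          app: web                     # hypothetical Pod label
  egress:
    - action: Drop
      to:
        - nodeSelector:                # Nodes are selected only as rule peers
            matchLabels:
              node-role.kubernetes.io/control-plane: ""
```

A policy like this filters traffic leaving the selected Pods toward the control-plane Nodes. It does not filter traffic that originates from the Nodes themselves, which is the gap this issue asks to close.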

@ColonelBundy
Author


If OVS exits and the node becomes unavailable, that's honestly not the worst thing that could happen, given the reason you run Kubernetes in the first place, so this would be an acceptable risk for our use case.
This is also pretty easily solved by using a secondary interface, or by choosing not to route port 22 over the bridge to OVS, for disaster recovery reasons.
For our use case it's not typical for each node to have multiple network interfaces, but that is certainly not set in stone. If this matters for the potential implementation of this feature, I would say a single interface would be enough, since you could always bridge multiple physical interfaces to a virtual one and call it a day. That would certainly not be the case for more advanced scenarios, though.

Looking forward to hearing more of your thoughts & feedback :)

@github-actions
Contributor

github-actions bot commented Dec 8, 2022

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 8, 2022
@tnqn tnqn removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 8, 2022
@github-actions
Contributor

github-actions bot commented Mar 9, 2023

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 9, 2023
@ColonelBundy
Author

@tnqn Could this be considered for v2.x perhaps?

@tnqn
Member

tnqn commented May 10, 2023

@ColonelBundy sorry for not updating this for a long time. I think it's possible to leverage some of the work we have done for the Windows Node and ExternalNode features to enforce NetworkPolicies on the Linux Node interface. We may not bridge the Node interface to OVS by default, to avoid affecting users who don't want it, but it could be controlled by a configuration toggle (like the existing enableBridgingMode), and applying NetworkPolicies to Nodes would require enabling that toggle. If this works for you, I could raise the issue at the next community meeting and see what others think about the feature.

@ColonelBundy
Author

@tnqn That sounds like a good solution to me.

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 11, 2023
@rajnkamr rajnkamr added this to the Antrea v1.13 release milestone May 30, 2023
@rajnkamr
Contributor

rajnkamr commented Aug 2, 2023

#5348

@luolanzone
Contributor

@rajnkamr I suppose we will have no deliverables for this issue in v1.14, right?

@rajnkamr
Contributor

@luolanzone, this might be added in v1.15 and is now based on the #5671 proposal; it is currently WIP. cc @Atish-iaf @hongliangl

Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 14, 2024
@ColonelBundy
Author

Closed since this has been implemented in #5658 #5716 and released in https://github.com/antrea-io/antrea/releases/tag/v1.15.0
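
For readers arriving after the release, here is a rough sketch of what using the feature might look like. It rests on assumptions that this thread does not confirm: that the feature is gated behind a NodeNetworkPolicy feature gate in antrea-agent.conf, that ClusterNetworkPolicy is served as crd.antrea.io/v1beta1 in v1.15, and that an egress ipBlock peer is accepted for Node-applied policies. The ConfigMap is a fragment showing only the relevant key, and the policy name is hypothetical; please check the Antrea documentation for your version before relying on any of it.

```yaml
# Fragment of the antrea-config ConfigMap: only the assumed feature gate is shown.
apiVersion: v1
kind: ConfigMap
metadata:
  name: antrea-config
  namespace: kube-system
data:
  antrea-agent.conf: |
    featureGates:
      NodeNetworkPolicy: true          # assumed gate name, see lead-in
---
# The policy from the original proposal, without the applyToNode flag:
# selecting Nodes in appliedTo is what makes it apply to the Node itself.
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
  name: drop-node-egress-example       # hypothetical name
spec:
  priority: 1
  appliedTo:
    - nodeSelector:
        matchLabels:
          node-role.kubernetes.io/control-plane: ""
  egress:
    - action: Drop
      to:
        - ipBlock:
            cidr: 1.1.1.0/24
```

This follows the direction suggested earlier in the thread: no per-selector flag, with the behavior opted into through agent configuration instead.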
