Response traffic from allowed egress denied on short lived pods #189

luk2038649 · 2024-01-23T21:13:04Z

What happened:
We recently switched from using the Calico Tigera operator for NetworkPolicies to enabling NetworkPolicy handling by aws-vpc-cni.

We are seeing intermittent errors where connections hang for applications that are short-lived and reach out to external services, such as databases, immediately after startup. We have noticed this primarily in cronjobs and Airflow pods.

We experienced this same issue reaching out to external google services, and also AWS Aurora instances in a paired VPC.

Our NetworkPolicy is set up with an explicit egress allow-all and a more restrictive ingress policy.

spec:
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
  ingress:
    - from:
        - podSelector: {}
    - from:
        - namespaceSelector:
            matchLabels:
              toolkit.fluxcd.io/owner: redacted
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: redacted

It's my understanding that we should be allowed to receive response traffic from anywhere based on the documentation here

Example Pod Logs:
Requests that are normally completed quickly hang and never finish.

$ kubectl -n redactedNS logs redactedTask
INFO       2024-01-23 16:44:26 redacted.db read_from_views                      298 : Querying view redacted_data for 2024-01-23 16:57:04

Note that if you exec into this pod and run the same command some minutes after startup, it will complete. It only fails to complete right after startup.

Attach logs
VPC-CNI logs.
Instance of return traffic from a Google service being denied:

{"level":"info","ts":"2024-01-23T15:49:26.013Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"172.253.63.95","Src Port":443,"Dest IP":"x.x.x.x(peered vpc IP)","Dest Port":59486,"Proto":"TCP","Verdict":"DENY"}

Instance of return traffic from an Aurora instance in a peered VPC being denied:

{"level":"info","ts":"2024-01-23T17:50:34.327Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"x.x.x.x(peered VPC IP)","Src Port":5432,"Dest IP":"x.x.x.x(Pod IP)","Dest Port":36806,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2024-01-23T17:52:37.207Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"x.x.x.x(peered VPC IP)","Src Port":5432,"Dest IP":"x.x.x.x(Pod IP)","Dest Port":36806,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2024-01-23T17:54:40.087Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"x.x.x.x(peered VPC IP)","Src Port":5432,"Dest IP":"x.x.x.x(Pod IP)","Dest Port":36806,"Proto":"TCP","Verdict":"DENY"}
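For anyone trying to spot the same pattern in their own agent logs, here is a minimal Python sketch that filters flow records by verdict. It assumes each log line is a standalone JSON object with the field names seen in the excerpts above. The sample addresses are hypothetical placeholders, and the "ACCEPT" verdict value in the sample is an assumption for illustration only.

```python
import json


def denied_flows(lines):
    """Return parsed flow records whose "Verdict" is "DENY".

    Lines that are blank or not valid JSON are skipped rather than raising,
    since real log files often mix formats.
    """
    denied = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("Verdict") == "DENY":
            denied.append(record)
    return denied


# Hypothetical sample lines modelled on the excerpts in this issue.
SAMPLE_LINES = [
    '{"level":"info","logger":"ebpf-client","msg":"Flow Info:  ",'
    '"Src IP":"192.0.2.10","Src Port":5432,"Dest IP":"10.0.0.5",'
    '"Dest Port":36806,"Proto":"TCP","Verdict":"DENY"}',
    # "ACCEPT" is an assumed value, used only to show the filter.
    '{"level":"info","logger":"ebpf-client","msg":"Flow Info:  ",'
    '"Src IP":"10.0.0.5","Src Port":36806,"Dest IP":"192.0.2.10",'
    '"Dest Port":5432,"Proto":"TCP","Verdict":"ACCEPT"}',
    "not a json line",
]

if __name__ == "__main__":
    for flow in denied_flows(SAMPLE_LINES):
        print(flow["Src IP"], flow["Src Port"], "->",
              flow["Dest IP"], flow["Dest Port"])
```

Repeated DENY records with the same source port and destination port, as in the Aurora excerpt, suggest retransmissions of the same dropped response rather than separate connections.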

What you expected to happen:
Response traffic should not be denied if egress was allowed.

How to reproduce it (as minimally and precisely as possible):

enable all egress in networkpolicy
disable ingress in networkpolicy
create a cronjob which immediately reaches out to an external service.

Anything else we need to know?:
We did not experience this same situation when using calico tigera operator to handle the same networkpolicy.

To be clear, Calico has been completely removed and all nodes have been restarted.

Seems to be possibly the same as #83

We have found workarounds by doing two main things.

Explicitly allowing ingress from the CIDR block of the peered VPC where the DB lives.
Sleeping jobs/pods for 5s before making connections.
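As an alternative to a fixed sleep, a job could retry its first operation with a timeout. The following is a minimal Python sketch of that idea, not something we have deployed: the `retry` helper, its parameters, and the defaults are all hypothetical. It assumes the wrapped operation enforces its own timeout (a hung connection never raises otherwise) and that the whole operation (connect plus first query) is safe to repeat.

```python
import time


def retry(operation, attempts=5, initial_delay=1.0, backoff=2.0,
          retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call operation() until it succeeds or attempts are exhausted.

    Waits initial_delay seconds after the first failure and multiplies the
    wait by backoff after each further failure. The last failure is re-raised.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff


# Example usage (hypothetical host and port; the timeout is what turns a
# hang into an exception that retry() can act on):
#
#   import socket
#   conn = retry(lambda: socket.create_connection(("db.example.internal", 5432),
#                                                 timeout=3))
```

A retry opens a fresh connection after the policies have been reconciled, so the new flow is tracked from its first packet, whereas waiting on the original connection does not help because its response keeps being dropped.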

Environment:

Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"26+", GitVersion:"v1.26.12-eks-5e0fdde", GitCommit:"95c835ee1111774fe5e8b327187034d8136720a0", GitTreeState:"clean", BuildDate:"2024-01-02T20:34:50Z", GoVersion:"go1.20.12", Compiler:"gc", Platform:"linux/amd64"}
CNI Version: v1.16.0-eksbuild.1
Network Policy Agent Version: aws-network-policy-agent:v1.0.7-eksbuild.1
OS (e.g: cat /etc/os-release): NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"
Kernel (e.g. uname -a): Linux ip-x-x-x-x.ec2.internal 5.10.201-191.748.amzn2.x86_64 #1 SMP Mon Nov 27 18:28:14 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

brettstewart · 2024-01-23T21:14:37Z

+1

achevuru · 2024-01-24T07:57:39Z

@luk2038649 Based on the issue description, this is expected behavior in a way. Right now, all traffic is allowed to/from a pod until the newly launched pod is reconciled against the network policies configured on the cluster. It can take up to a few seconds for the reconciliation to complete and for policies to be enforced against a new pod.

We track reverse flows (response traffic) via our own internal conntrack. In the above example, when the pod initiates a connection to AWS Aurora, the traffic should be allowed (egress is configured as allow-all); once the egress probe allows the traffic, it creates a conntrack entry that the ingress probe then relies on to allow the return traffic. However, I think the egress connection in this case happens right after pod startup and before the policies are enforced, i.e., before the relevant eBPF probes are attached, so the required conntrack entry is not created. The probes are then attached with the relevant rules before the return traffic arrives at the pod, so there is no matching conntrack entry, and the ingress rules in the configured policy do not allow traffic from this endpoint, resulting in a drop.

This explains why introducing a delay of a few seconds resolved the issue (explicitly adding the rules under the ingress section will also work around the race condition). To get around this, you might need a delay of a few seconds at pod startup before it initiates a connection; a retry for a failed connection should help as well.
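To make the ordering concrete, here is a toy Python model of the race. It is only an illustration of the sequence described above and not the agent's actual data structures or eBPF logic: the class `PodPolicyModel`, its fields, and the peered-VPC CIDR `10.1.0.0/16` are hypothetical.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class PodPolicyModel:
    """Toy model of per-pod enforcement with an allow-all egress policy."""
    allowed_ingress: list                      # networks allowed by ingress rules
    probes_attached: bool = False              # False until policies are reconciled
    conntrack: set = field(default_factory=set)

    def send(self, local_port, remote_ip, remote_port):
        # Egress is allow-all, but the flow is only recorded in conntrack
        # if the egress probe is already attached when the packet leaves.
        if self.probes_attached:
            self.conntrack.add((local_port, remote_ip, remote_port))

    def receive(self, remote_ip, remote_port, local_port):
        if not self.probes_attached:
            return "ACCEPT"                    # nothing enforced yet
        if (local_port, remote_ip, remote_port) in self.conntrack:
            return "ACCEPT"                    # tracked return traffic
        addr = ipaddress.ip_address(remote_ip)
        if any(addr in net for net in self.allowed_ingress):
            return "ACCEPT"                    # explicit ingress rule
        return "DENY"


def run(send_before_attach, allowed_ingress=()):
    pod = PodPolicyModel(allowed_ingress=list(allowed_ingress))
    if send_before_attach:
        pod.send(59486, "172.253.63.95", 443)  # request leaves untracked
        pod.probes_attached = True             # policies reconciled afterwards
    else:
        pod.probes_attached = True
        pod.send(59486, "172.253.63.95", 443)  # request is tracked
    return pod.receive("172.253.63.95", 443, 59486)


if __name__ == "__main__":
    print("connect immediately:", run(send_before_attach=True))
    print("connect after delay:", run(send_before_attach=False))
```

In this model, passing an ingress network that covers the remote address (for example `ipaddress.ip_network("172.253.0.0/16")`) makes the first case return ACCEPT too, which mirrors the explicit ingress rule workaround.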

We plan to introduce a Strict mode option in the near future which will either gate pod launch until the relevant policies are configured against a new pod replica, or block all ingress/egress connections until the policies are reconciled against a new pod.

luk2038649 · 2024-01-24T15:29:36Z

@achevuru Thanks for the quick response!

Is there any approximate timeline for the general release of the "strict mode" option? Is there an issue or PR we can track?

"explicitly adding the rules under ingress section will also work in the above race condition"

This is what we have done for known hosts like databases, but we have a lot of applications, and opening up traffic for all possible responses is not ideal.

ariary · 2024-01-26T10:14:51Z

Hi all!
We've noticed the same behaviour for pod-to-pod traffic: a connection that is allowed by the netpol is denied at startup but then allowed (once the newly launched pods are reconciled against the network policies configured on the cluster).

"Right now, all traffic will be allowed to/from a Pod until the newly launched pods are reconciled against configured network policies on the cluster. It can take up to few seconds for the reconciliation to complete and policies are enforced against a new pod."

@achevuru I think this is the opposite. All traffic from a newly created pod is denied until the pod is reconciled against the network policies configured on the cluster.

What I was describing is in fact:

"Allow rules will be applied eventually after the isolation rules (or may be applied at the same time). In the worst case, a newly created pod may have no network connectivity at all when it is first started, if isolation rules were already applied, but no allow rules were applied yet."

cf. the netpol documentation on pod lifecycle

achevuru · 2024-01-30T15:52:07Z

@luk2038649 We're targeting it for early Q2/late Q1 release time frame. Will update once we're closer to the release.

allamand · 2024-02-16T07:52:09Z

Would it be possible to delay the readiness of the pod until all the netpols have been correctly applied? Something like the pod readiness gate used with the load balancer integration?

ariary · 2024-04-02T14:50:07Z

@achevuru The Strict mode option has not solved the issue. It seems that what this issue describes is precisely the standard option of Strict mode, which is (still) blocking some traffic.

"Right now, all traffic will be allowed to/from a Pod until the newly launched pods are reconciled against configured network policies on the cluster. It can take up to few seconds for the reconciliation to complete and policies are enforced against a new pod"

This statement is therefore not true.

achevuru · 2024-04-02T15:33:59Z

@ariary Can you expand on what was not solved with Strict mode? What exactly did you try with Strict mode?

Regarding Standard mode, the above statement is true (i.e.,) the pods will not have any firewall rules enforced until the new pod is reconciled against active policies and so all traffic is allowed. However, once the firewall rules take effect, it will block any return traffic that isn't tracked by the probes. Please refer here. Strict mode should address this.

ariary · 2024-04-04T08:50:16Z

@achevuru My issue is more related to other issues, you can ignore my comment

Pavani-Panakanti · 2024-10-02T22:52:51Z

@luk2038649 Was the issue resolved for you with Strict mode?

dmarkhas · 2024-11-29T16:15:12Z

"@luk2038649 Did the issue resolve for you with the strict mode ?"

Why would strict mode help? It simply blocks all outbound traffic from the pod until the policy reconciles (thus dropping 100% of the outbound traffic), instead of randomly dropping connections.

Pavani-Panakanti · 2024-12-07T01:00:39Z

We are actively working on a fix for this issue. The fix can be tracked in #345.

Closing this issue. Please follow the above issue for the fix.

luk2038649 added the bug Something isn't working label Jan 23, 2024

luk2038649 changed the title ~~Response traffic from allowed egress denied on short lives pods~~ Response traffic from allowed egress denied on short lived pods Jan 23, 2024

jayanthvn mentioned this issue Feb 5, 2024

Race condition causes quickly opened connections to fail #186

Closed

jdn5126 added the strict mode Issues blocked on strict mode implementation label Feb 16, 2024

younsl mentioned this issue Apr 11, 2024

Intermittent connection reset and delay running time #245

Closed

This was referenced May 9, 2024

Network policy blocks established connections to RDS #236

Closed

Network policy blocks established connections to STS. #73

Closed

achevuru mentioned this issue Jun 3, 2024

Network Policy Not Enforced on Initial Creation #271

Open

Pavani-Panakanti closed this as completed Dec 7, 2024
