Split Control Plane from Dataplane (aka split ingress controller and NGINX) #8034

rikatz opened this issue Dec 12, 2021 · 16 comments

@rikatz
Contributor

rikatz commented Dec 12, 2021

Due to recent CVEs, we started discussing, but never properly recorded, the need to split the ingress-controller process from the NGINX process.

Here is some of the rationale:

  • The NGINX process does not need any access to the Kubernetes cluster (ServiceAccount, API server, etc.). Because NGINX is exposed to users, it should be as locked down as possible, able to reach only the configured backends and nothing else
  • In the same vein, it should never be possible to configure NGINX to point at the Kubernetes API server (this needs to be fleshed out further)
  • The controller, on the other hand, needs access to the Kubernetes API to reconcile objects
  • Once reconciliation runs, the controller needs to be able to:
    • Write the nginx template files when required (shared /etc/nginx directory with the NGINX container)
    • Trigger a test of the generated configuration file to check whether it is valid or not
    • Call the Lua configuration backend when backends/SSL/etc. change (this is done via 127.0.0.1 today and should not be a big challenge here; see the sketch after this list)
    • Start/stop/monitor the NGINX process - this is going to be the biggest challenge
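
As a rough sketch of that Lua configuration call from a split-out controller, in Go. The port (10246) and path (/configuration/backends) are assumptions based on today's defaults, and the Backend type here is a simplified stand-in for the controller's real structure:

```go
// Sketch: pushing the reconciled backend list to the Lua handler over
// localhost. Port, path, and the Backend type are assumptions.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// Backend is a stand-in for the controller's real backend structure.
type Backend struct {
	Name      string   `json:"name"`
	Endpoints []string `json:"endpoints"`
}

func syncBackends(backends []Backend) error {
	body, err := json.Marshal(backends)
	if err != nil {
		return err
	}
	// Assumed endpoint: the Lua configuration handler listening on the
	// status port inside the NGINX container.
	resp, err := http.Post("http://127.0.0.1:10246/configuration/backends",
		"application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated {
		return fmt.Errorf("unexpected status %d from Lua handler", resp.StatusCode)
	}
	return nil
}

func main() {
	backends := []Backend{{Name: "default-app-80", Endpoints: []string{"10.0.0.1:8080"}}}
	if err := syncBackends(backends); err != nil {
		log.Fatal(err)
	}
}
```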

While writing the template file is just a matter of a shared volume (an emptyDir, maybe) between both containers, starting/stopping/monitoring the process is going to be a bit more challenging. A rough sketch of the shared-volume Pod layout follows.
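
Here is a minimal sketch of that Pod layout, expressed with the k8s.io/api types. Container names, images, and the mount path are illustrative assumptions, not a final manifest:

```go
// Sketch of a Pod that runs the controller and NGINX as separate containers
// sharing /etc/nginx via an emptyDir. Names and images are placeholders.
package main

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func splitPod() *corev1.Pod {
	sharedMount := corev1.VolumeMount{Name: "nginx-conf", MountPath: "/etc/nginx"}
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "ingress-nginx-split"},
		Spec: corev1.PodSpec{
			Volumes: []corev1.Volume{{
				Name:         "nginx-conf",
				VolumeSource: corev1.VolumeSource{EmptyDir: &corev1.EmptyDirVolumeSource{}},
			}},
			Containers: []corev1.Container{
				{
					// Control plane: talks to the API server and writes the
					// rendered configuration into the shared volume.
					Name:         "controller",
					Image:        "example/ingress-nginx-controller:dev", // placeholder
					VolumeMounts: []corev1.VolumeMount{sharedMount},
				},
				{
					// Data plane: plain NGINX, no Kubernetes API access,
					// reads its configuration from the shared volume.
					Name:         "nginx",
					Image:        "example/nginx:dev", // placeholder
					VolumeMounts: []corev1.VolumeMount{sharedMount},
				},
			},
		},
	}
}

func main() { _ = splitPod() }
```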

I will use this issue to keep notes on implementation attempts, and ideas are welcome!

/kind feature
/priority important-longterm
/triage accepted

@rikatz rikatz added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 12, 2021
@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Dec 12, 2021
@rikatz
Contributor Author

rikatz commented Dec 12, 2021

First need: write manifests for two containers, one containing only the ingress controller and the other containing nginx. NGINX will need some dumb-init in front of it; otherwise reloading/killing the process can make respawn monitoring challenging.

Later: once everything is split, nginx and the controller need to be glued back together. The simplest approach would be to share the NGINX process namespace with the ingress controller (but not the reverse!) and check whether, with small changes, the controller can use the NGINX PID file (maybe shared as well?) to perform the process operations. A rough sketch of that PID-file approach is below.
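
A minimal sketch of that glue in Go, assuming shareProcessNamespace is enabled and nginx writes its PID to a file on the shared volume (the /etc/nginx/nginx.pid path is an assumption):

```go
// Sketch: reloading nginx from the controller container by signalling the
// master process found via a shared PID file.
package main

import (
	"log"
	"os"
	"strconv"
	"strings"
	"syscall"
)

const nginxPidFile = "/etc/nginx/nginx.pid" // assumed location on the shared emptyDir

func reloadNginx() error {
	raw, err := os.ReadFile(nginxPidFile)
	if err != nil {
		return err
	}
	pid, err := strconv.Atoi(strings.TrimSpace(string(raw)))
	if err != nil {
		return err
	}
	// SIGHUP asks the nginx master process to re-read its configuration
	// and gracefully replace the worker processes.
	return syscall.Kill(pid, syscall.SIGHUP)
}

func main() {
	if err := reloadNginx(); err != nil {
		log.Fatalf("reload failed: %v", err)
	}
}
```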

@rikatz
Contributor Author

rikatz commented Dec 12, 2021

Another thing we need to take care of: the SSLPassthrough proxy today runs inside the controller process. Following the same rationale of "the control plane does not receive any user traffic", we should probably think about moving the SSLPassthrough proxy to the proxy container. Maybe it can be the same container that runs nginx and also "controls" nginx.

@tao12345666333
Member

This is valuable.

I wonder whether the model can be simplified: use NGINX as a stateless container.

If it has a problem, just restart it.

Use the control plane to write the state to the data plane.

@ElvinEfendi
Member

@rikatz have you thought of taking a smaller step and extracting only the controller's API interaction logic first? Here's what I have in mind:

  • Extract the API interaction logic into a standalone component; call it the control plane. This component would do all the Kubernetes-y things the ingress-nginx controller does today, but it would additionally expose a gRPC streaming endpoint that the data plane can subscribe to in order to receive changes to the running instance of the Configuration struct.
  • Once that is extracted, the rest can be referred to as the ingress-nginx data plane. In addition to what it has today, this component would subscribe to the gRPC API exposed by the control plane and get the latest version of the running Configuration. The rest of the syncIngress logic would work the same (a rough sketch of such a subscription is after this list).
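
To make the subscription side concrete, a rough sketch in Go. All names here (ConfigEvent, ConfigStream, applyToNginx) are illustrative assumptions, not an existing ingress-nginx API; the stream interface only mirrors the shape a generated gRPC streaming client would have:

```go
// Hypothetical sketch: how the data plane might consume a configuration
// stream from the control plane.
package dataplane

import (
	"context"
	"io"
	"log"
)

// ConfigEvent would carry a serialized snapshot (or delta) of the running
// Configuration struct that syncIngress produces today.
type ConfigEvent struct {
	Payload []byte // e.g. JSON- or proto-encoded Configuration
}

// ConfigStream is the shape a generated gRPC streaming client would expose.
type ConfigStream interface {
	Recv() (*ConfigEvent, error)
}

// subscribe pulls configuration snapshots until the stream ends and hands
// each one to the local nginx management logic (applyToNginx is assumed).
func subscribe(ctx context.Context, stream ConfigStream, applyToNginx func([]byte) error) error {
	for {
		ev, err := stream.Recv()
		if err == io.EOF {
			return nil // control plane closed the stream
		}
		if err != nil {
			return err
		}
		if err := applyToNginx(ev.Payload); err != nil {
			log.Printf("failed to apply configuration: %v", err)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}
	}
}
```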

@theunrealgeek
Contributor

From what is discussed, I see the goal is to just split nginx and the controller process into two containers within the same Pod. However, from a scaling standpoint, it makes more sense to have a few controller Pods and lots of nginx Pods to handle traffic.

On that front I like the gRPC idea proposed above: it creates an interface between the control plane and the data plane, which can be co-located for now but, as the implementation evolves, eventually separate out.

@rikatz
Contributor Author

rikatz commented Dec 12, 2021

I LOVE the idea of a central gRPC control plane with the data plane subscribing to it. I actually discussed this approach with James and some other folks a while back; it was even seen as a way to make it easier to implement Gateway API, for example.

I just don't know where to start, but I can try (or are you willing to look into that? 🤔 )

I was just thinking of a way to make it a simple change (like sharing the PID and issuing reloads), but maybe creating a gRPC streaming endpoint and moving all the logic below syncIngress is actually better.

@rikatz
Contributor Author

rikatz commented Dec 12, 2021

I'm wondering if we should open a branch in ingress-nginx for this and work on that.

BTW there is prior art in kpng (https://github.com/kubernetes-sigs/kpng) doing the same thing for kube-proxy. I discussed it with folks in the past (👋 @jayunit100 @mcluseau ) and Jay actually asked whether the whole Gateway-API-ish control plane shouldn't be something like a kpng layer-7 controller.

Right now, I think the easier path is just to split our logic the way @ElvinEfendi suggested, with the main data-plane controller subscribing to the gRPC endpoint exposed by the control plane, but who knows the future and what lessons we will learn from it :)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 12, 2022
@strongjz strongjz added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 15, 2022
@strongjz strongjz added the area/stabilization Work for increasing stabilization of the ingress-nginx codebase label Jun 23, 2022
@haslersn
Contributor

In principle I like @ElvinEfendi's idea. It achieves the most important goal, namely that the data plane has no access to the Kubernetes API or to the control plane's service account token.

Also, there's no reason the data plane needs to be only an nginx process. It can readily have a management process that does some of the things the controller does today, as long as that doesn't require Kubernetes API access.

However, if the control plane provides a gRPC API that the data plane containers/Pods access, then that might still give an adversary a way to corrupt the control plane. I'm not an expert in gRPC, so please prove me wrong, but I can imagine something like the following is possible:

  1. The adversary first finds an RCE vulnerability in nginx in order to take over the nginx process.
  2. From there, the adversary has access to the control plane's gRPC API.
  3. The adversary also finds an RCE vulnerability in the gRPC server in order to take over the controller process.
  4. From there, the adversary has access to the service account token and the Kubernetes API.

Maybe something like this can be prevented by using an architecture where data can only be sent from the control plane to the data plane but not vice versa.

@strongjz
Member

/lifecycle frozen

@rikatz
Contributor Author

rikatz commented Jul 22, 2022

About RCEs, I partially agree, but I think that if we end up with an RCE in NGINX (plausible, given all the leaks and template-sanitization issues we have) plus another one in gRPC, we have a much bigger problem.

Also, as soon as we split all of this, I want to make sure the control plane image is distroless and only the control-plane binary runs there.

Finally, I was also thinking that we still need some countermeasures and architecture definitions:

  • should we rewrite the informer/lister to store only Secrets of type kubernetes.io/tls (and skip anything touching service accounts)? See the sketch after this list
  • should we initially plan for every Pod to be a combined control plane/data plane sharing the full state, and then move to an architecture where you may have X control planes and Y data planes? In that case, sharing the full state may have a huge performance impact
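
As a sketch of the first point, a client-go informer factory restricted to TLS Secrets via a field selector. This is one possible approach under that assumption, not a decision:

```go
// Sketch: a shared informer that only lists/watches Secrets of type
// kubernetes.io/tls, so the control plane never caches service account
// tokens or other Secret types.
package main

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func newTLSSecretInformerFactory() (informers.SharedInformerFactory, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	// Restrict every list/watch issued by this factory to TLS Secrets only.
	factory := informers.NewSharedInformerFactoryWithOptions(client, 10*time.Minute,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "type=kubernetes.io/tls"
		}))
	// The Secrets informer built from this factory never sees other types.
	_ = factory.Core().V1().Secrets().Informer()
	return factory, nil
}

func main() { _, _ = newTLSSecretInformerFactory() }
```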

I will keep posting updates and discussions here

@strongjz
Member

#8955

@rikatz
Contributor Author

rikatz commented Jul 16, 2023

I am thinking about this again, based on some recent discussions with @strongjz, on the new sidecar container feature (which we shouldn't rely on yet, as a lot of users still run old versions), and on some discussions on Twitter:

  • We can create a single Pod with two containers: one contains only the NGINX binary, the other contains the controller
  • We can create an emptyDir volume and mount it in both containers at /etc/nginx. The controller container should copy all the NGINX configuration (template, bootstrapped configuration, certificates, Lua code) to this shared directory
  • The NGINX container should be just that: an nginx container that starts and waits. There may be a problem here in environments that don't support the sidecar lifecycle: if the configuration is not there yet, the nginx container will fail to start. We need to verify whether the other path (the nginx container providing the configuration) is the desired approach
  • The Pod should use shareProcessNamespace so the controller can reload and kill NGINX <- this is a stupid idea, as nginx would be able to see the other container's filesystem through /proc
  • The NGINX container should have a really simple localhost listener (see the sketch after this list) that can:
    • Start/stop/reload NGINX
    • Report command execution results back to the controller
  • Points that need attention:
    • Generating new configurations and reloading
    • Getting configuration issues from the controller
    • Logs (need to check again how this is done and used)
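
A minimal sketch of such a localhost listener, assuming the NGINX container ships a tiny Go helper next to the nginx binary; the port and endpoint names are assumptions:

```go
// Sketch: a tiny agent inside the NGINX container exposing process control
// on localhost only. Endpoint names and the port are assumptions.
package main

import (
	"log"
	"net/http"
	"os/exec"
)

// runNginx shells out to the nginx binary and returns the combined output,
// so configuration/reload errors can be surfaced to the controller.
func runNginx(args ...string) ([]byte, error) {
	return exec.Command("nginx", args...).CombinedOutput()
}

func handle(w http.ResponseWriter, args ...string) {
	out, err := runNginx(args...)
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
	}
	_, _ = w.Write(out) // send nginx output back to the controller for diagnostics
}

func main() {
	mux := http.NewServeMux()
	// Note: /start is only a placeholder; a real implementation would
	// supervise nginx (run it in the foreground and respawn it on exit).
	mux.HandleFunc("/start", func(w http.ResponseWriter, r *http.Request) { handle(w) })
	mux.HandleFunc("/reload", func(w http.ResponseWriter, r *http.Request) { handle(w, "-s", "reload") })
	mux.HandleFunc("/stop", func(w http.ResponseWriter, r *http.Request) { handle(w, "-s", "quit") })
	mux.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) { handle(w, "-t") })

	// Listen on localhost only; the controller reaches this over the Pod's
	// shared network namespace, never over an external interface.
	log.Fatal(http.ListenAndServe("127.0.0.1:10300", mux))
}
```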

@tao12345666333
Member

I think we can discuss again what architecture we expect.

  • CP and DP are in the same Pod. That means we only need to consider how to reduce the possibility of the DP being attacked; in other words, how to increase its isolation.

(Two architecture screenshots attached, dated 2023-07-17.)

@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jul 16, 2024
@patrostkowski

Do you have any updates on this?
