
Validate user config options after Antrea controller/agent start #868

Closed
jianjuns opened this issue Jun 25, 2020 · 5 comments
Labels
enhancement: New feature or request; lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@jianjuns
Contributor

Describe the problem/challenge you have
Currently Antrea does not validate all user config options (e.g. serviceCIDR, defaultMTU). Misconfiguration might lead to Antrea not working as expected.

Describe the solution you'd like
Antrea controller and/or agent should validate all config options at startup, and fail with clear error logs when options are misconfigured in a way that would prevent Antrea from working properly.
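A minimal sketch of what such start-up validation could look like. Antrea itself is written in Go; Python is used here only for brevity, and the bounds checked below are illustrative assumptions, not Antrea's actual rules:

```python
import ipaddress

def validate_config(service_cidr: str, default_mtu: int) -> None:
    """Validate user config options at startup and fail fast with a
    clear message, instead of misbehaving later in obscure ways.
    Option names mirror the issue (serviceCIDR, defaultMTU); the MTU
    bounds below are illustrative, not Antrea's actual limits."""
    errors = []
    try:
        # strict=True rejects CIDRs with host bits set, e.g. 10.96.0.1/12
        ipaddress.ip_network(service_cidr, strict=True)
    except ValueError as e:
        errors.append(f"serviceCIDR {service_cidr!r} is not a valid CIDR: {e}")
    # 576 is the IPv4 minimum reassembly size; 9001 is the EC2 jumbo-frame MTU.
    if not 576 <= default_mtu <= 9001:
        errors.append(f"defaultMTU {default_mtu} outside sane range [576, 9001]")
    if errors:
        raise ValueError("invalid configuration:\n  " + "\n  ".join(errors))
```

Collecting all errors before raising (rather than stopping at the first) lets the user fix every misconfigured option in one pass.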

@jianjuns jianjuns added the enhancement New feature or request label Jun 25, 2020
@shapirov103

shapirov103 commented Jun 25, 2020

Ideally, combining validation with automatic CIDR discovery (failing if the discovery attempt is unsuccessful) would give the best user experience.

CIDR discovery may be platform-specific. For example, on EKS the standard `kubectl cluster-info dump` does not report the CIDR (perhaps only when the parameter is not set explicitly); however, it is returned in the error response when a Service is created with its ClusterIP set to an invalid value.
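As a sketch of the trick described above: creating a Service with a deliberately invalid ClusterIP makes the apiserver reject it with an error naming the valid range, which can then be parsed out. The exact error wording varies by Kubernetes version, so the sample message below is an assumption to verify per cluster (Python for brevity):

```python
import re

# Example kube-apiserver rejection when a Service requests a ClusterIP
# outside the service CIDR. The exact wording is version-dependent, so
# treat this sample as an assumption to be checked on the target cluster.
SAMPLE_ERROR = (
    'The Service "probe" is invalid: spec.clusterIP: Invalid value: '
    '"1.1.1.1": provided IP is not in the valid range. '
    'The range of valid IPs is 10.100.0.0/16'
)

def extract_service_cidr(api_error: str):
    """Pull the service CIDR out of the apiserver's rejection message,
    or return None if the message does not match the expected pattern."""
    m = re.search(r"valid IPs is (\S+)", api_error)
    return m.group(1) if m else None
```

Returning None on a non-matching message (rather than raising) lets the caller fall back to requiring an explicit serviceCIDR setting.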

MTU discovery could also be cluster-specific. The default on EKS is 9001; it would be great to update the EKS instructions with that default value (based on https://docs.aws.amazon.com/eks/latest/userguide/cni-env-vars.html).

@shapirov103

@jianjuns if you create a separate issue for automatic CIDR discovery (even if it is done similarly to what I described), please let me know.

As for the MTU parameter: maybe in the EKS documentation we should state that the default is 9001 (inherited from EC2). If there is a way to also discover it automatically (I just used `ip link` in the pod), then EKS installation becomes super simple: just apply the deployment YAML.
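For illustration, the `ip link` lookup mentioned above can also be done programmatically by reading Linux sysfs, which exposes the same MTU value. A hedged sketch (the sysfs path is standard on Linux, but which interface to query is cluster-specific, and `eth0` below is just an example name):

```python
from pathlib import Path

def discover_mtu(iface: str, sysfs: Path = Path("/sys/class/net")) -> int:
    """Return the MTU of `iface` as reported by Linux sysfs, i.e. the
    same value `ip link` prints. The `sysfs` root is a parameter so
    tests can point it at a fake tree; on a real node the default
    /sys/class/net path is used."""
    return int((sysfs / iface / "mtu").read_text().strip())
```

A real implementation would also need to pick the right uplink interface and subtract any encapsulation overhead, which is out of scope for this sketch.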

@jianjuns
Contributor Author

jianjuns commented Jul 1, 2020

> @jianjuns if you create a separate issue for automatic CIDR discovery (even if it is done similarly to what I described), please let me know.

For EKS (networkPolicyOnly mode), we plan to switch to OVS kube-proxy soon, which will remove the serviceCIDR parameter requirement.
For other cases, the problem is that we still see no good way to discover serviceCIDR directly. @tnqn once looked into this and can provide more comments.

> As for the MTU parameter: maybe in the EKS documentation we should state that the default is 9001 (inherited from EC2). If there is a way to also discover it automatically (I just used `ip link` in the pod), then EKS installation becomes super simple: just apply the deployment YAML.

For MTU, @reachjainrahul is working on MTU auto-discovery, and I think it will be available soon. We will update the documentation along with that. @reachjainrahul: I remember you have an issue for MTU discovery, but I cannot find it now?

@jianjuns
Contributor Author

Update on the progress:

- PR #909 adds support for MTU auto-discovery and removes the MTU config parameter.
- PR #1015 changes the noEncap (used by Antrea for GKE), hybrid, and networkPolicyOnly (used in AKS and EKS) modes to use the Antrea native proxy, and removes the ServiceCIDR config parameter for these modes.

But let's keep this issue open until we remove or validate the ServiceCIDR parameter for encap mode.

@github-actions
Contributor

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment, or this will be closed in 180 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 22, 2021