Support specifying SecurityContext for Pods and enable tcp keepalive for AWS #915
Conversation
…elet in terraform Signed-off-by: Aylei <rayingecho@gmail.com>
- name: net.ipv4.tcp_keepalive_time
  value: "300"
- name: net.ipv4.tcp_keepalive_intvl
  value: "300"
Send a keepalive packet every 300s to survive the fixed 350s idle timeout of the AWS NLB.
`net.ipv4.tcp_keepalive_intvl` defaults to 75 seconds, do you think it's necessary to increase it?
I have no real preference, actually; I just want to make sure the keepalive probe interval is less than 350s regardless of the kernel's compile-time defaults.
I prefer not to change it. To prevent the connection from being closed by a load balancer with a shorter timeout, setting `net.ipv4.tcp_keepalive_time` is enough. `net.ipv4.tcp_keepalive_intvl` determines when an unresponsive connection will be aborted; increasing it increases how long the connection is kept on the server side.
Good point! But `net.ipv4.tcp_keepalive_intvl` determines the interval of the subsequent probes, so it should be less than 350s too. I would like to set it to 75s explicitly (the well-known default). What do you think?
it's ok
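For reference, the settings agreed on above would look roughly like this when expressed as Pod-level sysctls (a sketch of the intended end state, not the exact diff in this PR):

```yaml
# Sketch: keepalive sysctls in a Pod securityContext (Kubernetes 1.12+ shape).
securityContext:
  sysctls:
  - name: net.ipv4.tcp_keepalive_time    # start probing after 300s idle, below the 350s NLB timeout
    value: "300"
  - name: net.ipv4.tcp_keepalive_intvl   # subsequent probes every 75s, the well-known kernel default
    value: "75"
```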
@@ -359,7 +359,7 @@ func (tkmm *tikvMemberManager) getNewSetForTidbCluster(tc *v1alpha1.TidbCluster)
    SchedulerName: tc.Spec.SchedulerName,
    Affinity:      tc.Spec.TiKV.Affinity,
    NodeSelector:  tc.Spec.TiKV.NodeSelector,
    HostNetwork:   tc.Spec.PD.HostNetwork,
This was a typo, I suppose, wasn't it?
@cofyc
yes, thanks!
Co-Authored-By: weekface <weekface@gmail.com>
/run-e2e-in-kind
Signed-off-by: Aylei <rayingecho@gmail.com>
/run-e2e-in-kind
[
  "--allowed-unsafe-sysctls=\\\"net.*\\\"",
is PodSecurityPolicy required? https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/#podsecuritypolicy
No, by default all the sysctls are allowed by PodSecurityPolicy
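For clusters that do enforce a restrictive PodSecurityPolicy, unsafe sysctls would also need to be allowed there; a minimal policy might look roughly like this (a sketch, not part of this PR):

```yaml
# Sketch: a PodSecurityPolicy (policy/v1beta1) allowing the net.* sysctls.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: allow-net-sysctls
spec:
  allowedUnsafeSysctls:
  - "net.*"
  # Remaining required PSP fields, kept permissive for brevity:
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
```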
sysctls:
- name: net.ipv4.tcp_keepalive_time
  value: "300"
- name: net.ipv4.tcp_keepalive_intvl
Is it OK to add "net.core.somaxconn" here? It's 128 in the container now.
Should we put these as default configs in the values.yaml of the tidb-cluster chart?
No, these configurations are specific to AWS.
For `net.core.somaxconn`, I think it's a different problem and can be addressed in a separate issue.
BTW, what's the proper value of `net.core.somaxconn`?
`net.core.somaxconn` is a general issue, so I think we can set this in the tidb-cluster chart's values.yaml.
Not possible for now: it's marked unsafe because of a kernel memory accounting issue, so it must be whitelisted via a kubelet flag; otherwise the Pod will fail to start.
Oh, I misunderstood here. It's OK to add it in the deploy/aws values.yaml file (what I meant is that it cannot be set in the charts/tidb-cluster default values.yaml file).
Most of the kernel parameters in the prerequisites document are namespaced. It seems we should configure the safe ones for users by default and add documentation on how to configure these parameters via the pod security context. Tracked in: #924
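To illustrate the split (as documented by Kubernetes, not something introduced by this PR): safe, namespaced sysctls such as `net.ipv4.ip_local_port_range` can already be set from the pod security context without any kubelet whitelist, for example:

```yaml
# Sketch: a safe namespaced sysctl, settable without --allowed-unsafe-sysctls.
securityContext:
  sysctls:
  - name: net.ipv4.ip_local_port_range
    value: "1024 65535"
```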
/run-e2e-in-kind
LGTM (except deploy/aws, which I'm not familiar with).
@tennix @DanielZhangQD PTAL again
LGTM
LGTM
/run-e2e-in-kind
…nd enable net.* (#954) * Support configuring sysctls for Pods and enable net.* sysctls for kubelet in terraform Signed-off-by: Aylei <rayingecho@gmail.com> * Apply suggestions from code review Co-Authored-By: weekface <weekface@gmail.com> * Address review comments Signed-off-by: Aylei <rayingecho@gmail.com>
…upstream-release-1.0
…upstream-release-1.0
…nd enable net.* (#1175) * Apply suggestions from code review Co-Authored-By: weekface <weekface@gmail.com> * Address review comments Signed-off-by: Aylei <rayingecho@gmail.com>
Signed-off-by: Aylei rayingecho@gmail.com
What problem does this PR solve?
close #880
close #795
What is changed and how does it work?
A new field, `podSecurityContext`, is introduced for TiKV/TiDB/PD's spec, which can specify sysctls for Pods. Only the securityContext of TiDB is used for now, but users can freely customize these fields as needed.

In terraform, enable configuration of `net.*` sysctls in the kubelet args, and set proper defaults for AWS.

Check List
Tests
Tested against an AWS NLB with a 350s idle timeout:
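(A rough sketch of this kind of check, not necessarily the author's exact procedure: hold a session through the NLB idle for longer than 350s and confirm it survives.)

```sh
# Hypothetical idle-connection check through the NLB (placeholder endpoint);
# the connection carries no traffic for ~400s > 350s, so it only survives
# if the server-side keepalive probes keep the NLB flow alive.
mysql -h <nlb-endpoint> -P 4000 -u root -e "SELECT SLEEP(400); SELECT 1;"
```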
Verify the sysctls are properly set:
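One way to check them from inside the running Pod, with placeholder names (not necessarily the commands used by the author):

```sh
# Hypothetical verification of the applied sysctls:
kubectl exec -n <namespace> <tidb-pod-name> -- \
  sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl
# Expected output (assuming the values discussed in this PR):
#   net.ipv4.tcp_keepalive_time = 300
#   net.ipv4.tcp_keepalive_intvl = 75
```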
Code changes
Related changes
Does this PR introduce a user-facing change?: