Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support specifying SecurityContext for Pods and enable tcp keepalive for AWS #915

Merged
merged 6 commits into from
Sep 18, 2019

Conversation

aylei
Copy link
Contributor

@aylei aylei commented Sep 15, 2019

Signed-off-by: Aylei rayingecho@gmail.com

What problem does this PR solve?

close #880
close #795

What is changed and how does it work?

A new field podSecurityContext is introduced for TiKV/TiDB/PD's spec which can specify sysctls for Pods. Only the securityContext of TiDB is used now, but users can freely customize these fields as needed.
In terraform, enable configuration of net.* sysctls in kubelet args, and set proper defaults for AWS.

Check List

Tests

  • Manual test (add detailed scripts or steps below)

Tested upon AWS NLB with 350s idle timeout:

$ mysql -h <elb-host> -P 4000 -u root
MySQL [(none)]> select sleep(360); select tidb_version();
+------------+
| sleep(360) |
+------------+
|          0 |
+------------+
1 row in set (6 min 0.00 sec)

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version()                                                                                                                                                                                                                                                                                              |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v3.0.1
Git Commit Hash: 9e4e8da3c58c65123db5f26409759fe1847529f8
Git Branch: HEAD
UTC Build Time: 2019-07-16 01:03:40
GoVersion: go version go1.12 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Verify the sysctls are properly set:

kubectl -n my-cluster exec -it my-cluster-tidb-1 -c tidb -- sh
/ # sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 300
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 300

Code changes

  • Has Helm charts change
  • Has Go code change

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Does this PR introduce a user-facing change?:

Support specifying SecurityContext for PD, TiKV and TiDB Pods and enable tcp keepalive for AWS.

…elet in terraform

Signed-off-by: Aylei <rayingecho@gmail.com>
- name: net.ipv4.tcp_keepalive_time
value: "300"
- name: net.ipv4.tcp_keepalive_intvl
value: "300"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

send keepalive packet every 300s to survive the 350s fixed idle timeout of AWS NLB.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

net.ipv4.tcp_keepalive_intvl defaults to 75 seconds, do you think it's necessary to increase it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no preference actually, just want to make sure the heartbeat packet interval is less 350 despite the information from kernel compiling time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer not to change it. To prevent the connection from being closed by the load balancer which has shorter timeout, setting net.ipv4.tcp_keepalive_time is enough. net.ipv4.tcp_keepalive_intvl determines when the unresponding connection will be aborted. Increase it will increase the time the connection is kept on the server-side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! But net.ipv4.tcp_keepalive_intvl determines the interval of subsequent probes, so it should be less than 350s too. I would like to set it to 75s explicitly (as the well-known defaults), how do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's ok

@@ -359,7 +359,7 @@ func (tkmm *tikvMemberManager) getNewSetForTidbCluster(tc *v1alpha1.TidbCluster)
SchedulerName: tc.Spec.SchedulerName,
Affinity: tc.Spec.TiKV.Affinity,
NodeSelector: tc.Spec.TiKV.NodeSelector,
HostNetwork: tc.Spec.PD.HostNetwork,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a typo I suppose, is it?
@cofyc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks!

charts/tidb-cluster/values.yaml Outdated Show resolved Hide resolved
charts/tidb-cluster/values.yaml Outdated Show resolved Hide resolved
aylei and others added 2 commits September 16, 2019 11:24
@aylei
Copy link
Contributor Author

aylei commented Sep 16, 2019

/run-e2e-in-kind

Signed-off-by: Aylei <rayingecho@gmail.com>
@aylei
Copy link
Contributor Author

aylei commented Sep 16, 2019

/run-e2e-in-kind

@aylei aylei requested review from weekface and cofyc September 16, 2019 08:55
[
"--allowed-unsafe-sysctls=\\\"net.*\\\"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, by default all the sysctls are allowed by PodSecurityPolicy

sysctls:
- name: net.ipv4.tcp_keepalive_time
value: "300"
- name: net.ipv4.tcp_keepalive_intvl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK to add "net.core.somaxconn" here? It's 128 in container now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we put these as default configs in values.yaml of tidb-cluster chart?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, these configurations are specific to AWS.

For net.core.somaxconn, I think it's another problem and can be addressed in an separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, what's the proper value of net.core.somaxconn?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The net.core.somaxconn is a general issue, so I think we can set this in the tidb-cluster chart values.yaml

Copy link
Contributor

@cofyc cofyc Sep 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not possible for now, it's marked unsafe because of kernel memory accounting issue, so it must be whitelisted via kubelet flag, otherwise the pod will fail to start

Copy link
Contributor

@cofyc cofyc Sep 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I misunderstood here, it's ok to add in deploy/aws values.yaml file (what I meant is it cannot be set in charts/tidb-cluster default values.yaml file)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the kernel parameters in the prerequisites document are namespaced, seems like we should configure the safe part for users by default and add document about how to configure these parameters via pod security context. Tracked in: #924

deploy/modules/aws/tidb-cluster/local.tf Show resolved Hide resolved
charts/tidb-cluster/values.yaml Show resolved Hide resolved
@aylei
Copy link
Contributor Author

aylei commented Sep 17, 2019

/run-e2e-in-kind

Copy link
Contributor

@cofyc cofyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (except deploy/aws which I'm not familiar with)

@aylei
Copy link
Contributor Author

aylei commented Sep 17, 2019

@tennix @DanielZhangQD PTAL again

Copy link
Member

@tennix tennix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@DanielZhangQD DanielZhangQD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aylei
Copy link
Contributor Author

aylei commented Sep 18, 2019

/run-e2e-in-kind

@aylei aylei merged commit 1423093 into pingcap:master Sep 18, 2019
aylei added a commit that referenced this pull request Sep 25, 2019
…nd enable net.* (#954)

* Support configuring sysctls for Pods and enable net.* sysctls for kubelet in terraform

Signed-off-by: Aylei <rayingecho@gmail.com>

* Apply suggestions from code review

Co-Authored-By: weekface <weekface@gmail.com>

* Address review comments

Signed-off-by: Aylei <rayingecho@gmail.com>
aylei added a commit to aylei/tidb-operator that referenced this pull request Nov 18, 2019
aylei added a commit to aylei/tidb-operator that referenced this pull request Nov 18, 2019
aylei added a commit that referenced this pull request Nov 19, 2019
…nd enable net.* (#1175)

* Apply suggestions from code review

Co-Authored-By: weekface <weekface@gmail.com>

* Address review comments

Signed-off-by: Aylei <rayingecho@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants