-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add script for migrating from other CNI to Antrea #5677
Conversation
294746a
to
18d73e3
Compare
18d73e3
to
97cb804
Compare
Please update the title and comment: s/scrtip/script/ |
97cb804
to
3f8c169
Compare
Sure. I will add more descriptions for the |
fbc3a98
to
6356dbe
Compare
ccd7f28
to
b288390
Compare
|
||
if len(rule) != 0 { | ||
antreaRules = append(antreaRules, | ||
v1beta1.Rule{Action: migrate.ToPtr(v1beta1.RuleActionReject)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this, why Reject rule is added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Calico NP, if one NP is applied to some Pods, then these Pods' ingress and egress traffic will be rejected except Allow
fields in that NP, so I added a default Reject rule.
@@ -1,4 +1,4 @@ | |||
// Copyright 2021 Antrea Authors | |||
// Copyright 2023 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reformated this file accidentally, resulting in the import files being sorted. Then make code-gen
command will edit this file every time even if I undo my format operation. Maybe I could edit the year manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's kind of reformat are you referring to, but I feel it's better to avoid this change if there is no code change on the CRD.
904508a
to
370b89c
Compare
21343f2
to
706c955
Compare
docs/migrate-to-antrea.md
Outdated
|
||
The migration process is divided into three steps: | ||
|
||
1. Clean up old CNI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the old CNI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The" Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably missed this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, thanks for the reminder. Addressed.
7d5bc4e
to
266531a
Compare
build/yamls/antrea-migrator.yml
Outdated
- apiGroups: | ||
- "" | ||
resources: | ||
- pods | ||
verbs: | ||
- create | ||
- apiGroups: | ||
- "" | ||
resources: | ||
- pods/exec | ||
verbs: | ||
- create | ||
- apiGroups: | ||
- "apps/v1" | ||
resources: | ||
- daemonsets | ||
verbs: | ||
- create |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does it require these permissions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed these rules.
docs/migrate-to-antrea.md
Outdated
After Antrea is installed in the cluster, the next step is to restart all | ||
Pods in the cluster in-place by the following command. This step will create | ||
a DaemonSet in the cluster, which will restart all Pods in the cluster in-place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After Antrea is up and running, you can now deploy Antrea migrator by the following command. The migrator runs as a DaemonSet, antrea-migrator
, in the cluster, which will restart all non hostNetwork Pods in the cluster in-place and perform necessary network resource cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
docs/migrate-to-antrea.md
Outdated
network management and IPAM from the old CNI. In order to avoid the Pods | ||
being rescheduled, we restart all Pods in-place by deploying a DaemonSet | ||
named `antrea-migrator`, which will run a Pod on each Node. The | ||
`antrea-migrator` Pod will stop all containerd tasks of Pods with CNI network on | ||
each Node, and the containerd tasks will be restarted by the containerd | ||
service. In this way, all Pods in the cluster will not be rescheduled but will | ||
be restarted in-place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
network management and IPAM from the old CNI. In order to avoid the Pods | |
being rescheduled, we restart all Pods in-place by deploying a DaemonSet | |
named `antrea-migrator`, which will run a Pod on each Node. The | |
`antrea-migrator` Pod will stop all containerd tasks of Pods with CNI network on | |
each Node, and the containerd tasks will be restarted by the containerd | |
service. In this way, all Pods in the cluster will not be rescheduled but will | |
be restarted in-place. | |
network management and IPAM from the old CNI. In order to avoid the Pods | |
being rescheduled and minimize service downtime, the migrator restarts | |
all non-hostNetwork Pods in-place by restarting their sandbox container. | |
Therefore, it's expected to see these Pods' `RESTARTS` being increased | |
by 1 like below: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
hack/release/prepare-assets.sh
Outdated
@@ -107,6 +107,8 @@ export IMG_TAG=$VERSION | |||
export IMG_NAME=projects.registry.vmware.com/antrea/antrea-ubuntu | |||
./hack/generate-standard-manifests.sh --mode release --out "$OUTPUT_DIR" | |||
|
|||
sed "s|antrea\/antrea-migrator:latest|$(echo $IMG_NAME | sed "s/ubuntu/migrator/"):$VERSION|g" ./build/yamls/antrea-migrator.yml > "$OUTPUT_DIR"/antrea-migrator.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this line since we do not need to add it to release asserts.
docs/migrate-to-antrea.md
Outdated
a DaemonSet in the cluster, which will restart all Pods in the cluster in-place. | ||
|
||
```bash | ||
$ kubectl apply -f https://github.com/antrea-io/antrea/releases/download/v1.15.0/antrea-migrator.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ kubectl apply -f https://github.com/antrea-io/antrea/releases/download/v1.15.0/antrea-migrator.yml | |
$ kubectl apply -f https://github.com/antrea-io/antrea/main/build/yamls/antrea-migrator.yml |
Like antrea-aks-node-init.yml
, the resource is version-generic and case-specific, I think we don't need to add it to release assets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to main/build/yamls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nits
@@ -0,0 +1,33 @@ | |||
# Copyright 2023 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to 2024.
docs/migrate-to-antrea.md
Outdated
|
||
After Antrea is installed in the cluster, the next step is to restart all | ||
Pods in the cluster in-place by the following command. This step will create | ||
a DaemonSet in the cluster, which will restart all Pods in the cluster in-place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe highlight antrea-migrator is supported since v1.15.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. A description of the version is added in front of this doc.
hack/release/prepare-assets.sh
Outdated
@@ -107,6 +107,8 @@ export IMG_TAG=$VERSION | |||
export IMG_NAME=projects.registry.vmware.com/antrea/antrea-ubuntu | |||
./hack/generate-standard-manifests.sh --mode release --out "$OUTPUT_DIR" | |||
|
|||
sed "s|antrea\/antrea-migrator:latest|$(echo $IMG_NAME | sed "s/ubuntu/migrator/"):$VERSION|g" ./build/yamls/antrea-migrator.yml > "$OUTPUT_DIR"/antrea-migrator.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use "projects.registry.vmware.com/antrea/antrea-migrator:$VERSION" directly to replace antrea-migrator:latest
? There is an on-going change #5794 to split agent and controller images. Maybe better to have your own image string in case there is conflict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this line since we do not need to add it to release asserts.
266531a
to
0fcc98b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM
0fcc98b
to
0561773
Compare
docs/migrate-to-antrea.md
Outdated
migrate-example-6d6b97f96b-jpflg 1/1 Running 1 (23s ago) 2m5s 10.10.1.5 test-worker <none> <none> | ||
``` | ||
|
||
When we meet the condition that all Pods of antrea-migrator are in `Running` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
antrea-migrator -> antrea-migrator
Please change all occasions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, as well as the the other occasions.
docs/migrate-to-antrea.md
Outdated
The reason for restarting all Pods is that Antrea needs to take over the | ||
network management and IPAM from the old CNI. In order to avoid the Pods | ||
being rescheduled and minimize service downtime, the migrator restarts | ||
all non-hostNetwork Pods in-place by restarting their sandbox container. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
container -> containers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
docs/migrate-to-antrea.md
Outdated
migrate-example-6d6b97f96b-jpflg 1/1 Running 1 (23s ago) 2m5s 10.10.1.5 test-worker <none> <none> | ||
``` | ||
|
||
When we meet the condition that all Pods of antrea-migrator are in `Running` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"When the antrea-migrator
Pods on all Nodes are"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
docs/migrate-to-antrea.md
Outdated
|
||
When we meet the condition that all Pods of antrea-migrator are in `Running` | ||
state, the migration process is completed. You can then remove the antrea-migrator | ||
DaemonSet safely by the following command: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by -> with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two nits.
Btw, your DCO check failed, please check if your sign-off info is missing.
9c8fd50
to
19c95cd
Compare
19c95cd
to
f5bcc9c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one nit.
f5bcc9c
to
4f106a5
Compare
@jianjuns @antoninbas let me know if you will take another look, thanks. |
I have no extra comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple nits, otherwise lgtm
build/yamls/antrea-migrator.yml
Outdated
command: | ||
- "sleep" | ||
- "infinity" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's nicer to default to the pause command in your container image
the pause command handles signals well, unlike sleep. Deleting Pods where a container uses sleep infinity
can take 30s (default grace period IIRC).
see https://github.com/antrea-io/image-utils/blob/25f87067c80ab22c0b8574e1bb13cd859f669707/images/toolbox/Dockerfile#L60-L62
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/pause
is added to the image and the command of the YAML file is removed.
docs/migrate-to-antrea.md
Outdated
NOTE: The following is a reference list of CNIs and versions we have already | ||
verified the migration process. CNIs and versions that are not listed here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following is a reference list of CNIs and versions for which we have verified the migration process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
docs/migrate-to-antrea.md
Outdated
network management and IPAM from the old CNI. In order to avoid the Pods | ||
being rescheduled and minimize service downtime, the migrator restarts | ||
all non-hostNetwork Pods in-place by restarting their sandbox containers. | ||
Therefore, it's expected to see these Pods' `RESTARTS` being increased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's expected to see the RESTARTS
count for these Pods being increased
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Add a YAML file antrea-migrator.yml to migrate clusters with other CNIs to Antrea. It will restart all Pods in-place. A new image "antrea-migrator" is responsible for restarting all Pods on each Nodes. It is a DaemonSet that tries to kill sandboxes to restart the Pods in-place. Signed-off-by: hjiajing <hjiajing@vmware.com>
4f106a5
to
36689d9
Compare
/skip-all |
A new image "antrea-migrator" is responsible for restarting all Pods on each Nodes. It is a DaemonSet that tries to kill sandboxes to restart the Pods in-place. In this way, the cluster could be migrated from old CNIs to Antrea.