Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

entrypoint: use exec to run Device Plugin #284

Merged
merged 1 commit into from
Nov 18, 2020

Conversation

amorenoz
Copy link
Contributor

On Pod / DaemonSet termination, Kubernetes sends SIGTERM to the first
process (pid 1).

In order to ensure the SR-IOV Device Plugin daemon receives the signal
and it can gracefully clean up, use "exec" in the entrypoint script.

Signed-off-by: Adrian Moreno amorenoz@redhat.com

On Pod / DaemonSet termination, Kubernetes sends SIGTERM to the first
process on each container (pid 1).

In order to ensure the SR-IOV Device Plugin daemon receives the signal
and it can gracefully clean up, use "exec" in the entrypoint script.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Copy link
Member

@martinkennelly martinkennelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works as expected. I see in sriov dp daemonset example we request host PID namespace. I want to understand if this is needed, because this patch will be meaningless with that daemon set spec however this should still be fixed.

@adrianchiris
Copy link
Contributor

@ahalim-intel @zshi-redhat any idea why hostPID would be needed ?

also i see when Openshift sriov-network-operator deploys sriov device plugin it does so without hostPID attribute
https://github.com/openshift/sriov-network-operator/blob/d0a907d29574c196a0bc0eaba3e4ded73c77ebb1/bindata/manifests/plugins/sriov-device-plugin.yaml#L25

@martinkennelly i think when hostPID is set, the POD will use the host PID namespace, so when you run exec it should still allocate a PID from the host PID namespace, when you ran locally you saw otherwise ?

@martinkennelly
Copy link
Member

@adrianchiris When hostPID is set, you share the same PID namespace as the underlying system from which the container was spawned. When I tested this, I could see all the underlying system processes. Exec allocated a new PID.
I am not sure how k8 can gracefully kill the SRIOV DP process when sharing underlying systems PID namespace. I will test this and get back to you.

@ahalimx86
Copy link
Collaborator

@ahalim-intel @zshi-redhat any idea why hostPID would be needed ?

also i see when Openshift sriov-network-operator deploys sriov device plugin it does so without hostPID attribute
https://github.com/openshift/sriov-network-operator/blob/d0a907d29574c196a0bc0eaba3e4ded73c77ebb1/bindata/manifests/plugins/sriov-device-plugin.yaml#L25

@martinkennelly i think when hostPID is set, the POD will use the host PID namespace, so when you run exec it should still allocate a PID from the host PID namespace, when you ran locally you saw otherwise ?

IIRC - the hostPID was needed to be able enumerate host devices in order to discover all the devices. Running just privileged Pod and hostNetwork was not enough. However as I can see DP is able to discover devices without it. So, most likely it's not needed. If this is confirmed then we should remove this.

We should also review other security contexts given to the DP daemonset Pod and only allow what is absolutely needed.

@martinkennelly
Copy link
Member

martinkennelly commented Nov 18, 2020

@adrianchiris It doesn't matter if hostpid is enabled or disabled, SRIOV DP receives SIGTERM signal correctly now with this patch.
Previous to this patch, SRIOV DP was not receiving SIGTERM directly.

@amorenoz
Copy link
Contributor Author

amorenoz commented Nov 18, 2020

@adrianchiris I think hostNetwork is needed to open netlink sockets, IIRC
@martinkennelly: sorry for the confusion I might have created. I would guess that the signal will be delivered to "the process started by the container", no matter what PID it has or what PID namespace it lives in. So, I believe forking in the entrypoint script will anyhow make the DevicePlugin miss the signal.
In fact, I have tested it using the example yaml and seen such behaviour.

@adrianchiris adrianchiris self-requested a review November 18, 2020 12:38
Copy link
Contributor

@adrianchiris adrianchiris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the clarifications !

hostNetwork is needed to discover devices (e.g given PF Name in selector).
i think we can probably get away with removing the hostPID from the example, ill deploy without to make sure it works and post a PR on it.

anyway for this im LGTM , thanks.

@adrianchiris adrianchiris merged commit 25a64d1 into k8snetworkplumbingwg:master Nov 18, 2020
@zshi-redhat
Copy link
Collaborator

thanks for the fix!
I think that explains why it takes a long time for device plugin pod to be terminated in sriov-operator.

@amorenoz
Copy link
Contributor Author

@zshi-redhat Yes. The timeout if the pod does not die gracefully defaults to 30s!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants