-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Third party container management using the Sonic Application Framework #1286
Conversation
sg893052
commented
Mar 15, 2023
•
edited
Loading
edited
Repo | PR title | State |
---|---|---|
sonic-utilities | TPCM support in Sonic Package Manager | |
sonic-buildimage | Additional changes for TPCM support |
681d1be
to
e8caba9
Compare
Reviewed in community meeting. Recording https://zoom.us/rec/share/EXGfLxRpPT6buxU2we1OAKFqj-Nsq6vdQywuABzI7JhxNRcK-O_DSXY6xC8xVRUh.h2xG7KEZxjHlmV87 |
Microsoft and Nvidia will be the reviewer for this feature. |
@zhangyanzhao Please add me to the review. |
@zhangyanzhao can you add @stepanblyschak as a reviewer? it should not be me :) |
- Dynamically install TPC from docker registry and docker image from the local file system, SCP, SFTP, URL. These remote download options are introduced now to SONiC AEF. | ||
- Upgrade TPCs from docker registry and docker image from local file system, SCP, SFTP, URL | ||
- Provide default manifest file for docker image without manifest during install. | ||
- Provide runtime install capability to pass various docker startup arguments and parameters through local custom manifest file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have the capability to run the script inside the TPC? e.g We may need to mount the specific config file changes in the host and use it inside TPC or for some dependencies I may need to execute some script before starting services inside the container, do we have an option to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes ,capability to run the script inside TPC is supported
SONiC Application Extension Infrastructure provides the framework to integrate SONiC compatible dockers. However there are many open source third party applications which can be installed on SONiC system, to extend the capabilities, and these typically are standalone or have less interaction with the SONiC system itself. So it is not necesary to pre define these docker applications in SONiC build, and provide their corresponding manifest.json. These TPCs can be installed dynamically on a SONiC device and can be managed. | ||
|
||
- This feature enables the installation and management of third-party Docker packages without manifest files through the Sonic Package Manager. | ||
- Also, it manages the TPC without affecting the sonic system resources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the connectivity for the TPC to the outside world? hope we dont expose the entire host networking to the TPC. What if my requirement is to run the TPC on the mgmt VRF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TPC with mgmt VRF support can be provided later but currently we don't support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not captured in the HLD, please add it in the restrictions/limitations section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Included in the Limitations section
- The TPCM support feature will leverage the existing Sonic Package Manager framework. | ||
- Ability to download image tarballs for the Docker packages through SCP, SFTP, and URL before installing them is introduced. | ||
- If a manifest file is not found in the docker image during installation, a default local TPC manifest file is used to install the package. | ||
- A Docker image that does not contain a manifest file will be considered a Third-Party Container (TPC). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, we should not differentiate like this, wherever possible we should design generically, in this case, App ext can be extended to support docker image without manifest file as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
"show": "", | ||
"clear": "" | ||
}, | ||
"tpcm": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we need this tpcm flag? why cant have the generic design?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can have a flag as "type" instead of "tpcm"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cant we skip type field altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skipped as suggested
###### Examples | ||
|
||
``` | ||
admin@sonic:~$ sudo sonic-package-manager manifests create my_node_exporter --tpcm --container memory 50m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we make sure that TPC is not hijacking any of the control package meant for processing by SONiC system? do we have any validation in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TPC docker wont run in "privileged" mode and runs with the default security settings and won't have escalated privileges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can skip '--tpcm' option in the command, would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skipped as suggested
- Specify system resource limits for TPCs to restrict CPU, Memory usage | ||
- Provide update capability to update various TPC configurations like their memory, cpu, dependencies etc. This capability is new to SONiC AEF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think various system resource limits for container shall be part of the manifest. The manifest, once associated with a docker container, will be the same for all devices, however, some devices have more powerful hardware and these limits may not be relevant for them. Instead, resources limits configuration is up to the user using a particular device. This configuration can be done e.g. through FEATURE table and for all containers, not just TPC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TPC overall memory limit can be restricted to 20% of system memory ( say, 3087MB from 15G system ram).
By default, if individual TPC memory limit is not specified, 20% of TPC overall memory limit can be assigned ( say, 617MB from 3087MB in a 15G system ram).
Idea is that summation of the memory limits configured for all TPCs must not go beyond the overall TPC memory limit set.
This way the resource limits gets adjusted as per the underlying hardware.
Agreed that configuration could be placed in config db.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Could you please update the document to make memory/cpu limits configuration part of config db?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.
|
||
### High-Level Design | ||
|
||
SONiC Application Extension Infrastructure provides the framework to integrate SONiC compatible dockers. However there are many open source third party applications which can be installed on SONiC system, to extend the capabilities, and these typically are standalone or have less interaction with the SONiC system itself. So it is not necesary to pre define these docker applications in SONiC build, and provide their corresponding manifest.json. These TPCs can be installed dynamically on a SONiC device and can be managed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it is not necesary to pre define these docker applications in SONiC build,
This is not true for regular application extensions. An application extension is not required to be defined at SONiC build.
FYI, dhpc-relay and macsec are not the only application extensions we have. There are some which are not even in SONiC source tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extended version of sonic extension is the TPCM where it need not comply with sonic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 So I just suggest to change the wording:
So it is not necesary to pre define these docker applications in SONiC build
to
So it is not necesary to for these docker applications to be SONiC compliant
since the first one is not true for regular (non-TPC) dockers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken care
- A Docker image that does not contain a manifest file will be considered a Third-Party Container (TPC). | ||
- At the end of the TPC installation using sonic-package-manager install, if no manifest file is found in the image and the --name option is used, a new manifest file with the specified name is created. | ||
- The user can also create a custom local manifest file to be used during installation by specifying the "--use-local-manifest" option along with a custom name using the "--name" option in "sonic-package-manager install" command. | ||
- The custom local manifest file can be created using "sonic-package-manager manifests create" command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a manifest
CLI tree? You could just pass in a JSON file to install
command and then updating manifest is just an upgrade with a different manifest file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating manifest file for each TPC docker is tedious task and customizing parameter is also difficult for a user.
With the manifest cli tree, it can be easily created, viewed. End user need not worry about TPC container creation.
"before": [] | ||
}, | ||
"syslog": { | ||
"support-rate-limit": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it can be set to true for TPC. support-rate-limit
requires having supervisor managed rsyslogd inside container. This is very specific to SONiC containers. It should be set to false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Ok, could you please update the doc to set it to false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken care
"memory": "512M", | ||
"cpu-quota": "20000", | ||
"cpu-period": "100000", | ||
"command": "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the command
? How do you translate it for docker create
command? Is it --entrypoint
? If so, I suggest to rename this field to entrypoint
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"command" attribute refers to the container startup arguments. As suggested, the attribute name could be updated accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Ok, btw you may want to update manifest reference to include new attributes - https://github.com/sonic-net/SONiC/blob/82ca34c8548a065b5274ba9fdee6df93037215ea/doc/sonic-application-extension/sonic-application-extension-guide.md#manifest-reference
|
||
- The local manifest file can be updated using "sonic-package-manager manifests update" command | ||
|
||
- The 'command' attribute is used to update the docker startup command arguments. For example, setting "command":"--path.rootfs=/host" would configure the host filesystem as the root filesystem path for the container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand how the command to run inside container mounts the root filesystem to container. Please clarify this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just an example, by specifying "command":"--path.rootfs=/host", you would configure the container to use the host filesystem as the root filesystem path. This adjustment can impact the way the container interacts with and accesses files and directories during its runtime.
As agreed, this field will be updated to "entrypoint"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Got it, just an example
admin@sonic:~$ sudo sonic-package-manager install --from-tarball-url https://tpc.local-server.com/home/tpcs/httpd.tar.gz --name=my_url_httpd | ||
``` | ||
|
||
Install from scp: | ||
``` | ||
admin@sonic:~$ sudo sonic-package-manager install --from-tarball-scp /home/tpcs/httpd.tar.gz --host 10.171.112.156 --username user1 --password pass123 --name my_scp_httpd | ||
``` | ||
|
||
Install from sftp: | ||
``` | ||
admin@sonic:~$ sudo sonic-package-manager install --from-tarball-sftp /home/tpcs/httpd.tar.gz --host 10.171.112.156 --username user1 --password pass123 --name my_sftp_httpd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not deducing transport protocol from URL and having just --from-tarball? E.g curl
can work with http(s),scp,sfpt deduced from input URL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Could you please update the doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken care
###### Examples | ||
|
||
``` | ||
admin@sonic:~$ sudo sonic-package-manager manifests create my_node_exporter --tpcm --container memory 50m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ensure that for non TPC containers, users cannot modify manifest. I don't see why it might be useful for the user. Container configuration like resource limits needs to be outside the manifest as I suggested in previous comment. Manifest is an immutable property of an image, chaning it might break the extension or the system and it would be hard to create any validation mechanism for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will ensure this is allowed only for TPC containers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could have an option to indicate whether manifest file update allowed or not during image creation, by default, we cant modify the manifest file but when we create the image for TPC container, we could set it to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Please update the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.
|
||
### Requirements | ||
|
||
These open TPCs help extend SONiC capabilities, and thus the following requirements are outlined to integrate them seamlessly into SONiC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per discussion online, could you please summarize what are the benefits of preparing docker image on the box instead of preparing image outside the box?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TPCM is the extended version of AEF(application extension framework) which will create manifest file and associate with TPC and manipulate the container resources through manifest file.
To summarize the benefits of preparing docker image on the box instead of preparing image outside the box.
- Leveraging existing TPC Docker images that lack a manifest file.
- Eliminating the need for manual steps to modify Docker images, such as unpacking, appending the manifest file, and repackaging.
- Facilitating seamless upgrade scenarios, bypassing the necessity of repackaging Docker images with the manifest file. Moreover, the updated manifest file, with resource adjustments, can be used for impending Docker image upgrades.
- At a per-device level, permitting users to supply manifest files for TPCs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 Thanks, I see one advantage is that the user does not need to have a registry to place his re-built standard docker images and can pull them from a well known existing docker registries.
@stepanblyschak @lguohan @venkatmahalingam @stephenxs @adyeung @Kalimuthu-Velappan @babukr Could you please help review the answers for the comments provided. Please let us know if we could have a meeting to go over. |
- CPU | ||
- System Memory | ||
|
||
These limits can be configured during the TPC installation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, resource limits are good options which need not be restricted only for TPC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sg893052 @venkatmahalingam Agree, shouldn't be restricted for TPC. In fact, resource limits were discussed to be added to regular app.ext when it first was introduced but this functionality was delayed since we had no use for configuring limits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.
--from-tarball-sftp Install using the tarball from SFTP source | ||
--from-tarball-url Install using the tarball from HTTPS source | ||
--use-local-manifest Use locally created manifest file | ||
--name custom name for tpcm package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, please avoid tpcm name usage in the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
Uninstallation test case: | ||
- Verify that TPCM packages can be uninstalled using the sonic package manager without any issues. | ||
|
||
Upgrade test case: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is SONiC-2-SONiC upgrade considered? What will happen with TPC extension if we upgrade base SONiC image to new version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the TC
@sg893052 The doc looks good to me, please make sure new manifest attributes are included in - https://github.com/sonic-net/SONiC/blob/82ca34c8548a065b5274ba9fdee6df93037215ea/doc/sonic-application-extension/sonic-application-extension-guide.md#manifest-reference. |
@stepanblyschak pls review and signoff if you are satisfied with the change |
moved to 202405 release |
All prior review comments addressed. If there are no further comments, this will go merge by end of the week. |