Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Third party container management using the Sonic Application Framework #1286

Merged
merged 1 commit into from
Jan 18, 2024

Conversation

sg893052
Copy link
Contributor

@sg893052 sg893052 commented Mar 15, 2023

Repo PR title State
sonic-utilities TPCM support in Sonic Package Manager
sonic-buildimage Additional changes for TPCM support

@sg893052 sg893052 force-pushed the master branch 3 times, most recently from 681d1be to e8caba9 Compare May 9, 2023 10:15
@zhangyanzhao
Copy link
Collaborator

@zhangyanzhao
Copy link
Collaborator

@zhangyanzhao
Copy link
Collaborator

Microsoft and Nvidia will be the reviewer for this feature.

@prvattem
Copy link
Collaborator

prvattem commented May 9, 2023

@zhangyanzhao Please add me to the review.
Thanks
Phaniraj

@stephenxs
Copy link
Collaborator

@zhangyanzhao can you add @stepanblyschak as a reviewer? it should not be me :)

- Dynamically install TPC from docker registry and docker image from the local file system, SCP, SFTP, URL. These remote download options are introduced now to SONiC AEF.
- Upgrade TPCs from docker registry and docker image from local file system, SCP, SFTP, URL
- Provide default manifest file for docker image without manifest during install.
- Provide runtime install capability to pass various docker startup arguments and parameters through local custom manifest file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have the capability to run the script inside the TPC? e.g We may need to mount the specific config file changes in the host and use it inside TPC or for some dependencies I may need to execute some script before starting services inside the container, do we have an option to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes ,capability to run the script inside TPC is supported

SONiC Application Extension Infrastructure provides the framework to integrate SONiC compatible dockers. However there are many open source third party applications which can be installed on SONiC system, to extend the capabilities, and these typically are standalone or have less interaction with the SONiC system itself. So it is not necesary to pre define these docker applications in SONiC build, and provide their corresponding manifest.json. These TPCs can be installed dynamically on a SONiC device and can be managed.

- This feature enables the installation and management of third-party Docker packages without manifest files through the Sonic Package Manager.
- Also, it manages the TPC without affecting the sonic system resources.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the connectivity for the TPC to the outside world? hope we dont expose the entire host networking to the TPC. What if my requirement is to run the TPC on the mgmt VRF?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TPC with mgmt VRF support can be provided later but currently we don't support it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is not captured in the HLD, please add it in the restrictions/limitations section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included in the Limitations section

- The TPCM support feature will leverage the existing Sonic Package Manager framework.
- Ability to download image tarballs for the Docker packages through SCP, SFTP, and URL before installing them is introduced.
- If a manifest file is not found in the docker image during installation, a default local TPC manifest file is used to install the package.
- A Docker image that does not contain a manifest file will be considered a Third-Party Container (TPC).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should not differentiate like this, wherever possible we should design generically, in this case, App ext can be extended to support docker image without manifest file as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

"show": "",
"clear": ""
},
"tpcm": true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we need this tpcm flag? why cant have the generic design?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can have a flag as "type" instead of "tpcm"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cant we skip type field altogether?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped as suggested

###### Examples

```
admin@sonic:~$ sudo sonic-package-manager manifests create my_node_exporter --tpcm --container memory 50m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we make sure that TPC is not hijacking any of the control package meant for processing by SONiC system? do we have any validation in place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TPC docker wont run in "privileged" mode and runs with the default security settings and won't have escalated privileges.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can skip '--tpcm' option in the command, would be good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped as suggested

Comment on lines +86 to +88
- Specify system resource limits for TPCs to restrict CPU, Memory usage
- Provide update capability to update various TPC configurations like their memory, cpu, dependencies etc. This capability is new to SONiC AEF.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think various system resource limits for container shall be part of the manifest. The manifest, once associated with a docker container, will be the same for all devices, however, some devices have more powerful hardware and these limits may not be relevant for them. Instead, resources limits configuration is up to the user using a particular device. This configuration can be done e.g. through FEATURE table and for all containers, not just TPC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TPC overall memory limit can be restricted to 20% of system memory ( say, 3087MB from 15G system ram).
By default, if individual TPC memory limit is not specified, 20% of TPC overall memory limit can be assigned ( say, 617MB from 3087MB in a 15G system ram).
Idea is that summation of the memory limits configured for all TPCs must not go beyond the overall TPC memory limit set.
This way the resource limits gets adjusted as per the underlying hardware.

Agreed that configuration could be placed in config db.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 Could you please update the document to make memory/cpu limits configuration part of config db?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.


### High-Level Design

SONiC Application Extension Infrastructure provides the framework to integrate SONiC compatible dockers. However there are many open source third party applications which can be installed on SONiC system, to extend the capabilities, and these typically are standalone or have less interaction with the SONiC system itself. So it is not necesary to pre define these docker applications in SONiC build, and provide their corresponding manifest.json. These TPCs can be installed dynamically on a SONiC device and can be managed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it is not necesary to pre define these docker applications in SONiC build,

This is not true for regular application extensions. An application extension is not required to be defined at SONiC build.
FYI, dhpc-relay and macsec are not the only application extensions we have. There are some which are not even in SONiC source tree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extended version of sonic extension is the TPCM where it need not comply with sonic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 So I just suggest to change the wording:

So it is not necesary to pre define these docker applications in SONiC build

to

So it is not necesary to for these docker applications to be SONiC compliant

since the first one is not true for regular (non-TPC) dockers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taken care

- A Docker image that does not contain a manifest file will be considered a Third-Party Container (TPC).
- At the end of the TPC installation using sonic-package-manager install, if no manifest file is found in the image and the --name option is used, a new manifest file with the specified name is created.
- The user can also create a custom local manifest file to be used during installation by specifying the "--use-local-manifest" option along with a custom name using the "--name" option in "sonic-package-manager install" command.
- The custom local manifest file can be created using "sonic-package-manager manifests create" command
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a manifest CLI tree? You could just pass in a JSON file to install command and then updating manifest is just an upgrade with a different manifest file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating manifest file for each TPC docker is tedious task and customizing parameter is also difficult for a user.
With the manifest cli tree, it can be easily created, viewed. End user need not worry about TPC container creation.

"before": []
},
"syslog": {
"support-rate-limit": true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it can be set to true for TPC. support-rate-limit requires having supervisor managed rsyslogd inside container. This is very specific to SONiC containers. It should be set to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 Ok, could you please update the doc to set it to false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taken care

"memory": "512M",
"cpu-quota": "20000",
"cpu-period": "100000",
"command": ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the command? How do you translate it for docker create command? Is it --entrypoint? If so, I suggest to rename this field to entrypoint.

Copy link
Contributor Author

@sg893052 sg893052 May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"command" attribute refers to the container startup arguments. As suggested, the attribute name could be updated accordingly.

Copy link
Contributor

@stepanblyschak stepanblyschak Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


- The local manifest file can be updated using "sonic-package-manager manifests update" command

- The 'command' attribute is used to update the docker startup command arguments. For example, setting "command":"--path.rootfs=/host" would configure the host filesystem as the root filesystem path for the container
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand how the command to run inside container mounts the root filesystem to container. Please clarify this example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just an example, by specifying "command":"--path.rootfs=/host", you would configure the container to use the host filesystem as the root filesystem path. This adjustment can impact the way the container interacts with and accesses files and directories during its runtime.
As agreed, this field will be updated to "entrypoint"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 Got it, just an example

Comment on lines 272 to 282
admin@sonic:~$ sudo sonic-package-manager install --from-tarball-url https://tpc.local-server.com/home/tpcs/httpd.tar.gz --name=my_url_httpd
```

Install from scp:
```
admin@sonic:~$ sudo sonic-package-manager install --from-tarball-scp /home/tpcs/httpd.tar.gz --host 10.171.112.156 --username user1 --password pass123 --name my_scp_httpd
```

Install from sftp:
```
admin@sonic:~$ sudo sonic-package-manager install --from-tarball-sftp /home/tpcs/httpd.tar.gz --host 10.171.112.156 --username user1 --password pass123 --name my_sftp_httpd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not deducing transport protocol from URL and having just --from-tarball? E.g curl can work with http(s),scp,sfpt deduced from input URL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 Could you please update the doc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taken care

###### Examples

```
admin@sonic:~$ sudo sonic-package-manager manifests create my_node_exporter --tpcm --container memory 50m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ensure that for non TPC containers, users cannot modify manifest. I don't see why it might be useful for the user. Container configuration like resource limits needs to be outside the manifest as I suggested in previous comment. Manifest is an immutable property of an image, chaning it might break the extension or the system and it would be hard to create any validation mechanism for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will ensure this is allowed only for TPC containers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could have an option to indicate whether manifest file update allowed or not during image creation, by default, we cant modify the manifest file but when we create the image for TPC container, we could set it to true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 Please update the doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.


### Requirements

These open TPCs help extend SONiC capabilities, and thus the following requirements are outlined to integrate them seamlessly into SONiC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per discussion online, could you please summarize what are the benefits of preparing docker image on the box instead of preparing image outside the box?

Copy link
Contributor Author

@sg893052 sg893052 May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TPCM is the extended version of AEF(application extension framework) which will create manifest file and associate with TPC and manipulate the container resources through manifest file.

To summarize the benefits of preparing docker image on the box instead of preparing image outside the box.

  1. Leveraging existing TPC Docker images that lack a manifest file.
  2. Eliminating the need for manual steps to modify Docker images, such as unpacking, appending the manifest file, and repackaging.
  3. Facilitating seamless upgrade scenarios, bypassing the necessity of repackaging Docker images with the manifest file. Moreover, the updated manifest file, with resource adjustments, can be used for impending Docker image upgrades.
  4. At a per-device level, permitting users to supply manifest files for TPCs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 Thanks, I see one advantage is that the user does not need to have a registry to place his re-built standard docker images and can pull them from a well known existing docker registries.

@sg893052
Copy link
Contributor Author

@stepanblyschak @lguohan @venkatmahalingam @stephenxs @adyeung @Kalimuthu-Velappan @babukr Could you please help review the answers for the comments provided. Please let us know if we could have a meeting to go over.

- CPU
- System Memory

These limits can be configured during the TPC installation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, resource limits are good options which need not be restricted only for TPC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sg893052 @venkatmahalingam Agree, shouldn't be restricted for TPC. In fact, resource limits were discussed to be added to regular app.ext when it first was introduced but this functionality was delayed since we had no use for configuring limits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.

--from-tarball-sftp Install using the tarball from SFTP source
--from-tarball-url Install using the tarball from HTTPS source
--use-local-manifest Use locally created manifest file
--name custom name for tpcm package
Copy link
Collaborator

@venkatmahalingam venkatmahalingam Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, please avoid tpcm name usage in the description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Uninstallation test case:
- Verify that TPCM packages can be uninstalled using the sonic package manager without any issues.

Upgrade test case:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is SONiC-2-SONiC upgrade considered? What will happen with TPC extension if we upgrade base SONiC image to new version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the TC

@stepanblyschak
Copy link
Contributor

@adyeung
Copy link
Collaborator

adyeung commented Dec 7, 2023

@stepanblyschak pls review and signoff if you are satisfied with the change

@zhangyanzhao zhangyanzhao requested review from prvattem and stepanblyschak and removed request for stephenxs December 12, 2023 16:39
@zhangyanzhao
Copy link
Collaborator

moved to 202405 release

@adyeung
Copy link
Collaborator

adyeung commented Jan 18, 2024

All prior review comments addressed. If there are no further comments, this will go merge by end of the week.

@adyeung adyeung merged commit 50a62e0 into sonic-net:master Jan 18, 2024
@adyeung adyeung self-assigned this Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants