02.Version cache - docker cache build framework #12001

Merged: 1 commit, Dec 2, 2022

Kalimuthu-Velappan
Contributor

During a docker build, host files can be passed into the build through the
docker context. But there is no straightforward way to transfer files from
the docker build back to the host.

This feature provides a workaround for passing the cache contents from the docker
build to the host. It tars the cached content, encodes it as base64, and
writes it to the build log between the special tags '_VCSTART_' and '_VCEND_'.

On the host, slave.mk extracts the cache contents from the log and stores them
in the cache folder. Base64 encoding keeps the binary archive safe to pass
through a text log.
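
As an illustration of the log-based transfer described above, here is a minimal sketch and not the PR's actual code. The function names `emit_cache_to_log` and `extract_cache_from_log` are hypothetical, and the sketch assumes GNU coreutils `base64` and that each marker appears alone on its own log line.

```shell
#!/bin/bash
# Hypothetical helpers (not from the PR) showing the tagged base64 round trip.

# Inside the docker build: print a gzip'd tar of directory $1 to stdout,
# wrapped in the two marker lines so the host can find it in the log.
emit_cache_to_log() {
    echo "_VCSTART_"
    tar -C "$1" -zcf - . | base64
    echo "_VCEND_"
}

# On the host: read build log $1, keep only the lines between the markers,
# decode them and unpack the archive into directory $2.
extract_cache_from_log() {
    mkdir -p "$2"
    awk '/^_VCSTART_$/ {inside=1; next} /^_VCEND_$/ {inside=0} inside' "$1" \
        | base64 -d \
        | tar -C "$2" -zxf -
}
```

For example, `extract_cache_from_log build.log target/vcache` would unpack whatever payload the build printed; both paths here are placeholders.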

Why I did it

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@Kalimuthu-Velappan Kalimuthu-Velappan changed the title Cache infra 02.Version cache - docker cache build framework Sep 8, 2022
@Kalimuthu-Velappan
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@adyeung
Collaborator

adyeung commented Sep 14, 2022

@xumia @liushilongbuaa submitter has taken the time to split the original code PR to smaller submissions for review, pls help take a look

@liushilongbuaa
Contributor

Makefile.cache provides the same feature.

@Kalimuthu-Velappan
Contributor Author

Yes, it is an extension of the DPKG caching framework.

@Kalimuthu-Velappan
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@Kalimuthu-Velappan Kalimuthu-Velappan force-pushed the CACHE_INFRA branch 2 times, most recently from c556ba9 to 789f8e2 Compare October 17, 2022 07:15
@liushilongbuaa
Contributor

  1. I don't know what you want to copy.
  2. From pipeline result, folder target/vcache is only 3KB and cache.tgz is empty.

@Kalimuthu-Velappan
Contributor Author

It is just a framework for copying files from the docker builder to the host. In this patch, nothing has been added to the cache file yet; this PR contains only the framework part of the version cache.

The following patches in this series will add files to it.

#12003
#12004
#12005

and more to come.

@liushilongbuaa
Contributor

@Kalimuthu-Velappan
Contributor Author

Did you try https://docs.docker.com/engine/reference/commandline/build/#custom-build-outputs?

I tried the commands below, but they do not generate the expected output file on the host.

#cat Dockerfile
FROM debian:buster AS build-stage
RUN mkdir -p /test
RUN echo "MY CNT" > /test/myfile.txt

FROM scratch AS export-stage
COPY --from=build-stage /test /

#docker build --no-cache -o out .
#ls
Dockerfile
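
Docker documents custom build outputs (`-o` / `--output`) as a BuildKit feature. Whether that explains the missing `out` directory here is not stated in the thread, but the same experiment can be sketched with BuildKit enabled explicitly. The wrapper name `export_stage_to_host` is hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper (not from the PR): export one build stage to a host directory.
# $1 = build context, $2 = stage name, $3 = host output directory.
export_stage_to_host() {
    # --output/-o is documented as a BuildKit feature, so enable BuildKit explicitly.
    DOCKER_BUILDKIT=1 docker build --no-cache --target "$2" -o "$3" "$1"
}
```

With the Dockerfile above, this would be called as `export_stage_to_host . export-stage out`, followed by `ls out` to check what was exported.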

tar -C ${PKG_CACHE_PATH} --exclude=cache.tgz -zcvf ${PKG_CACHE_FILE_NAME} .
#set +x
if [[ ! ${IMAGENAME} =~ host-image ]]; then
sleep 1;echo -e "\n_VCSTART_"; (cat ${PKG_CACHE_FILE_NAME} | base64); echo -e "_VCEND_\n";sleep 1
Collaborator

@xumia xumia Oct 21, 2022

Is it possible to use the scp command or other RPC APIs to copy the content to the slave container? Copying content through the log is a bit of a workaround.

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xumia
Collaborator

xumia commented Nov 18, 2022

@Kalimuthu-Velappan , the solution is good.

Can we use only two docker build steps? Split the Dockerfile into two parts, Dockerfile1 and Dockerfile2: Dockerfile1 does the normal installation without cleanup, and Dockerfile2 does only the cleanup and can be generated from a common j2 template for all docker builds. We can add a step to copy the content from the image. This avoids depending on DOCKER_BUILDKIT and keeps the copy steps more controllable, outside the Dockerfile.

But I think the solution you provided is good enough.
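
The idea of generating the cleanup Dockerfile from a common template can be sketched as follows. This is only an illustration: it uses a shell heredoc in place of a j2 template, the function name `gen_cleanup_dockerfile` and the stage names are hypothetical, and exporting the `output` stage with `-o` would still rely on BuildKit.

```shell
#!/bin/bash
# Hypothetical generator (not from the PR): prints a three-stage cleanup Dockerfile.
# $1 = image produced by the first docker build, $2 = cache file path inside it.
gen_cleanup_dockerfile() {
    cat <<EOF
FROM $1 AS build

# Stage exported to the host with: --target output -o <dir>
FROM scratch AS output
COPY --from=build $2 /

# Stage tagged as the final image, with the cache file removed
FROM build AS final
RUN rm -f $2
EOF
}
```

For example, `gen_cleanup_dockerfile sonic-image:latest /root/cache.data > Dockerfile.cleanup` writes a file that can be built once with `--target output` and once with `--target final`.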

@Kalimuthu-Velappan
Contributor Author

Kalimuthu-Velappan commented Nov 23, 2022

I split them into two docker files as you suggested. Could you take a look at the sequence and let me know your thoughts?

cat Dockerfile

# Base docker build
FROM debian:buster
RUN dd if=/dev/zero of=/root/cache.data bs=100M count=1
RUN echo $(date)

cat Dockerfile.cleanup

# Base docker build
#FROM sonic-image:latest as build
FROM sonic-image:latest

# Copy the cache data to host
FROM scratch as output
COPY --from=sonic-image:latest /root/cache.data cache.data

# Clean up the cache data
FROM sonic-image:latest as final
RUN rm /root/cache.data

cat build.sh

#!/bin/bash
set -x
rm -rf out; docker rmi sonic-image-tmp:latest sonic-image:latest

docker build  --no-cache --tag sonic-image:latest .;
docker tag sonic-image:latest sonic-image-tmp:latest
DOCKER_BUILDKIT=1 docker build -f Dockerfile.cleanup  --target output -o out .;
DOCKER_BUILDKIT=1 docker build -f Dockerfile.cleanup  --no-cache --target final --tag sonic-image:latest .
docker rmi sonic-image-tmp:latest

find out; docker images |grep "sonic-image.*latest"

./build.sh

  • rm -rf out
  • docker rmi sonic-image-tmp:latest sonic-image:latest
    Untagged: sonic-image:latest
    Deleted: sha256:7fbc4abec8831240c9d1c574285fdd11b5c47a758d9e957a647ec5271bc750b0
    Error: No such image: sonic-image-tmp:latest
  • docker build --no-cache --tag sonic-image:latest .
    Sending build context to Docker daemon 8.192kB
    Step 1/3 : FROM debian:buster
    ---> 4a7a1f401734
    Step 2/3 : RUN dd if=/dev/zero of=/root/cache.data bs=100M count=1
    ---> Running in 0fe5524ba638
    1+0 records in
    1+0 records out
    104857600 bytes (105 MB, 100 MiB) copied, 0.129012 s, 813 MB/s
    Removing intermediate container 0fe5524ba638
    ---> 76fea604fe86
    Step 3/3 : RUN echo $(date)
    ---> Running in 4aa14234d98e
    Wed Nov 23 13:46:58 UTC 2022
    Removing intermediate container 4aa14234d98e
    ---> 07028c1daa48
    Successfully built 07028c1daa48
    Successfully tagged sonic-image:latest
  • docker tag sonic-image:latest sonic-image-tmp:latest
  • DOCKER_BUILDKIT=1
  • docker build -f Dockerfile.cleanup --target output -o out .
    [+] Building 0.6s (5/5) FINISHED
    => [internal] load build definition from Dockerfile.cleanup 0.0s
    => => transferring dockerfile: 377B 0.0s
    => [internal] load .dockerignore 0.0s
    => => transferring context: 2B 0.0s
    => FROM docker.io/library/sonic-image:latest 0.0s
    => => resolve docker.io/library/sonic-image:latest 0.0s
    => CACHED [output 1/1] COPY --from=sonic-image:latest /root/cache.data cache.data 0.0s
    => exporting to client 0.3s
    => => copying files 104.88MB 0.3s
  • DOCKER_BUILDKIT=1
  • docker build -f Dockerfile.cleanup --no-cache --target final --tag sonic-image:latest .
    [+] Building 0.6s (6/6) FINISHED
    => [internal] load build definition from Dockerfile.cleanup 0.0s
    => => transferring dockerfile: 46B 0.0s
    => [internal] load .dockerignore 0.0s
    => => transferring context: 2B 0.0s
    => [internal] load metadata for docker.io/library/sonic-image:latest 0.0s
    => CACHED [final 1/2] FROM docker.io/library/sonic-image:latest 0.0s
    => [final 2/2] RUN rm /root/cache.data 0.6s
    => exporting to image 0.0s
    => => exporting layers 0.0s
    => => writing image sha256:b7337d033d1088956a563ea2b111f467ba015566b3afb9b07b0b5964113699d5 0.0s
    => => naming to docker.io/library/sonic-image:latest 0.0s
  • docker rmi sonic-image-tmp:latest
    Untagged: sonic-image-tmp:latest
    Deleted: sha256:07028c1daa486d4adbcd1fb0ae4ebf634f80f0fba1c204fd766d918e9ccc8a52
    Deleted: sha256:76fea604fe863f93a94fac7b608ac794598769369601995bac5d0d777d6c9e5e
  • find out
    out
    out/cache.data
  • docker images
  • grep 'sonic-image.*latest'
    sonic-image latest b7337d033d10 1 second ago 219MB

@Kalimuthu-Velappan Kalimuthu-Velappan force-pushed the CACHE_INFRA branch 2 times, most recently from 54dfff0 to f51a117 Compare November 26, 2022 17:56
@Kalimuthu-Velappan
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Comment on lines +32 to +29
DISTRO=${DISTRO} apt-get update && apt-get install -y rsync

Collaborator

@xumia xumia Nov 27, 2022

We do not need it (line 32), we do not need to install rsync, right?

@xumia
Collaborator

xumia commented Nov 29, 2022

@Kalimuthu-Velappan , looks good to me, only left one comment, do we need to install rsync?

@Kalimuthu-Velappan
Contributor Author

rsync is required for Go module sync and is used by other PRs. I will find a better way to remove it and will work on it.
We can keep it for now; I will address it as part of another PR.

@xumia
Collaborator

xumia commented Dec 1, 2022

@Kalimuthu-Velappan , could you please fix the code conflicts?

During docker build, host files can be passed to the docker build through
docker context files. But there is no straightforward way to transfer
the files from docker build to host.

This feature provides a workaround for passing the cache contents from the docker
build to the host. It uses a multi-stage Dockerfile to copy the cache
content, clean up the temporary files, and create the final sonic docker image.
@xumia xumia merged commit aaeafa8 into sonic-net:master Dec 2, 2022
[ -d $BUILD_VERSION_PATH ] && [ ! -z "$(ls -A $BUILD_VERSION_PATH)" ] && cp -rf $BUILD_VERSION_PATH/* $POST_VERSION_PATH
rm -rf $BUILD_VERSION_PATH/*
#Save the cache file for exporting it to host.
tar -C ${PKG_CACHE_PATH} --exclude=cache.tgz -zcvf /cache.tgz .
Contributor

@Kalimuthu-Velappan

BTW, what do you think about using pigz here?
In my tests I get a big improvement in time when using it instead of gzip:
#12825

Contributor Author

@Kalimuthu-Velappan Kalimuthu-Velappan Dec 12, 2022

Thanks for your suggestion.
As you said, I have already used this in the binary optimization PR.
#10718
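
The pigz suggestion above can be sketched as follows. pigz is a parallel gzip implementation whose output stays gzip-compatible, so the archive can still be read with `tar -z`. The function name `pack_cache` is hypothetical and this is not the code from either referenced PR; it assumes GNU tar.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): archive a cache directory, preferring pigz.
# $1 = cache directory, $2 = output archive path.
pack_cache() {
    # Fall back to gzip when pigz is not installed.
    if command -v pigz >/dev/null 2>&1; then
        compressor=pigz
    else
        compressor=gzip
    fi
    # --exclude keeps a previous cache.tgz inside the directory out of the new archive.
    tar -C "$1" --exclude=cache.tgz --use-compress-program="$compressor" -cf "$2" .
}
```

A call such as `pack_cache "${PKG_CACHE_PATH}" /cache.tgz` would mirror the tar line under review while choosing the compressor at run time.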

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Feb 10, 2023
@mssonicbld
Collaborator

Cherry-pick PR to 202211: #13771

mssonicbld pushed a commit that referenced this pull request Feb 10, 2023
8 participants