Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker: cache cargo artifacts to speed up builds #92

Closed
wants to merge 54 commits into from

Conversation

miraclx
Copy link
Contributor

@miraclx miraclx commented Apr 21, 2023

This PR is a culmination of a couple of days of research. This extends the Dockerfile, adding an intermediate stage that exports mpc-recovery's target folder, as well as the installed binaries from /usr/local/cargo the registry cache, and index, followed with other accompanying conf files from the intermediate Docker build container.

To take advantage of this process, the build process will be two-fold.

First, build the same as always:

docker build . -t near/mpc-recovery:latest

Nothing different here.. Afterward, we need to export the cargo artifact using the export-artifacts build stage.

docker build . --target export-artifacts -o - > target/docker-cache.tar

Now, let's compress that into a tgz which our Dockerfile expects (this helps get smaller cache sizes for our Docker layers)

gzip target/docker-cache.tar > target/docker-cache.tgz

Alternatively, this can be shrunk to one step..

docker build . --target export-artifacts -o - | gzip > target/docker-cache.tar.gz

# still gotta rename it (Dockerfile expects target/docker-cache.tgz)
mv target/docker-cache.tar.gz target/docker-cache.tgz

Subsequent runs of the original build step would now import the docker cache to seed the target & cargo folders, saving us the cost of updating the crates.io index, downloading deps, and recompiling them every time.

The integration-tests CI workflow is extended to merge these into the CI runtime environment.

Which, in theory, should play nicely with Swatinem/rust-cache's cleanup & cache push logic. Allowing us to recover these caches in subsequent runs of the action.


Test of efficacy: cargo runs in Docker on CI should never compile anything other than mpc-recovery if all that has changed is mpc-recovery code.

Locally, this results in build times under a minute. Especially since the export step only needs to happen occasionally.


Justfile

This patch introduces a Justfile (see Just), much like Makefile is a command runner which comes with predefined "recipes" that aid this process.

Command Description
just docker build * Builds the docker image, conditionally updating the build cache when it detects it needs to
just docker run * Run the mpc-recovery binary inside a docker container
just docker export Force an export of the cargo cache from the most-recently built docker image
just run integration-tests Run all the integration tests (but no unit tests)
just run unit-tests Run all the unit tests (but no integration tests)
just test Run all the tests (including integration tests)
just docker cache-relevance Check if the cargo build artifacts have changed since the last docker build to necessitate a cache update

*: For docker build and docker run, the docker image name can be customized with the following format:

just image=near/mpc-recovery docker build and just image=near/mpc-recovery docker run.

@miraclx
Copy link
Contributor Author

miraclx commented Apr 28, 2023

So far, this checks out. Using pre-cached target artifacts, but in CI it still redownloads crates for some reason. Building for a total of 2m19s. I'll take a deeper look to resolve that. After which we should never have to do any of the index update, crate download, or dep compilation steps if we don't need to.

@miraclx miraclx force-pushed the miraclx/docker-cargo-interop branch from f911a16 to b5d4265 Compare May 2, 2023 05:14
@volovyks
Copy link
Collaborator

@miraclx looks like this PR is a big effort. Do we still want to work on this? Or should I close it?

@miraclx
Copy link
Contributor Author

miraclx commented May 31, 2023

I put this on hold, while I've been off, fixing nearcore's publish pipeline. Finished that yesterday, and will be cutting releases for a couple of crates that depend on its primitives. I've got a couple of moves left to experiment with in this PR. And will be resuming when all that work is done.

The current iteration is "good enough" but the issue is artifact import + export takes a bit of time as well. Making the full runtime no much different than the current state on master.

I'm experimenting with ways to optimize the docker layer caching to avoid including the artifacts, since we're handling these manually.

@itegulov
Copy link
Contributor

The solution in this PR works, but makes the setup even more complicated than it is now. Mira, apprecitate what you have done here (even though you probably can't read this). Let's revisit this at a more stable time.

@itegulov itegulov closed this Jun 20, 2023
@itegulov itegulov deleted the miraclx/docker-cargo-interop branch July 20, 2023 05:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants