Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge two experimental_local_disk_cache options to one #4874

Closed

Conversation

davido
Copy link
Contributor

@davido davido commented Mar 19, 2018

Fixes: #4870.

Test Plan:

$ bazel test src/test/shell/bazel:local_action_cache_test

@davido davido force-pushed the improve-experimental-local-disk-cache branch 3 times, most recently from 032c8e8 to fffe15b Compare March 20, 2018 07:21
@buchgr buchgr self-assigned this Mar 20, 2018
Copy link
Contributor

@buchgr buchgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey David,

thanks for this PR. Looks good. Just some comments.

@@ -26,6 +26,7 @@ make builds significantly faster.
* [Read only from the remote cache](#read-only-from-the-remote-cache)
* [Exclude specific targets from using the remote cache](#exclude-specific-targets-from-using-the-remote-cache)
* [Delete content from the remote cache](#delete-content-from-the-remote-cache)
* [Local action cache](#local-action-cache)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just call this feature "Disk Cache". I think "local action cache" is confusing because it's not only caching actions but also the output and I think it's easy to confuse with Bazel's actual action cache.

So what do you think about calling this "Disk Cache" and the flag --disk_cache?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what do you think about moving this after "External Links" and before "Bazel remote execution (in development)"?

@@ -27,7 +27,7 @@ function test_local_action_cache() {
local execution_file="${TEST_TMPDIR}/run.log"
local input_file="foo.in"
local output_file="bazel-genfiles/foo.txt"
local flags="--experimental_local_disk_cache_path=$cache --experimental_local_disk_cache"
local flags="--local_disk_cache=$cache"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also rename this test to disk_cache_test.sh?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


@Option(
name = "experimental_local_disk_cache_path",
name = "local_disk_cache",
defaultValue = "null",
category = "remote",
documentationCategory = OptionDocumentationCategory.UNCATEGORIZED,
effectTags = {OptionEffectTag.UNKNOWN},
converter = OptionsUtils.PathFragmentConverter.class,
help = "A file path to a local disk cache."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"A path to a directory where Bazel can read and write actions and action outputs. If the directory does not exist, it will be created."?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -49,7 +49,7 @@ public static SimpleBlobStore createRest(RemoteOptions options, Credentials cred
public static SimpleBlobStore createLocalDisk(RemoteOptions options, Path workingDirectory)
throws IOException {
return new OnDiskBlobStore(
workingDirectory.getRelative(checkNotNull(options.experimentalLocalDiskCachePath)));
workingDirectory.getRelative(checkNotNull(options.localDiskCache)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's a bit out of scope of this PR, but can you please also fix that Bazel will create the directory if it does not exist?

@@ -306,6 +307,31 @@ You may want to delete content from the cache to:
* Create a clean cache after a cache was poisoned
* Reduce the amount of storage used by deleting old outputs

## Local action cache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disk Cache?

@@ -306,6 +307,31 @@ You may want to delete content from the cache to:
* Create a clean cache after a cache was poisoned
* Reduce the amount of storage used by deleting old outputs

## Local action cache

Bazel supports directory based action cache. It may be very useful, when
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rephrased your words a bit to fit in better with the rest of the document:

Bazel can use a directory on the filesystem as a remote cache. This is
useful for sharing build artifacts when switching branches and/or working
on multiple workspaces of the same project i.e. multiple checkouts. Bazel
does not garbage collect the directory and thus one might want to set up a
cron job or similar to periodically clean up the directory. The disk cache
can be enabled with the following flags.

build --experimental_remote_spawn_cache
build --disk_cache=/path/to/disk/cache

Additionally, one can pass a user specific path to the --disk_cache flag
via the ~ character. Bazel will replace this with the current user's home
directory. This comes in handy when one wants to enable the disk cache for
all developers of a project via the project's checked in .bazelrc file.

```

To have cache hits across different workspaces, additional option should be
used to prevent Bazel to forward `PATH`, `LD_LIBRARY_PATH`, and `TMPDIR` env
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this important for different workspaces? These environment variables should be the same for one machine and user? Are you thinking about the case where multiple users use the same cache?

@davido davido force-pushed the improve-experimental-local-disk-cache branch from fffe15b to 2dd6374 Compare March 22, 2018 22:38
Fixes: bazelbuild#4870.

Test Plan:

  $ bazel test src/test/shell/bazel:build_cache_test
@davido davido force-pushed the improve-experimental-local-disk-cache branch from 2dd6374 to 10ba35f Compare March 22, 2018 22:43
Copy link
Contributor Author

@davido davido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed all your comments. I renamed the option to just build_cache but I'm, also fine with just disk_cache. I havn't created the cache directory yet. This is still to be done.


@Option(
name = "experimental_local_disk_cache_path",
name = "local_disk_cache",
defaultValue = "null",
category = "remote",
documentationCategory = OptionDocumentationCategory.UNCATEGORIZED,
effectTags = {OptionEffectTag.UNKNOWN},
converter = OptionsUtils.PathFragmentConverter.class,
help = "A file path to a local disk cache."
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -27,7 +27,7 @@ function test_local_action_cache() {
local execution_file="${TEST_TMPDIR}/run.log"
local input_file="foo.in"
local output_file="bazel-genfiles/foo.txt"
local flags="--experimental_local_disk_cache_path=$cache --experimental_local_disk_cache"
local flags="--local_disk_cache=$cache"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@davido
Copy link
Contributor Author

davido commented Mar 22, 2018

I probably shouldn't use force push. I'm a gerrit guy and don't have much experience with GH workflow. Sorry for that.

all developers of a project via the project's checked in `.bazelrc` file.

To have cache hits across different workspaces, additional option should be
used:
Copy link
Contributor Author

@davido davido Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the listing of the env. variables. The use case I'm after is: the same user (no multiple uses involved) has different clones of the same project, say:

  • /home/davido/projects/gerrit
  • /home/davido/projects/gerrit2

After building in gerrit directory, the cache is populated. When I build in gerrit2 directory, without --experimental_strict_action_env I experienced cache misses. I havn't re-checked whether this is still the case, but @ulfjack recommended to use this option in this comment: #3042 (comment) and pointed to the discussion in context of this issue: #2574.

I should have probably said: "To have cache hits across different clones of the same project..."?

@davido
Copy link
Contributor Author

davido commented Mar 23, 2018

@buchgr Just to let you know, that I'm on vacation now and only will be able to address your new comments when I'm back. This is also the reason, why I havn't adjusted this PR to check and create cache directory (but this might be better done in a new PR?).

@buchgr
Copy link
Contributor

buchgr commented Mar 23, 2018

@davido thanks! please enjoy your vacation and don't look at Github :-). I can take care of the PR at this point! Thanks!

@davido
Copy link
Contributor Author

davido commented Apr 10, 2018

@buchgr What is the status here? There are three open questions:

  1. name of the option. I changed your suggestion to build-cache. Is it OK?
  2. create directory if it doesn't exist (possibly the same PR)
  3. add a --local_disk_cache_size=SIZE option to specify an upper bound on the folder. Definitively not in this PR.

@buchgr
Copy link
Contributor

buchgr commented May 2, 2018

This has been merged 4ee7f11 Thank you @davido

@buchgr buchgr closed this May 2, 2018
openstack-gerrit pushed a commit to openstack-infra/gerrit that referenced this pull request May 30, 2018
Since [1] two action cache option were merged, and the experimental
prefix was removed. Also update the section about repository cache.
This cache is now activated per default and we can only change the
location of that cache.

[1] bazelbuild/bazel#4874

Change-Id: Iec4557a28987665182e9fe3f476d035421b751fe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants