File output inside directory output fails to download from remote cache #15328

Closed
jmillikin opened this issue Apr 24, 2022 · 3 comments
Labels
team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug untriaged

Comments

@jmillikin
Contributor

jmillikin commented Apr 24, 2022

Description of the bug:

Bazel allows an action to generate an output file and an output directory, where the file is a child of the directory. This is useful for actions where the output structure is only partially understood/deterministic.

When using such an output structure with remote execution, Bazel fails to download outputs from the remote CAS. It appears that the downloads list contains duplicate entries for nested files, which breaks the atomic file renaming in RemoteExecutionService.moveOutputsToFinalLocation().
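The failure mode described above can be illustrated outside Bazel. The following is a minimal Python sketch, not Bazel's code. It assumes each download entry is first written to `<path>.tmp` and then renamed into place, and that `hello.txt` is listed twice (once as a declared file, once as a child of the output directory). The helper names `download_to_tmp`, `move_to_final`, and `simulate` are hypothetical.

```python
import os
import tempfile


def download_to_tmp(final_path: str, content: str) -> str:
    """Write content to '<final_path>.tmp' and return the temp path."""
    os.makedirs(os.path.dirname(final_path), exist_ok=True)
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(content)
    return tmp_path


def move_to_final(tmp_path: str) -> None:
    """Rename '<path>.tmp' to '<path>'; raises if the temp file is gone."""
    os.replace(tmp_path, tmp_path[: -len(".tmp")])


def simulate(entries):
    """Download every entry to a temp file, then move each one into place."""
    with tempfile.TemporaryDirectory() as root:
        tmp_paths = [download_to_tmp(os.path.join(root, p), c) for p, c in entries]
        try:
            for tmp in tmp_paths:
                move_to_final(tmp)
        except FileNotFoundError:
            # The second rename of the same temp file finds nothing to move.
            return "failed"
        return "ok"


# hello.txt appears once as a declared file and once as a directory child.
duplicated = [
    ("nested_outputs/hello.txt", "Hello, world!\n"),
    ("nested_outputs/hello.txt", "Hello, world!\n"),
    ("nested_outputs/extra.txt", "additional output\n"),
]
deduplicated = duplicated[1:]
```

Under these assumptions, `simulate(duplicated)` fails on the second rename of `hello.txt.tmp`, which resembles the `FileNotFoundException` in the log below, while `simulate(deduplicated)` succeeds.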

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

# BUILD
load("//:rules.bzl", "nested_outputs")
nested_outputs(name = "nested_outputs")

# rules.bzl
_COMMAND = """
echo 'Hello, world!' > $1/hello.txt
echo 'additional output' > $1/extra.txt
"""

def _nested_outputs(ctx):
    out_dir = ctx.actions.declare_directory(ctx.attr.name)
    out_file = ctx.actions.declare_file(ctx.attr.name + "/hello.txt")

    ctx.actions.run_shell(
        inputs = [],
        outputs = [out_dir, out_file],
        command = _COMMAND,
        arguments = [out_dir.path],
    )
    return DefaultInfo(files = depset([out_dir, out_file]))

nested_outputs = rule(implementation = _nested_outputs)

$ bazel-5.1.0 build --config=remote --verbose_failures //:nested_outputs
INFO: Invocation ID: 379fd243-d030-4a7f-8d3e-0c4173e42877
INFO: Analyzed target //:nested_outputs (1 packages loaded, 1 target configured).
INFO: Found 1 target...
ERROR: /Users/john/src/rexec_file_and_dir/BUILD.bazel:2:15: Action nested_outputs failed: (Exit 34): /private/var/tmp/_bazel_john/ad23c0fba73bd9655514a30e2d1746b4/execroot/__main__/bazel-out/darwin-fastbuild/bin/nested_outputs/hello.txt.tmp (No such file or directory)
java.io.FileNotFoundException: /private/var/tmp/_bazel_john/ad23c0fba73bd9655514a30e2d1746b4/execroot/__main__/bazel-out/darwin-fastbuild/bin/nested_outputs/hello.txt.tmp (No such file or directory)
	at com.google.devtools.build.lib.unix.NativePosixFiles.lstat(Native Method)
	at com.google.devtools.build.lib.unix.UnixFileSystem.statInternal(UnixFileSystem.java:185)
	at com.google.devtools.build.lib.unix.UnixFileSystem.stat(UnixFileSystem.java:174)
	at com.google.devtools.build.lib.vfs.Path.stat(Path.java:319)
	at com.google.devtools.build.lib.vfs.FileSystemUtils.moveFile(FileSystemUtils.java:456)
	at com.google.devtools.build.lib.remote.RemoteExecutionService.moveOutputsToFinalLocation(RemoteExecutionService.java:741)
	at com.google.devtools.build.lib.remote.RemoteExecutionService.downloadOutputs(RemoteExecutionService.java:1096)
[...]

Which operating system are you running Bazel on?

I can reproduce the error on macOS 12.2.1 and Ubuntu 20.04.

What is the output of bazel info release?

release 5.1.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

The discussion in bazelbuild/remote-apis#63 documents that output files are allowed to be the child of an output directory.

Any other information, logs, or outputs that you want to share?

No response

@brentleyjones
Contributor

@bazelbuild/remote-execution

@coeuvre
Member

coeuvre commented Apr 26, 2022

I cannot reproduce the exact error with your example, but I did encounter a different error:

java.io.IOException: Invalid action cache entry a1abdb5e59dc14a2b583ec7105c7be0a230af9a8ef63847a5466d70fe8a0be95: expected output nested_outputs/hello.txt does not exist.
        at com.google.devtools.build.lib.remote.RemoteExecutionService.downloadOutputs(RemoteExecutionService.java:956)
        at com.google.devtools.build.lib.remote.RemoteSpawnRunner.downloadAndFinalizeSpawnResult(RemoteSpawnRunner.java:366)
        at com.google.devtools.build.lib.remote.RemoteSpawnRunner.exec(RemoteSpawnRunner.java:212)
        at com.google.devtools.build.lib.exec.SpawnRunner.execAsync(SpawnRunner.java:289)
        at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:151)
        at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:111)
        at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:47)
        at com.google.devtools.build.lib.exec.SpawnStrategyResolver.beginExecution(SpawnStrategyResolver.java:68)
        at com.google.devtools.build.lib.analysis.actions.SpawnAction.beginExecution(SpawnAction.java:344)
        at com.google.devtools.build.lib.actions.Action.execute(Action.java:133)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$5.execute(SkyframeActionExecutor.java:918)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1085)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1043)
        at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:152)
        at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:91)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:491)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:825)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:318)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:163)
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:589)
        at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)

@coeuvre
Member

coeuvre commented Apr 26, 2022

I guess this depends on how the remote server reports the outputs, i.e. whether an output file is omitted from the ActionResult when it is under an output directory.
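To make the two server behaviours concrete, here is a rough Python sketch. It is an illustration only: plain dicts stand in for the `ActionResult` message, the `children` key is a made-up shortcut for the directory's tree contents, and `reported_paths` and `missing_outputs` are hypothetical helpers, not Bazel functions.

```python
def reported_paths(action_result: dict) -> set:
    """Collect every file path the result covers, directly or via a directory."""
    paths = {f["path"] for f in action_result.get("output_files", [])}
    for d in action_result.get("output_directories", []):
        # Assumption: tree contents are inlined as relative file paths;
        # a real client would resolve them from the directory's tree digest.
        paths.update(d["path"] + "/" + child for child in d["children"])
    return paths


def missing_outputs(expected: list, action_result: dict, lenient: bool) -> list:
    """Return the expected outputs that the result does not account for.

    lenient=False: only entries in output_files count.
    lenient=True: files inside a reported output directory count as well.
    """
    if lenient:
        present = reported_paths(action_result)
    else:
        present = {f["path"] for f in action_result.get("output_files", [])}
    return [p for p in expected if p not in present]


directory = {"path": "nested_outputs", "children": ["hello.txt", "extra.txt"]}

# Server A lists the nested file both on its own and inside the directory.
server_a = {
    "output_files": [{"path": "nested_outputs/hello.txt"}],
    "output_directories": [directory],
}
# Server B omits the nested file because the directory already contains it.
server_b = {"output_files": [], "output_directories": [directory]}

expected = ["nested_outputs/hello.txt"]
```

With a strict check, server B's result looks like it is missing `nested_outputs/hello.txt`, which would be consistent with the "expected output ... does not exist" error above. Server A's result passes the check but lists the same file twice.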

ckolli5 added a commit that referenced this issue May 10, 2022
…5444)

Adds a check to prevent creating multiple download futures for output files that are children of output directories.

Fixes #15328

Closes #15329.

PiperOrigin-RevId: 444542026

Co-authored-by: John Millikin <john@john-millikin.com>
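The idea in the commit message, creating at most one download future per output path, can be sketched in Python as follows. This is not the code from the commit: `fetch` is a stand-in for a CAS download, and `schedule_downloads` is a hypothetical helper that only illustrates the deduplication.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(path: str) -> str:
    """Stand-in for a CAS download; returns the path it 'downloaded'."""
    return path


def schedule_downloads(executor, declared_files, directory_children):
    """Create at most one download future per output path."""
    futures = {}
    for path in list(declared_files) + list(directory_children):
        if path in futures:
            # Already scheduled, e.g. a declared file that is also
            # a child of an output directory.
            continue
        futures[path] = executor.submit(fetch, path)
    return futures


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = schedule_downloads(
        pool,
        declared_files=["nested_outputs/hello.txt"],
        directory_children=["nested_outputs/hello.txt", "nested_outputs/extra.txt"],
    )
    downloaded = sorted(f.result() for f in futures.values())
```

Keying the futures by path means each temporary file is written and renamed exactly once, regardless of how many times the server lists the file.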
meteorcloudy pushed a commit that referenced this issue May 10, 2022