-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*_binary args get splitted #12313
Comments
extra info $ bazel build :main
$ bazel run --shell_executable=$(pwd)/bazel-bin/main :main2 -- "d e"
1: -c
2: SOME_PATH/execroot/__main__/bazel-out/k8-fastbuild/bin/main2 a b c 'd e' also
|
Looks like the issue is |
Not sure who owns RunfilesSupport, but this looks like a real logic bug. |
To my reading, https://docs.bazel.build/versions/master/be/common-definitions.html#binary.args specifically says that |
May be just a documentation issue then
while
Intuitively, when you supply array of strings, you don''t expect split and combine the lists behavior, also sh-tokenization is said to apply to string arguments, not to lists of string. |
If I recall correctly, this has come up a few times before. I agree that the behavior should be changed to avoid shell tokenization; however, this is a difficult transition. It's not enough to have a flag because there's no way to write args in a way that is compatible with and without shell tokenization. |
One way to migrate is
args = ["--foo"] + tokenize(something) + ["abc"]
# or
args = ["--foo"] + tokenize(something1, something2) + ["abc"] which would work regardless of flag given that there are no splits in free args |
We also probably need builtin quote(string_with_spaces_or_quotes): string -> string args = ["--foo", "abc", quote(" x "), quote("'"), quote("\'x\'")] + tokenize("a \"b b\"") + tokenize("c d", "e f")
# meaning
argv = ["--foo", "abc", " x ", "'", "\'x\'", "a", "b b", "c", "d", "e", "f"] and rejected arguments would be args = ["a b", " x ", "\"x\"", "\'y\'", "\"x y\"", "a \"b\" c" ] + tokenize("a \" b") Note that currently unquoting seems to do funny and unpredictable things args = ["a", " \"b c\" ", " \"\"b\" c\" ", "'a ' ", "'d'", "\"e\"", "\"\"x\"\"", "\"'y'\"", "''\"''\"\"\"z\"\"\"''\"''", "''zz''", "'''xx'''"] results in
|
I'll do some thinking about this. To me, a list of args should be based 1:1 as the arg list of the called process. The whole concept of shell tokanization is wrong. My hope is that we can devise a mechanism so that binaries can indicate that they parse args correctly, and thus no tokenization is needed. What you list in args is what you get. |
@dmivankov, if I understand correctly, you're suggesting to add a flag that changes how Bazel parses args and simultaneously changes how You didn't say that second part explicitly. If we don't change how I think the most straightforward way to do this is to add a tag |
tokenize-args or even just argv=[..] that is never tokenized are also valid options. With tokenize"e we'd either need them to produce a tagged string/list of strings that indicates if it is wrapped(& into which of them) or not, or indeed make them behave differently depending on the flag. Technically we could also
|
As another data point, this is also present in the It was added ostensibly for
|
@aiuto how should we categorize this issue? |
I don't know. |
Executable Starlark rules can use the new `arguments` parameter on `RunEnvironmentInfo` to specify the arguments that Bazel should pass on the command line with `test` or `run`. If set to a non-`None` value, this parameter overrides the value of the `args` attribute that is implicitly defined for all rules. This allows Starlark rules to implement their own version of this attribute which isn't bound to its proprietary processing (data label expansion and tokenization). Along the way, this commit adds test coverage and documentation for the interplay between `RunEnvironmentInfo`'s `environment` and `--test_env`. The value of the `arguments` field of `RunEnvironmentInfo` is intentionally not exposed to Starlark yet: It is not clear how these arguments should be represented and whether rules relying on the magic `args` attribute should also provide this field. Fixes bazelbuild#16076 Work towards bazelbuild#12313
Executable Starlark rules can use the new `arguments` parameter on `RunEnvironmentInfo` to specify the arguments that Bazel should pass on the command line with `test` or `run`. If set to a non-`None` value, this parameter overrides the value of the `args` attribute that is implicitly defined for all rules. This allows Starlark rules to implement their own version of this attribute which isn't bound to its proprietary processing (data label expansion and tokenization). Along the way, this commit adds test coverage and documentation for the interplay between `RunEnvironmentInfo`'s `environment` and `--test_env`. The value of the `arguments` field of `RunEnvironmentInfo` is intentionally not exposed to Starlark yet: It is not clear how these arguments should be represented and whether rules relying on the magic `args` attribute should also provide this field. Fixes bazelbuild#16076 Work towards bazelbuild#12313
Executable Starlark rules can use the new `arguments` parameter on `RunEnvironmentInfo` to specify the arguments that Bazel should pass on the command line with `test` or `run`. If set to a non-`None` value, this parameter overrides the value of the `args` attribute that is implicitly defined for all rules. This allows Starlark rules to implement their own version of this attribute which isn't bound to its proprietary processing (data label expansion and tokenization). Along the way, this commit adds test coverage and documentation for the interplay between `RunEnvironmentInfo`'s `environment` and `--test_env`. The value of the `arguments` field of `RunEnvironmentInfo` is intentionally not exposed to Starlark yet: It is not clear how these arguments should be represented and whether rules relying on the magic `args` attribute should also provide this field. RELNOTES: Executable starlark rules can use the `arguments` parameter of `RunEnvironmentInfo` to specify their command-line arguments with `bazel run` and `bazel test`. Fixes bazelbuild#16076 Work towards bazelbuild#12313
I think the default implementation should be the obvious one. Parameters in the list should be passed to the binary without any changes. Doing this might break others who had to workaround this. So anybody fixing the implementation should also put it behind an incompatible flag. |
+1. But the obvious thing is that args should be passed as is. Going back to the examples, main2 should print
|
What do you think should be the behavior of |
I think all args in a list should be passed as single strings and never retokenized by blaze. So, if there are paths in |
@aiuto I agree, but what about For unrelated reasons, I have thought about rewriting the |
TBH. I'm not wrapping my head around that use case right now. That is, I don't have a mental model of when to use |
The primary use case appears to be complex scripts in genrules, e.g. bazel/scripts/packages/debian/BUILD Lines 141 to 151 in df42879
Ideally these would be converted to proper binaries with command-line arguments and custom actions using e.g. |
Based on the discussion so far and ignoring
|
Maybe in addition to the Or maybe |
I recently wrote a starlark rule that treats args without context: pytest which requires a "main" wrapper calling pytest.main()
Caveat: I'm a starlark noob, so I'm sure that there are better ways (i.e. with an Args object) to do this. Yes, this means that some our_py_test(args=) usage looks funny, but it's correct if you think of python subprocess.Popen([cmd]) usage. i.e.: our_py_test(args=["-k=a and b"]) Add in the complication that not all cli parsers support (single char flags with equals (i.e. -k=). Preferentially one should switch to the --flag= format which typically does. otherwise you're left with ["-k", "a and b"] which is slightly harder to read, and people not used to strict interpretations of args will be confused:
|
Description of the problem / feature request:
Arguments set by args attribute of *_binary get splitted
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
BUILD:
main.c:
Testing:
$ bazel run :main -- a "b c" 1: a 2: b c $ bazel run :main2 1: a 2: b 3: c $ bazel run :main3 1: a 2: b c $ bazel run :main4 1: a 2: b c
What operating system are you running Bazel on?
Linux/NixOS
What's the output of
bazel info release
?release 3.3.1- (@non-git)
If
bazel info release
returns "development version" or "(@non-git)", tell us how you built Bazel.Comes from NixOS packages
nix-env -iA nixpkgs.bazel
Any other information, logs, or outputs that you want to share?
Luckily
treats ">/dev/null" as string rather than redirection. Could've been an injection vulnerability otherwise
Also tested with recent git version 6bf44ba
The text was updated successfully, but these errors were encountered: