Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xargs: stdin should not be merged before replacing #362

Closed
YDX-2147483647 opened this issue Apr 24, 2024 · 5 comments · Fixed by #363
Closed

xargs: stdin should not be merged before replacing #362

YDX-2147483647 opened this issue Apr 24, 2024 · 5 comments · Fixed by #363

Comments

@YDX-2147483647
Copy link
Contributor

YDX-2147483647 commented Apr 24, 2024

Given the following stdin,

abc
def
g

xargs -i echo [{}] should give

[abc]
[def]
[g]

instead of [abc def g].

Test using python

(Ubuntu)

>>> from subprocess import run

# xargs 0.5.0 from uutils
>>> run(['./xargs', '-i', 'echo', '[{}]'], capture_output=True, input=b"abc\ndef\ng").stdout
b'[abc def g]\n'

# xargs 4.8.0 from GNU
>>> run(['/usr/bin/xargs', '-i', 'echo', '[{}]'], capture_output=True, input=b"abc\ndef\ng").stdout
b'[abc]\n[def]\n[g]\n'
Test using shell

(MSYS2 on Windows)

# xargs 0.5.0 from uutils
$ printf 'abc\ndef\ng' | ./xargs -i echo [{}]
[abc def g]

# xargs 4.9.0 from GNU
$ printf 'abc\ndef\ng' | xargs -i echo [{}]
[abc]
[def]
[g]

Platform: unknown-linux-gnu, pc-windows-msvc, other not tested.

Relavant codes

findutils/src/xargs/mod.rs

Lines 407 to 435 in baa09ba

if let Some(replace_str) = &self.options.replace {
// we replace the first instance of the replacement string with
// the extra args, and then replace all instances of the replacement
let replacement = self
.extra_args
.iter()
.map(|s| s.to_string_lossy())
.collect::<Vec<_>>()
.join(" ");
let initial_args: Vec<OsString> = initial_args
.iter()
.map(|arg| {
let arg_str = arg.to_string_lossy();
OsString::from(arg_str.replace(replace_str, &replacement))
})
.collect();
command
.args(&initial_args)
.env_clear()
.envs(&self.options.env);
} else {
// don't do any replacement
command
.args(initial_args)
.args(&self.extra_args)
.env_clear()
.envs(&self.options.env);
};

There's no test on this behaviour.

#[test]
fn xargs_replace() {

Relates-to: #323

@YDX-2147483647 YDX-2147483647 changed the title xargs: stdin is merged before replacing xargs: stdin should not be merged before replacing Apr 24, 2024
@sylvestre
Copy link
Sponsor Contributor

I am curious why you are sharing a python example ? :)

would you like to try to provide a fix?
thanks

@YDX-2147483647
Copy link
Contributor Author

why you are sharing a python example

I thought it was caused by CR/LF or shells, so I want to feed bytes to the executable directly. AFAIK, python is the easiest way.

would you like to try to provide a fix?

I'd like to do so! But I cannot guarantee when, maybe in a month?

@YDX-2147483647
Copy link
Contributor Author

YDX-2147483647 commented Apr 25, 2024

I've inspected it deeper. The issue is harder than I thought. The command is always executed once in current implementation, but it needs to be executed multiple times if -i/--replace or -I is given.

Example:

# xargs (GNU findutils) 4.8.0
$ printf "abc\ndef\ng" | xargs python3 -c 'from sys import argv; print(argv)'
['-c', 'abc', 'def', 'g']
$ printf "abc\ndef\ng" | xargs -i python3 -c 'from sys import argv; print(argv, repr("""{}"""))'
['-c'] 'abc'
['-c'] 'def'
['-c'] 'g'

The following is to be appended to tests/xargs_tests.rs.

    // Multiple lines
    Command::cargo_bin("xargs")
        .expect("found binary")
        .args(["-i=_", "echo", "[_]"])
        .write_stdin("abc\ndef\ng")
        .assert()
        .success()
        .stderr(predicate::str::is_empty())
        .stdout(predicate::str::diff("[abc]\n[def]\n[g]\n"));

    Command::cargo_bin("xargs")
        .expect("found binary")
        .args(["-i", "echo", "{} {} foo"])
        .write_stdin("bar\nbaz")
        .assert()
        .stdout(predicate::str::diff("bar bar foo\nbaz baz foo"));

@YDX-2147483647
Copy link
Contributor Author

YDX-2147483647 commented Apr 28, 2024

Workaround

Pass -n 1 or --max-args 1. (and -d \n if you want)

$ printf 'abc\ndef\ng' | xargs -I _ -n 1 echo [_]
[abc]
[def]
[g]

Conflicting xargs options (GNU Findutils 4.9.0):

The options ‘--max-lines’ (‘-L’, ‘-l’), ‘--replace’ (‘-I’, ‘-i’) and ‘--max-args’ (‘-n’) are mutually exclusive.

The exception to this rule is that the special max-args value 1 is ignored after the ‘--replace’ option and its short-option aliases ‘-I’ and ‘-i’, because it would not actually conflict.

@YDX-2147483647
Copy link
Contributor Author

YDX-2147483647 commented Apr 28, 2024

Hi! I've fixed the issue in normal cases, but if conflict options are given, things get complicated. (I need to change 141 lines)

Currently, we throw a warning and take the last option.

findutils/src/xargs/mod.rs

Lines 930 to 933 in baa09ba

eprintln!(
"WARNING: Both --{} and -L were given; last option will be used",
options::MAX_ARGS,
);

Should I follow this behaviour? Or may I use Arg::conflicts_with in clap and throw an error?

YDX-2147483647 added a commit to YDX-2147483647/findutils that referenced this issue May 23, 2024
sylvestre pushed a commit to YDX-2147483647/findutils that referenced this issue Jun 1, 2024
sylvestre pushed a commit to YDX-2147483647/findutils that referenced this issue Jun 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants