Find the program using PATHEXT #37381

Closed

Conversation

@afiune afiune commented Oct 24, 2016

Windows relies on path extensions to resolve commands; the extensions
are listed in the PATHEXT environment variable. We are adding a new
function called Command::find_program() that returns the Path of
the program found using that variable.
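
As a rough illustration of the lookup described above, the sketch below expands a program name into the candidate file names implied by a PATHEXT-style string. The helper name `pathext_candidates` and its signature are assumptions made for this example; it is not the function added in this PR.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not from the PR): expand `program` inside `dir` into
/// one candidate path per extension listed in a PATHEXT-style string.
fn pathext_candidates(dir: &Path, program: &str, pathext: &str) -> Vec<PathBuf> {
    pathext
        .split(';')
        // Entries look like ".EXE"; drop the leading dot and skip empty entries.
        .map(|ext| ext.trim().trim_start_matches('.'))
        .filter(|ext| !ext.is_empty())
        // Append rather than replace, so a name like "foo.tar" keeps its ".tar".
        .map(|ext| dir.join(format!("{program}.{ext}")))
        .collect()
}

fn main() {
    // Example PATHEXT value; real systems may list more extensions.
    for candidate in pathext_candidates(Path::new("C:/tools"), "bin", ".COM;.EXE;.BAT;.CMD") {
        println!("{}", candidate.display());
    }
}
```

The candidates are returned in PATHEXT order, so a caller that probes the filesystem would pick the first extension that exists.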

Closes #37380

Signed-off-by: Salim Afiune afiune@chef.io

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @aturon (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

Windows relies on path extensions to resolve commands; the extensions
are listed in the `PATHEXT` environment variable. We are adding a new
function called `Command::find_program()` that returns the `Path` of
the program found using that variable.

Closes rust-lang#37380

Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune afiune force-pushed the afiune/37380/find_program_using_pathext branch from 3608d00 to 40ed572 Compare October 25, 2016 01:51
@afiune afiune changed the title from "WIP: Find the program using PATHEXT" to "Find the program using PATHEXT" Oct 25, 2016
Member

@alexcrichton alexcrichton left a comment

Fascinating! Learn something new every day...

    if let Some(exts) = self.env.get("PATHEXT") {
        for ext in split_paths(&exts) {
            let ext_str = pathext.to_str().unwrap().trim_matches('.');
            let path = path.join(self.program.to_str().unwrap())
Member

Here I think you just need to call with_extension as otherwise this is joining self.program onto the root path twice I think.
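
To make the difference concrete, here is a small sketch comparing `Path::join` with `Path::with_extension`. The paths are made up for the example; the only assumption carried over from the review is that `path` already ends in the program name.

```rust
use std::path::Path;

fn main() {
    // `path` already ends in the program name, as in the reviewed loop.
    let path = Path::new("C:/program/bin");

    // Joining the program name again appends a second component.
    let joined = path.join("bin");
    assert_eq!(joined, Path::new("C:/program/bin/bin"));

    // `with_extension` only changes the extension of the last component.
    let with_ext = path.with_extension("bat");
    assert_eq!(with_ext, Path::new("C:/program/bin.bat"));

    // Caveat: an existing extension is replaced, not appended to.
    assert_eq!(Path::new("tool.v2").with_extension("bat"), Path::new("tool.bat"));
}
```

The last assertion is worth keeping in mind for program names that already contain a dot.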

    } else {
        // Windows relies on path extensions to resolve commands.
        // Path extensions are found in the PATHEXT environment variable.
        if let Some(exts) = self.env.get("PATHEXT") {
Member

Perhaps this lookup could be hoisted out?

@afiune
Author

afiune commented Oct 25, 2016

@alexcrichton Thank you so much for your feedback! I will have plenty of time (on an airplane) tomorrow morning so I'll look at this and do some modifications. Also I will add some tests! 😄

Salim Afiune added 2 commits October 25, 2016 08:25
This new simple fn looks up a specific variable within
`self.env`

Signed-off-by: Salim Afiune <afiune@chef.io>
Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune
Author

afiune commented Oct 26, 2016

@alexcrichton I have added the fn env_lookup() as you recommended, what do you think? Additionally I can add tests for fn find_program() but I might need to add/mock some dummy files; Is there a place I can add fixtures to the repo? I was planning to add tests/fixtures/bin but again just double checking first. 😄 ( I don't want to add them if this is not allowed )

    } else {
        // Windows relies on path extensions to resolve commands.
        // Path extensions are found in the PATHEXT environment variable.
        if let Some(exts) = env_lookup(self.env.as_ref(), "PATHEXT") {
Member

Oh here self.env.as_ref() is already bound above as env (e.g. we know it's Some already). You can change this to just env.get, but I was thinking that we'd just want to hoist this out of the loop here so it's only executed once instead of once-per-path.

Author

I'm with you, I didn't get it the first time. Changed already! Thanks

Author

I think I will delete the env_lookup() fn though..

Member

@alexcrichton alexcrichton left a comment

Also yeah for adding a test the best place to add that would likely be src/test/run-pass and then add a new folder in there with a .rs file that's the test and a .foo file fixture which you can read as well.

        // Path extensions are found in the PATHEXT environment variable.
        if let Some(exts) = env_lookup(self.env.as_ref(), "PATHEXT") {
            for ext in split_paths(&exts) {
                let ext_str = pathext.to_str().unwrap().trim_matches('.');
Member

We should avoid an unwrap() here and just skip anything that can't be turned into a string

Author

Sounds good to me! What about using to_string_lossy() instead? Something like this:

                            for ext in split_paths(&exts) {
                                let ext_str = ext.to_string_lossy();
                                let path = path.with_extension(
                                                ext_str.trim_matches('.')
                                );
                                if fs::metadata(&path).is_ok() {
                                    return Some(path.into_os_string())
                                }
                            }

Instead of the `env_lookup()` fn we should just use the `HashMap::get()`

Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune
Author

afiune commented Oct 26, 2016

I'm working on the tests, @alexcrichton. Thank you for all your feedback. Let me know if you have more comments on the code. 👍

@alexcrichton
Member

Looks good to me! Feel free to just ping me when a test is added and I'll r+

    let ext_str = ext.to_string_lossy();
    let path = path.with_extension(
        ext_str.trim_matches('.')
    );
Member

Perhaps this could fit on one line?

    // Path extensions are found in the PATHEXT environment variable.
    if let Some(exts) = env_pathext {
        for ext in split_paths(&exts) {
            let ext_str = ext.to_string_lossy();
Member

I think we'll want to call to_str here to not misinterpret OS strings (e.g. lossy may change the meaning), but you can just skip anything that's None. You can probably do:

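// `exts` is the PATHEXT value; entries that are not valid UTF-8 are skipped.
// Mapping to an owned String avoids borrowing from the temporary PathBuf.
for ext in split_paths(&exts).filter_map(|e| e.to_str().map(str::to_owned)) {
    // ...
}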
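
A minimal sketch of the skip-instead-of-lossy approach suggested here. The helper name `utf8_extensions` is hypothetical and the input is a plain `Vec<OsString>` rather than the PR's `split_paths` iterator, so that the example stands alone.

```rust
use std::ffi::OsString;

/// Hypothetical helper: keep only the entries that are valid UTF-8,
/// returning them as owned strings with any leading dot removed.
fn utf8_extensions(exts: Vec<OsString>) -> Vec<String> {
    exts.into_iter()
        // `to_str` returns None for non-UTF-8 data, so such entries are skipped
        // instead of being rewritten with U+FFFD as `to_string_lossy` would do.
        .filter_map(|e| e.to_str().map(|s| s.trim_start_matches('.').to_owned()))
        .collect()
}

fn main() {
    let exts = vec![OsString::from(".EXE"), OsString::from(".BAT")];
    println!("{:?}", utf8_extensions(exts));
}
```

Skipping keeps the lookup from probing for a file name that was never actually listed in the variable.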

@@ -141,12 +138,34 @@ impl Command {
                        .with_extension(env::consts::EXE_EXTENSION);
    if fs::metadata(&path).is_ok() {
        return Some(path.into_os_string())
    } else {
Member

We typically try to avoid rightward drift where possible, so because of the return above you can omit this else, de-indenting what's below. You can also change what's below to:

let exts = match env_pathext {
    Some(e) => e,
    None => continue,
};
// ...

to get rid of another layer of indentation.

@ollie27
Member

ollie27 commented Oct 26, 2016

I don't think this is a good idea. Command::spawn uses CreateProcess which only supports exe files so in order to do this we'd need to find another way to launch processes (for some reason .bat files do appear to work but I don't think it's behaviour we can rely on).

Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune
Author

afiune commented Oct 26, 2016

@ollie27 Oh Yeah!! 👍 Batch files MUST use cmd. Thank you for pointing that out! It is still viable to do it though, I will code that as well.

Signed-off-by: Salim Afiune <afiune@chef.io>
fn command_on_path_found() {
    let c = command_with_pathext("bin");
    let bat = canonicalize("./src/test/run-pass/process_command/fixtures/bin/bin.bat");
    assert_eq!(bat.ok(), c.find_program());
Member

Hm unfortunately the find_program function here is a private implementation detail of Command so this function I don't think will pass on Windows :(

Author

@alexcrichton I made it to be public, just as Command.cmd() and Command.env().. I wonder if it should be private as you mentioned and instead test within the Module? 🤔

Member

Ah the public function you added is actually an internal implementation detail that's wrapped with a std::process::Command, so it won't be accessible here.

But yeah moving the test to that module would be ok.

Author

Awesome! I'm working on it

If the program is a batch file, CreateProcess requires starting the
command interpreter by setting ApplicationName to `cmd.exe` and
CommandLine to the following arguments: `/c` plus the name of the
batch file.

Signed-off-by: Salim Afiune <afiune@chef.io>
@alexcrichton
Member

@ollie27 interesting! @afiune the only reason we have this probing logic right now is to read the child's PATH instead of the parent's if applicable. If in Windows, however, you CreateProcess("foo"), for example, does that work if something like foo.cmd exists? That is, is this expected behavior on Windows?

Salim Afiune added 2 commits November 1, 2016 11:05
We need to move these tests to live inside the module since the
`find_program` fn is a private implementation detail.

Signed-off-by: Salim Afiune <afiune@chef.io>
Simple fn that converts an `OsString` into a `Vec<u16>`.
Very useful when you need to interact with the Windows API.

Signed-off-by: Salim Afiune <afiune@chef.io>
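
For readers unfamiliar with this conversion, a portable approximation is sketched below. It takes a `&str` rather than an `OsString` so that it compiles on every platform; on Windows, `std::os::windows::ffi::OsStrExt::encode_wide` produces the UTF-16 units for an arbitrary `OsStr`. The name `to_wide` and the trailing NUL are assumptions of this sketch, not details taken from the commit.

```rust
use std::iter::once;

/// Portable stand-in for the conversion: encode a UTF-8 string as UTF-16 code
/// units with a trailing NUL, the shape wide-character Windows APIs expect.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(once(0)).collect()
}

fn main() {
    println!("{:?}", to_wide("cmd.exe"));
}
```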
@afiune
Author

afiune commented Nov 1, 2016

@alexcrichton I don't think it is that simple... On Windows the extension is very important, and that is why CreateProcess only accepts exe files inside the CommandLine.

All this information is here: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682425(v=vs.85).aspx

As it states:

To run a batch file, you must start the command interpreter; set lpApplicationName to cmd.exe and set lpCommandLine to the following arguments: /c plus the name of the batch file.

So you can run other things that are not EXE by supplying an interpreter (ApplicationName). In this case for bat and cmd files you can use cmd.exe (which is the part I coded yesterday).
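
The interpreter substitution described above could be sketched roughly as follows. `wrap_batch` is a hypothetical name and this is not the PR's `make_command_line`; in particular it ignores the quoting and escaping rules that a real `cmd.exe` command line needs.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Hypothetical helper: return the program and argument list to hand to the
/// process API, routing `.bat`/`.cmd` scripts through `cmd.exe /c`.
fn wrap_batch(program: &Path, args: &[&str]) -> (OsString, Vec<OsString>) {
    // Extensions are compared case-insensitively, as is usual on Windows.
    let is_batch = program
        .extension()
        .and_then(|e| e.to_str())
        .map_or(false, |e| e.eq_ignore_ascii_case("bat") || e.eq_ignore_ascii_case("cmd"));

    let mut out: Vec<OsString> = Vec::new();
    if is_batch {
        // cmd.exe becomes the application; the script is its first argument.
        out.push(OsString::from("/c"));
        out.push(program.as_os_str().to_os_string());
        out.extend(args.iter().map(|a| OsString::from(*a)));
        (OsString::from("cmd.exe"), out)
    } else {
        out.extend(args.iter().map(|a| OsString::from(*a)));
        (program.as_os_str().to_os_string(), out)
    }
}

fn main() {
    let (app, args) = wrap_batch(Path::new("build.bat"), &["--fast"]);
    println!("{:?} {:?}", app, args);
}
```

Keeping the decision in one function means non-batch programs pass through unchanged.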

Updating the tests to match the modifications made.

The `make_command_line` fn now returns two variables, the
application_name and the command_line. It will detect when
the program is a batch script or not and use the `cmd.exe`
as the interpreter.

Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune afiune force-pushed the afiune/37380/find_program_using_pathext branch from 3ca9a91 to e9e960e Compare November 1, 2016 15:29
@ollie27
Member

ollie27 commented Nov 1, 2016

If in Windows, however, you CreateProcess("foo"), for example, does that work if something like foo.cmd exists? That is, is this expected behavior on Windows?

No, CreateProcess will only add .exe if there isn't already an extension passed.

As @alexcrichton mentioned, the code for finding the binary that's modified in this PR is part of a hack to read the child's %Path% and is not the main way in which Command::spawn finds binaries. That logic is part of CreateProcess. In particular it is only triggered if the child's environment has been customised which I doubt is what we want. There are numerous issues with the existing code which this PR inherits and I've listed them in a separate issue (#37519).

%PATHEXT% contains more than just .bat and .exe so I still don't think it makes sense for the Rust std to use it.

@afiune
Author

afiune commented Nov 1, 2016

@alexcrichton @ollie27 that makes sense. I didn't know about that child proc information/details.

I have a couple of ideas:

  • What do you think about moving that logic out of find_program and putting it inside make_command_line instead? At the end of the day we can implement this and make Command run more than just EXE files by passing the ApplicationName to CreateProcess. I see a lot of value in that! 😄
  • Another option is to use the find_program fn but not look at the Command::env but instead the actual env::var_os.

I get ollie27's point about PATHEXT containing more than just exe & bat extensions, but in my humble opinion there is value in running more than just EXE files from the Rust std. I think anyone in the Windows world would love to be able to run their BAT and CMD scripts (batch files).

@ollie27
Member

ollie27 commented Nov 2, 2016

I'm not sure it's worth the added complexity to do anything special for batch files as it's not unreasonable to have to use Command::new("cmd.exe").arg("/C").arg("foo.bat").spawn().

@afiune
Author

afiune commented Nov 2, 2016

🤔 I might disagree with that; it would be like having to execute Unix scripts with Command::new("/usr/sh").arg("-x").arg("foo.sh").spawn(). 😕

In a Windows terminal, if you have a foo.bat script in your %PATH% you can execute it directly as foo without having to pass the extension. I think many applications on Windows have .bat and .cmd scripts that we just run without knowing the extension, and it would be very valuable to be able to do the same in Rust: Command::new("foo").spawn() 🎉 It should work! 😸

@retep998
Member

retep998 commented Nov 2, 2016

I personally believe that Command::spawn should mirror CreateProcess as closely as possible, and not try to provide any sort of special custom behavior. It's not Rust's job to magically make Unix and Windows behave the same, and trying to do so will just result in all sorts of broken edge cases and surprising behavior.

The reason that running a foo.bat in your terminal works is because it is not using CreateProcess but rather ShellExecute. This also means you can "run" stuff such as foo.txt and it'll open a text editor with that file in it. I think there would definitely be value in providing a wrapper around ShellExecute but that should not be Command::spawn.

@alexcrichton
Member

Ah so yes if CreateProcess doesn't read or look at PATHEXT and can't spawn .bat files directly, then I agree that we shouldn't be modifying this as it's intended to mirror CreateProcess.

@afiune does passing cmd and /C work for your use case?

@afiune
Author

afiune commented Nov 2, 2016

@alexcrichton If there are strong feelings against adding extra functionality, I think it would be OK for me to write a find_program fn and leverage that. Something like this:

Command::new(find_program("git")?).spawn()

where find_program would search for the program using PATH and PATHEXT and return its full path.

My point was to make it easier for users to use the Command module. I'm all about automation and making things easier.

In other news, I discovered that you can actually do:

Command::new("C:/program/bin.bat").spawn()

And it works! So CreateProcess can actually run batch scripts without the cmd /c prefix 😄 - Maybe we can add the simple function I suggested to find the right executable and pass it through?
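
A user-level helper along the lines proposed in this comment might look like the sketch below. Everything here is an assumption made for illustration: the name, the signature, the choice to try the exact name before any extension, and the fallback to ".EXE" when PATHEXT is unset. It also does not search the current directory.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Hypothetical user-level helper: search each directory in `path` for
/// `program`, first as given and then with each extension from `pathext`.
fn find_program(program: &str, path: &OsStr, pathext: &str) -> Option<PathBuf> {
    // Parse the extension list once, outside the directory loop.
    let exts: Vec<&str> = pathext
        .split(';')
        .map(|e| e.trim_start_matches('.'))
        .filter(|e| !e.is_empty())
        .collect();

    for dir in env::split_paths(path) {
        let exact = dir.join(program);
        if exact.is_file() {
            return Some(exact);
        }
        for ext in &exts {
            let candidate = dir.join(format!("{program}.{ext}"));
            if candidate.is_file() {
                return Some(candidate);
            }
        }
    }
    None
}

fn main() {
    let path = env::var_os("PATH").unwrap_or_default();
    let pathext = env::var("PATHEXT").unwrap_or_else(|_| ".EXE".to_string());
    match find_program("git", &path, &pathext) {
        Some(p) => println!("found {}", p.display()),
        None => println!("git not found"),
    }
}
```

The returned path could then be handed to `Command::new`, which leaves `Command` itself untouched.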

@alexcrichton
Member

Basically the intention here is to mirror the semantics of CreateProcess while looking up in the child's PATH instead of the parent's PATH. @afiune in that sense if you call Command::new("bin") with C:/program in your PATH, C:\program\bin.bat existing, and bat in the PATHEXT variable, does that spawn correctly? That should bypass this finding logic and see what CreateProcess does IIRC.

If that works then it would disagree with this comment indicating that CreateProcess works with more than just exe files.

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Nov 10, 2016
@alexcrichton
Member

ping @afiune, any updates here?

@afiune
Author

afiune commented Nov 29, 2016

@alexcrichton Thank you for following up!! 😄

The scenario you put above does not work. Some facts are:

  • CreateProcess doesn't look for bat files since it doesn't use the PATHEXT variable.
  • The PATHEXT variable is used by cmd.exe to determine which extensions to look for and in what order.
  • By default CreateProcess will append the extension .exe if the file name does not contain an extension (but only if the file name does not contain a path).
  • If the file name has .bat or .cmd but it does not contain a directory path, the system searches for the executable file in the current directory or the PATH variable.

That means these examples work with the current implementation:

// Passing the file name with its extension searches for it in the PATH
Command::new("bin.bat")

// Passing the full path does not search, obviously :)
Command::new(r"C:\program\bin.bat")

What doesn't work, because CreateProcess doesn't look at PATHEXT, is:

// Even though `bin.bat` is inside one of the dirs defined in PATH, we won't find it.
Command::new("bin")

Long story short, we can execute .bat and .cmd but we have to provide the extension or the full path of the script so that CreateProcess can execute it. So this feature will be enabled if we add the function mentioned that searches for the script and returns the full path of the executable. 😄
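
The naming rule from the list above can be modeled in a few lines. This is only a sketch of the rule as stated in this comment, with a made-up function name; it does not call or replicate CreateProcess itself.

```rust
/// Hypothetical model of the naming rule described above: `.exe` is appended
/// only when the name has neither an extension nor a directory component.
fn search_name(name: &str) -> String {
    let separators: &[char] = &['/', '\\'];
    let has_dir = name.contains(separators);
    // Look for a dot in the final component only.
    let file = name.rsplit(separators).next().unwrap_or(name);
    let has_ext = file.contains('.');
    if has_dir || has_ext {
        name.to_string()
    } else {
        format!("{name}.exe")
    }
}

fn main() {
    for name in ["bin", "bin.bat", r"C:\program\bin.bat"] {
        println!("{name} -> {}", search_name(name));
    }
}
```

Under this model `bin` can only ever resolve to `bin.exe`, which matches the failing case shown above.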

@alexcrichton
Member

Ok, thanks for the investigation! It sounds like PATHEXT is a feature of cmd.exe and not of CreateProcess, though, so we may not want to implement this change? In theory Command is intended to model the underlying system as closely as possible, so in that sense we wouldn't want to tack PATHEXT on top.

@steveklabnik
Member

Ping: it's been a month, any chance we can move this forward?

@alexcrichton
Member

It sounds like this behavior is specific to cmd.exe not CreateProcess. We're emulating the latter, so it sounds like we may not wish to merge this. Does that sound right @afiune?

@afiune
Author

afiune commented Jan 6, 2017

Yup, it does.

Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

[Windows] Command:spawn() only searches for .exe files
7 participants