Deal with windows file locking rustup.exe + any proxies when they are executed #2441

rbtcollins · 2020-07-27T12:17:46Z

Overview

During self update, rustup needs to replace rustup and the various proxies for cargo, rustc etc.

If it cannot do this, then respectively:

future invocations of rustup will be still be running the older version and it will attempt to self-update again
any improvements made to the proxy code paths will not be reflected in the live proxies on disk (and conversely any bugs will remain)

However, on Windows, when a binary is executed, it gets a read lock taken out on it. This prevents deleting (or replacing) the file until the process has exited. Renaming seems possible.

Today we have code to spawn a separate process from rustup to perform these replacements, but this will fail if any rust or rustup process is running concurrent with the self-update.

Specifically the following cases cause errors:

processes that are already running like rls running under an IDE
rust processes started immediately after 'rustup self update' exits

What do these errors look like?

There are two cases:

A) Failed updates of rustup.exe / proxies

error: could not copy file from 'C:\Users\<username>\.cargo\bin\rustup-init.exe' to 'C:\Users\<username>\.cargo\bin\rustup.exe'
info: caused by: Access is denied. (os error 5)

B) Failed removal of the updater binary

error: could not remove 'setup' file: 'C:\Users\<username>\.cargo\bin/rustup-init.exe'
error: caused by: Access is denied. (os error 5)

Design space

We need to cater for processes that are already running - they need to be shut down before the updates in step A can proceed.

We need to cater for processes that start up after parent process exits: short lived ones (e.g. rustc, rustup version) can be allowed to complete, but long running processes (e.g. rls) need to be shut down just like processes that were already running.

We can't prevent the OS starting the processes (not even by renaming, since even a rename may race with that), so its more a matter of tolerating processes that startup and having a way to tell them to exit rapidly. Whatever 'process should exit' mechanism will act to guard against case B.

Finally, we need to cater for file locks coming from other utilities like mcafee or various endpoint security products that may (hopefully temporarily) lock any of these files that we need to manipulate; but the same retry mechanisms should provide some resilience; but possibly having some interface to signal that a given process is holding a lock will be needed: that can be left for a separate ticket though, as it will, unlike this ticket, not be a systemic problem.

Final finally we need to deal with having two instances of the installer executing at once - but perhaps that can be a separate enhancement.

The text was updated successfully, but these errors were encountered:

kinnison · 2020-07-28T08:00:40Z

I think it's also worth noting that whatever solution we come up with here either needs to also work on Linux/Mac/*NIX in general, or else be isolated to windows-only pathways.

I think the description of this issue is good.

stefanmohl · 2020-08-24T15:09:12Z

It seems that this issue is causing problems on Cygwin as well, at least guessing so in #1508

steveklabnik · 2020-12-04T00:16:20Z

I am trying to rustup update nightly and getting

info: checking for self-updates
info: downloading self-update
error: could not remove 'rustup-bin' file: 'C:\Users\steve\.cargo\bin\rustup.exe'
error: caused by: permission denied

I believe that this is the right ticket?

kinnison · 2020-12-04T09:35:18Z

Hi Steve, yes this is the right ticket. Are you certain there're no IDEs or anything open which could be using rustup, rls, rustc, etc. Even via another program such as rust-analyzer? If you're sure and still getting that, then rebooting may be your only hope, and then run rustup self update on a freshly booted system. Windows' file locking makes our update process a bit fraught though we have been thinking about ways to try and sort it out.

steveklabnik · 2020-12-04T19:30:33Z

So, I did make sure to close out Code.

However, thinking about it... I do have one hunch. I am using nushell as my shell, installed via Cargo. Could that cause this problem?

kinnison · 2020-12-04T21:39:59Z

Unless nushell was running cargo or rustup or similar, not as far as I know, no.

kinnison · 2020-12-04T21:41:29Z

One idea, run task manager and look through the process list for any of rustc cargo rls rustup etc. ?

doxxx · 2020-12-04T23:48:50Z

Process Explorer has a feature where you can search open handles to see which process has a file locked.

steveklabnik · 2020-12-08T15:52:19Z

Mysteriously, this morning when I tried to get back to this, it worked. Hm. I'll give all that a try if I run into it again in the future, thanks / sorry.

kinnison · 2020-12-08T15:57:41Z

No worries, this part of rustup on Windows is very difficult to predict sometimes. Especially with background virus scanners etc. which might decide to hold a handle on the binary for a bit at any time.

workingjubilee · 2021-04-29T20:29:29Z

@rustbot label: +O-windows

rbtcollins · 2021-07-11T20:25:10Z

Ok so I think we can perhaps experiment with a few things.

Here's a proposed design, and why:

Lets assume we need to update the proxies from time to time: any optimisation such as having non-updated proxies can wait, since we will have to update them sometimes.

Basic approach

Proxies are taught how to self-exit when an update is in progress
Proxies will refuse to start up when an update is in progress
Proxies will attempt to trigger a graceful exit of their child (e.g. close stdin and stdout): we might try a few different means over time; we don't need to kill rustc, but we do want to kill rls (doesn't exit on its own) and cargo (may run for long periods).
rustup marks an update in process before it starts writing to any toolchain or proxy
rustup marks updates as finishes on exit

Marking updates

We don't need strong atomicity guarantees for the update marking mechanism, as it can safely run a smaller window than the lockout of concurrent update/install operations.

If we just write a file, we may leak it on panics or poweroffs. We can recover automatically if we treat such a file already existing as normal and delete on completion. Alternatively we could use a file as a means to lock out concurrent updates, tackling (some of) #988 at the same time. A file with a pid written into it would allow detection of panic-exits: a file without <pid\n> isn't valid, indicating an early panic.

Proposed first iteration: a file with a pid in it; if the file exists without a pid, or the pid is for a non-existing or non-rustup-init|rustup process, then delete it, otherwise error.

ChrisDenton · 2021-08-25T15:03:21Z

However, on Windows, when a binary is executed, it gets a read lock taken out on it. This prevents renaming (or replacing) the file until the process has exited.

Surely renaming will work so long as it stays within the same filesystem?

rbtcollins · 2021-08-26T06:59:41Z

The Windows NT Dynamic Link Loader opens files with FILE_SHARE_READ(https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew#parameters for the sematics of that). Renaming a file requires FILE_SHARE_DELETE access, which is not possible if an earlier opener has opened the file without FILE_SHARE_DELETE.
Renames are done by opening a file read-write and using https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfileinformationbyhandle to alter the status of the open file.

TL;DR this is not POSIX, And using FILE_FLAG_POSIX_SEMANTICS via https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-reopenfile doesn't undo the earlier lock modes as far as I know, it only allows applications that opt in to the semantics to have them.

ChrisDenton · 2021-08-26T09:07:56Z

Ok but this stackoverflow answer says (my bold):

Deleting will fail if any other process has the file opened without delete sharing, renaming is never a problem

And indeed if I try it, it works (Windows 10 21H1):

// old.rs
#![cfg(windows)]

use std::{env, fs, io};

fn main() -> io::Result<()> {
    let mut new_exe = env::current_exe()?;
    new_exe.set_file_name("new.exe");
    fs::rename(&env::current_exe()?, &new_exe)?;
    
    println!("success!");
    Ok(())
}

kinnison · 2021-08-26T09:13:20Z

@ChrisDenton Does this work if the rename call comes from not that process? i.e. renaming myself might be okay if I'm the only execution context with it; but if there's more than one process from the same exe does it still work?

ChrisDenton · 2021-08-26T09:26:00Z

@kinnison Yes, it appears to work.

// wait.rs -- stays open while it waits for input
use std::io::{self, Read};
fn main() {
    io::stdin().bytes().next();
}

// rename.rs -- renames wait.exe to new.exe
use std::{fs, io};
fn main() -> io::Result<()> {
    fs::rename("wait.exe", "new.exe")?;
    Ok(())
}

Though I've not tried it on a clean system so I can't rule out the possibility I've changed some setting somewhere that enables this behaviour.

UPDATE: I tried on a Windows 10 laptop (Home edition) that is not used by a developer. Same result.

kinnison · 2021-09-04T10:57:26Z

Interesting, thanks Chris. @rbtcollins do you think we can use the above to make changes to how we install updates, or should we also try and get tests on older Windows versions?

rbtcollins · 2021-09-19T20:19:32Z

Theres certainly something interesting going on here. We could get some research on slower-ring releases, because I don't recall this being how it used to be. Deleting the file is still not possible, at least without using FILE_DISPOSITION_POSIX_SEMANTICS. We could move files to the recycle bin perhaps.

As to how we use this: the question I think is going to be over failure modes. If a rustup proxy is in the recycle bin when it spawns a child (race condition obviously) will that break? If someone updates rustup multiple times, will that break? What about rustc / cargo - if cargo.exe is in recycle bin with a random file name, and goes to spawn a new rustc, will that work or will it perhaps consider files in recycle bin - and perhaps open up a security vulnerability at the same time.

There is obviously a complexity tradeoff here, but leaving a running process, no longer in the directory it thought it was in, still trying to spawn new processes doesn't seem great.

And the situation on linux is similarly indeterminate - sure, an unlinked but still referenced executable can be forked(), but its behaviour once the containing directory is removed is also going to be indeterminate.

So I think to avoid unexpected cases cropping up it is worth having some complexity - making sure we actually kill process groups off, rather than inviting undefined behaviour

ChrisDenton · 2021-09-19T21:22:42Z

I looked into this more since my last post. It seems the executable loader is using CreateFileMapping (well, the NT equivalent) to map the executable. which makes sense. From testing this, it appears this means there's a special "no delete" lock on the file which (unlike missing FILE_SHARE_DELETE) still allows renames. This kind of lock appears to be unique to file mappings.

Sample program

use std::{ffi::c_void, fs, io::{self, Write}};
use std::os::windows::io::AsRawHandle;

fn main() -> io::Result<()> {
    {
        // Create a file with some contents and close the handle.
        let mut f = fs::File::create("test.txt")?;
        f.write_all(b"hello")?;
    }
    // Reopen the file for reading.
    let f = fs::File::open("test.txt")?;
    
    // Create the file mapping.
    unsafe {
        let h = CreateFileMappingW(
            f.as_raw_handle() as _,
            std::ptr::null(),
            PAGE_READONLY,
            0,
            0,
            std::ptr::null(),
        );
        if h == 0 {
            return Err(io::Error::last_os_error());
        }
    }
    print!("Delete the file... ");
    match fs::remove_file("test.txt") {
        Ok(_) => println!("Success!"),
        Err(e) => println!("{}", e),
    }
    print!("Rename the file... ");
    match fs::rename("test.txt", "something.else") {
        Ok(_) => println!("Success!"),
        Err(e) => println!("{}", e),
    }
    Ok(())
}

const PAGE_READONLY: u32 = 2;
#[link(name="kernel32")]
extern "system" {
    fn CreateFileMappingW(
        hFile: usize,
        lpFileMappingAttributes: *const c_void,
        flProtect: u32,
        dwMaximumSizeHigh: u32,
        dwMaximumSizeLow: u32,
        lpName: *const u16
    ) -> usize;
}

Output

Delete the file... Access is denied. (os error 5)
Rename the file... Success!

A process can tell if its file name has been renamed by using QueryFullProcessImageNameW to get the new file path (though weirdly it won't notice if a parent directory is renamed). This differs from std::env::current_exe which will only tell you the name that was used when it was first loaded (it uses GetModuleFileNameW under the hood). Though I'm not sure how useful that information is.

But yeah, I would agree leaving a process running after it's been pseudo-removed isn't great. I guess you could use a special "pending deletion" directory that rustup controls but that seems like it would have its own issues.

rbtcollins · 2021-09-19T21:57:59Z

Yeah. so in Cygwin we used the underlying NT APIs to simulate POSIX behaviour, but there's a raft of corner cases :/

Self-update on Windows is hard... And rustup occasionally fails to self-update. There is a [related issue][1] to fix this behavior, but until then, using `--no-self-update` is a decent workaround. [1]: rust-lang/rustup#2441

By default `rusup` performs a self-update when installing a new rustc toolchain, Sometimes, the files are locked by windows, and the self update can fail as described here: rust-lang/rustup#2441 Since this is a pipeline run, we have no need to update rustup, and can rely on the version provided by the runner. By disabling the self update, the file conflict `The process cannot access the file because it is being used by another process` no longer appears.

rbtcollins added the enhancement label Jul 27, 2020

Darleev mentioned this issue Jul 27, 2020

Rust self-update will block builds actions/runner-images#246

Closed

5 tasks

rustbot added the O-windows Windows related label Apr 29, 2021

This was referenced May 21, 2021

rustup self update fails when RLS using IDE's are open on Windows #1865

Closed

improve rustc wrapper startup time? #2626

Open

rbtcollins mentioned this issue Jun 27, 2021

rustup fails to self-update on Windows (sometimes?) #2803

Closed

rbtcollins mentioned this issue Aug 25, 2021

Lock bootstrap (x.py) build directory rust-lang/rust#88310

Merged

Gankra mentioned this issue Feb 16, 2023

rustup randomly dies updating itself in CI? (windows-only?) axodotdev/cargo-dist#108

Closed

kylewillmon mentioned this issue Apr 25, 2023

Use --no-self-update for rustup on Windows phylum-dev/cli#1061

Merged

mitsuhiko mentioned this issue May 20, 2023

Support Hardlinked Shim/Proxy Deletion mitsuhiko/self-replace#15

Open

Javagedes mentioned this issue Mar 13, 2024

Prevent rustup self update microsoft/mu_devops#327

Merged

rochdev mentioned this issue Mar 17, 2024

fix windows rustup race condition on self update DataDog/action-prebuildify#40

Merged

rami3l added this to the On Deck milestone Mar 18, 2024

rami3l mentioned this issue Mar 18, 2024

rustup 1.27 fails when installing targets on Windows in GitHub Actions #3709

Closed

jayzhan211 mentioned this issue Mar 18, 2024

Suppress self update for windows CI runner apache/datafusion#9661

Merged

peytondmurray mentioned this issue May 6, 2024

[PYPI] Fix broken publish action deshaw/nbstripout-fast#14

Merged

rami3l mentioned this issue Jul 11, 2024

Tracking: Ensure the integrity of toolchains, handle concurrent reads/writes #3937

Open

rami3l mentioned this issue Oct 14, 2024

rustup update failing with could not rename component file #2729

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deal with windows file locking rustup.exe + any proxies when they are executed #2441

Deal with windows file locking rustup.exe + any proxies when they are executed #2441

rbtcollins commented Jul 27, 2020 •

edited

Loading

kinnison commented Jul 28, 2020

stefanmohl commented Aug 24, 2020

steveklabnik commented Dec 4, 2020

kinnison commented Dec 4, 2020

steveklabnik commented Dec 4, 2020

kinnison commented Dec 4, 2020

kinnison commented Dec 4, 2020

doxxx commented Dec 4, 2020

steveklabnik commented Dec 8, 2020

kinnison commented Dec 8, 2020

workingjubilee commented Apr 29, 2021

rbtcollins commented Jul 11, 2021

ChrisDenton commented Aug 25, 2021

rbtcollins commented Aug 26, 2021

ChrisDenton commented Aug 26, 2021

kinnison commented Aug 26, 2021

ChrisDenton commented Aug 26, 2021 •

edited

Loading

kinnison commented Sep 4, 2021

rbtcollins commented Sep 19, 2021

ChrisDenton commented Sep 19, 2021

rbtcollins commented Sep 19, 2021

Deal with windows file locking rustup.exe + any proxies when they are executed #2441

Deal with windows file locking rustup.exe + any proxies when they are executed #2441

Comments

rbtcollins commented Jul 27, 2020 • edited Loading

Overview

What do these errors look like?

Design space

kinnison commented Jul 28, 2020

stefanmohl commented Aug 24, 2020

steveklabnik commented Dec 4, 2020

kinnison commented Dec 4, 2020

steveklabnik commented Dec 4, 2020

kinnison commented Dec 4, 2020

kinnison commented Dec 4, 2020

doxxx commented Dec 4, 2020

steveklabnik commented Dec 8, 2020

kinnison commented Dec 8, 2020

workingjubilee commented Apr 29, 2021

rbtcollins commented Jul 11, 2021

Basic approach

Marking updates

ChrisDenton commented Aug 25, 2021

rbtcollins commented Aug 26, 2021

ChrisDenton commented Aug 26, 2021

kinnison commented Aug 26, 2021

ChrisDenton commented Aug 26, 2021 • edited Loading

kinnison commented Sep 4, 2021

rbtcollins commented Sep 19, 2021

ChrisDenton commented Sep 19, 2021

rbtcollins commented Sep 19, 2021

rbtcollins commented Jul 27, 2020 •

edited

Loading

ChrisDenton commented Aug 26, 2021 •

edited

Loading