Use lockfile on incr. comp. directory #32754

nikomatsakis · 2016-04-05T22:07:48Z

Currently we are not detecting the case where multiple rustc processes are using the same incr. comp directory. That is not good.

alexcrichton · 2016-04-05T22:16:38Z

FWIW Cargo recently forayed into lock files, and it basically just ended up using the fs2 crate's locking methods (namely flock on Unix I believe). So far seems to be working out well at least!

michaelwoerister · 2016-07-21T17:29:59Z

I've been thinking about a different synchronization mechanism lately, one that takes more of a persistent data structure than a mutex approach. It goes roughly like this:

When starting a compilation session, create a sub-directory <session-id>-working within the incr. comp. directory, where <session-id> is a newly generated UUID.
Spawn a "cleanup process" that will delete this directory if the compiler process gets killed unexpectedly and consequently fails to send the periodic keep-alive signal to the cleanup process.
Choose a valid cache sub-directory from the incr. comp. directory. A valid cache-directory formerly was a <session-id>-working directory, but has been renamed to <session-id>-<timestamp> after compilation had succeeded. The timestamp in the directory name allows to choose the newest one, if there is more than one available.
From the chosen directory, copy/hard-link everything into the new <session-id>-working directory. Now we have our own copy of everything and don't need to worry about other processes interfering.
Validate the data we have just copied, purge invalidated items from the cache, etc.
Compile the program
Once we have a valid consistent cache directory again, rename the cache directory from <session-id>-working to <session-id>-<timestamp> thus making it visible to other processes. That's save now because the files in it won't be mutated again.
Shut down the clean-up process, its services have not been needed.
Delete the old cache directory. It's not needed anymore, since we just created a more recent one.

The most important advantage of this strategy is that we are never left with a corrupt cache. If the rustc process is killed midway through compilation, it will leave behind a <session-id>-working directory that we know is unusable. And if the cleanup process does its job, not even that will be left behind. In contrast to that, with a filelock there might be half-written object files and we have to throw away everything (although that could be improved upon too with a similar "change the filename when done" strategy).
Another advantage is that two rustc processes can work concurrently. With the filelock, one will be blocked or will fail immediately.

The disadvantages are that:

the "cleanup process" setup is a bit ugly.
making the initial "copy" of the working directory might be slow if the underlying file system does not support hard-links (which is at least true for FAT32 and exFAT).
it needs some kind of IPC to facilitate communicating between rustc and its cleanup process. Although that could conceivably be done through the file-system too.

Thoughts?

alexcrichton · 2016-07-21T17:56:30Z

In the last step the old cache directory is deleted, but concurrent rustc invocations means that the old cache may be in use, right? Basically step (3) needs to be resilient to the cache disappearing, but that seems reasonable?

I'm somewhat wary about spawning a cleanup process, the behavior on Unix at least for a ctrl-c is to typically kill the entire process group, which in this case includes the cleanup process. In that sense if the compiler abnormally terminates so will the child. On Windows the behavior is somewhat similar with what we do today where the parent and child process are both in the same "job object", which when closed will kill both.

I don't really have a great suggestion though, but perhaps we could just leave around these directories or maybe cleanup > 24hr old ones automatically?

michaelwoerister · 2016-07-21T18:09:55Z

Regarding the concurrent access: we should be fine with missing files. If the file is actually copied and there is any chance of a halfway copied file, we'd also need to store some kind of checksum to prevent those from being used.

Regarding cleanup: I was thinking about the 'delete unreasonably old working dirs' approach too, as an additional backup. It would probably also be OK to do only that.

Is there no way to spawn an independent process (i.e. one that can outlive its spawner)?

On July 21, 2016 7:56:45 PM GMT+02:00, Alex Crichton notifications@github.com wrote:

In the last step the old cache directory is deleted, but concurrent
rustc invocations means that the old cache may be in use, right?
Basically step (3) needs to be resilient to the cache disappearing, but
that seems reasonable?

I'm somewhat wary about spawning a cleanup process, the behavior on
Unix at least for a ctrl-c is to typically kill the entire process
group, which in this case includes the cleanup process. In that sense
if the compiler abnormally terminates so will the child. On Windows the
behavior is somewhat similar with what we do today where the parent and
child process are both in the same "job object", which when closed will
kill both.

I don't really have a great suggestion though, but perhaps we could
just leave around these directories or maybe cleanup > 24hr old ones
automatically?

You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
#32754 (comment)

alexcrichton · 2016-07-21T18:12:24Z

We can emulate spawning an independent process on Unix by changing process groups at least, but it does seem like a surprising thing for a compiler to do, so I'd be somewhat wary of doing that. On Windows with job objects at least I believe it's impossible to spawn an independent process.

nikomatsakis · 2016-07-21T18:25:55Z

@michaelwoerister sounds pretty related to #34957

michaelwoerister · 2016-07-21T18:37:49Z

We can emulate spawning an independent process on Unix by changing process groups at least, but it does seem like a surprising thing for a compiler to do, so I'd be somewhat wary of doing that. On Windows with job objects at least I believe it's impossible to spawn an independent process.

Yes, it is a bit surprising. On the other hand, I personally would be OK with the compiler starting a daemon process to support incremental compilation. Especially if it's just an enhancement that keeps my disk from filling up. For example, the way Sublime Text build systems work, I have to manually kill long-running rustc processes all the time. I'd probably end up with a few hundred unused working directories at any point in time if they only get cleaned up after 24 hours :)

@nikomatsakis Yes, definitely related.

michaelwoerister · 2016-07-21T18:41:10Z

Maybe a rustc process could use a filelock on its working directory. Then, everytime a new process starts, it tries to delete all working directories it finds. It can only succeed if the process holding the lock to its own working dir has died.

alexcrichton · 2016-07-21T20:27:42Z

File locks work pretty well minus NFS, but we could have a scheme like:

First, you make a scratch directory
Then you crate a "well known" file in it, acquiring an exclusive lock
When cleaning out old directories (maybe those >2s old), you try to grab the lock
If grabbing the lock fails, the process must be running
If grabbing the lock succeeds, the process must have died so you clean it up. This also sorts out contention among multiple compilers cleaning up

This doesn't help for NFS, but I'm not sure there's really much we can do there?

michaelwoerister · 2016-07-22T10:40:18Z

That sounds like a pretty promising approach. NFS we won't be able to support, I'm afraid, but that seems OK to me.
What about filesystems that don't support hardlinking? A lot of potential data copying is involved there (which also widens the time window for two process interfering with each other). Can we tell people to "just use a proper file-system"? :)

alexcrichton · 2016-07-22T16:06:06Z

On Windows we actually can't delete a file if someone's opened it, and on Unix if you delete a file I believe all open references still have the same underlying data, so in theory that should be roughly ok if hard linking isn't supported? I'd also be fine saying that we don't work great on things like FAT32. Or at least not to start out

michaelwoerister · 2016-08-09T15:54:20Z

Since not addressing this issue might lead to a lot of unreproducible bug reports because of a corrupted cache directory, we decided to add it to the alpha milestone.

michaelwoerister · 2016-08-12T13:54:08Z

@alexcrichton How far, do think, are we away from being able to use dependencies from crates.io in rustc? The fs2 crate looks exactly like we need. But I guess at the moment I have to re-implement file-locking.

alexcrichton · 2016-08-12T16:10:19Z

@michaelwoerister still pretty far unfortunately, I'd plan on vendoring for now.

…ing, r=nikomatsakis Implement synchronization scheme for incr. comp. directory This PR implements a copy-on-write-based synchronization scheme for the incremental compilation cache directory. For technical details, see the documentation at the beginning of `rustc_incremental/persist/fs.rs`. The PR contains unit tests for some functions but for testing whether the scheme properly handles races, a more elaborate test setup would be needed. It would probably involve a small tool that allows to manipulate the incremental compilation directory in a controlled way and then letting a compiler instance run against directories in different states. I don't know if it's worth the trouble of adding another test category to `compiletest`, but I'd be happy to do so. Fixes rust-lang#32754 Fixes rust-lang#34957

…crichton Implement synchronization scheme for incr. comp. directory This PR implements a copy-on-write-based synchronization scheme for the incremental compilation cache directory. For technical details, see the documentation at the beginning of `rustc_incremental/persist/fs.rs`. The PR contains unit tests for some functions but for testing whether the scheme properly handles races, a more elaborate test setup would be needed. It would probably involve a small tool that allows to manipulate the incremental compilation directory in a controlled way and then letting a compiler instance run against directories in different states. I don't know if it's worth the trouble of adding another test category to `compiletest`, but I'd be happy to do so. Fixes #32754 Fixes #34957

nikomatsakis added the A-incr-comp Area: Incremental compilation label Apr 5, 2016

michaelwoerister added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 21, 2016

michaelwoerister mentioned this issue Jul 22, 2016

Enable reuse of .o files if nothing has changed #34956

Merged

nikomatsakis mentioned this issue Jul 30, 2016

ICEs with incremental compilation building glium #34991

Closed

michaelwoerister added this to the Incremental compilation alpha milestone Aug 9, 2016

michaelwoerister self-assigned this Aug 9, 2016

michaelwoerister mentioned this issue Aug 16, 2016

Implement synchronization scheme for incr. comp. directory #35718

Merged

bors closed this as completed in #35718 Sep 1, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use lockfile on incr. comp. directory #32754

Use lockfile on incr. comp. directory #32754

nikomatsakis commented Apr 5, 2016

alexcrichton commented Apr 5, 2016

michaelwoerister commented Jul 21, 2016

alexcrichton commented Jul 21, 2016

michaelwoerister commented Jul 21, 2016

alexcrichton commented Jul 21, 2016

nikomatsakis commented Jul 21, 2016

michaelwoerister commented Jul 21, 2016

michaelwoerister commented Jul 21, 2016 •

edited

Loading

alexcrichton commented Jul 21, 2016

michaelwoerister commented Jul 22, 2016

alexcrichton commented Jul 22, 2016

michaelwoerister commented Aug 9, 2016

michaelwoerister commented Aug 12, 2016

alexcrichton commented Aug 12, 2016

Use lockfile on incr. comp. directory #32754

Use lockfile on incr. comp. directory #32754

Comments

nikomatsakis commented Apr 5, 2016

alexcrichton commented Apr 5, 2016

michaelwoerister commented Jul 21, 2016

alexcrichton commented Jul 21, 2016

michaelwoerister commented Jul 21, 2016

alexcrichton commented Jul 21, 2016

nikomatsakis commented Jul 21, 2016

michaelwoerister commented Jul 21, 2016

michaelwoerister commented Jul 21, 2016 • edited Loading

alexcrichton commented Jul 21, 2016

michaelwoerister commented Jul 22, 2016

alexcrichton commented Jul 22, 2016

michaelwoerister commented Aug 9, 2016

michaelwoerister commented Aug 12, 2016

alexcrichton commented Aug 12, 2016

michaelwoerister commented Jul 21, 2016 •

edited

Loading