
fix cargo not doing anything when the input and output mtimes are equal #5919

Merged (3 commits) on Aug 22, 2018

Conversation

@RalfJung (Member)

That's a problem as the input may have changed in that same second but after the output got generated!

Fixes #5918

@rust-highfive

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@RalfJung (Member, Author)

Note that I do not fully understand why this fixes the problem. The comparison is (from what I can see) done at type FileTime, which should take nanoseconds into account.
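
For reference, a minimal sketch (assuming the filetime crate, which cargo uses for these timestamps) confirming that FileTime comparisons do include the nanosecond component:

    // Sketch: FileTime ordering distinguishes timestamps within the same second.
    use filetime::FileTime;

    fn main() {
        let earlier = FileTime::from_unix_time(1534860129, 100);
        let later = FileTime::from_unix_time(1534860129, 200);
        // Same second, different nanoseconds: still ordered and unequal.
        assert!(earlier < later);
        assert_ne!(earlier, later);
    }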

Also, the RUST_LOG=cargo=info log contains some strange lines (even without this patch) like

 INFO 2018-08-21T14:02:09Z: cargo::core::compiler::fingerprint: fingerprint error for issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions): mtime based components have changed: previously Some(FileTime { seconds: 1534860129, nanos: 715599412 }) now None, paths are ".fingerprint/issue-32278-big-array-of-strings-ecacd88b5919f967/dep-bin-issue_32278_big_array_of_strings-ecacd88b5919f967" and ".fingerprint/issue-32278-big-array-of-strings-ecacd88b5919f967/dep-bin-issue_32278_big_array_of_strings-ecacd88b5919f967"

Notice the "now None". Why does the file not have a timestamp any more...?

@dwijnand (Member) left a comment

Easy 😄

Would be nice to capture #5918 in a test case, but LGTM as is nonetheless!

@RalfJung (Member, Author)

That'd be a non-deterministic test case, which I am not sure how to even pull off. But yeah, would be nice indeed.^^

@RalfJung (Member, Author)

I extended logging to always print the timestamps it is comparing, and indeed I am seeing

 INFO 2018-08-21T14:23:00Z: cargo::core::compiler::fingerprint: stale? /home/r/src/rust/rustc-perf/collector/benchmarks/coercions/src/main.rs -- 1534861380.504535670s vs 1534861380.504535670s

However, src/main.rs has definitely been touched since the last run. How is that possible?!?

@dwijnand (Member) commented Aug 21, 2018

Given this fix, couldn't you trigger p.cargo("build") 10 times and check it definitely runs rustc 10 times, as opposed to fewer than 10 times before? Just need to make it cross the 1-second boundary. There's also some time-traveling capability in the test suite that might come in handy (though you're testing for == times).
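
A rough standalone sketch of that idea (hypothetical scratch-project path; cargo's real test suite uses its own support helpers, and this shells out to touch, so it is Unix-oriented):

    use std::path::Path;
    use std::process::Command;

    fn main() {
        // Hypothetical pre-created cargo project on disk.
        let project = Path::new("/tmp/mtime-demo");
        let mut rebuilds = 0;
        for _ in 0..10 {
            // Bump the source mtime; with coarse timestamps this can land in
            // the same instant as the previous build's output.
            Command::new("touch")
                .arg(project.join("src/main.rs"))
                .status()
                .unwrap();
            let out = Command::new("cargo")
                .arg("build")
                .current_dir(project)
                .output()
                .unwrap();
            if String::from_utf8_lossy(&out.stderr).contains("Compiling") {
                rebuilds += 1;
            }
        }
        // With the fix every build should recompile; without it, some runs may
        // be (incorrectly) reported as fresh.
        assert_eq!(rebuilds, 10);
    }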

@RalfJung (Member, Author) commented Aug 21, 2018

More debugging reveals

  File: src/main.rs
  Size: 1950285   	Blocks: 3816       IO Block: 4096   regular file
Device: fd02h/64770d	Inode: 1974246     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/       r)   Gid: ( 1000/       r)
Access: 2018-08-21 16:24:24.807921535 +0200
Modify: 2018-08-21 16:24:24.807921535 +0200
Change: 2018-08-21 16:24:24.807921535 +0200
 Birth: -
 WARN 2018-08-21T14:24:24Z: cargo::util::rustc: failed to calculate rustc fingerprint: probably rustup rustc, but without rustup's env vars
 INFO 2018-08-21T14:24:24Z: cargo::util::rustc: rustc info cache disabled
 INFO 2018-08-21T14:24:24Z: cargo::util::rustc: rustc info cache miss
 INFO 2018-08-21T14:24:24Z: cargo::util::rustc: rustc info cache miss
 INFO 2018-08-21T14:24:24Z: cargo::util::rustc: rustc info cache miss
 INFO 2018-08-21T14:24:24Z: cargo::core::compiler::fingerprint: stale? /home/r/src/rust/rustc-perf/collector/benchmarks/coercions/src/main.rs -- 1534861464.807921535s vs 1534861464.807921535s
 INFO 2018-08-21T14:24:24Z: cargo::core::compiler::fingerprint: fingerprint error for issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions): mtime based components have changed: previously Some(FileTime { seconds: 1534861464, nanos: 807921535 }) now None, paths are ".fingerprint/issue-32278-big-array-of-strings-ecacd88b5919f967/dep-bin-issue_32278_big_array_of_strings-ecacd88b5919f967" and ".fingerprint/issue-32278-big-array-of-strings-ecacd88b5919f967/dep-bin-issue_32278_big_array_of_strings-ecacd88b5919f967"
 INFO 2018-08-21T14:24:24Z: cargo::core::compiler::context::compilation_files: Target filenames: [OutputFile { path: "/home/r/src/rust/rustc-perf/collector/benchmarks/coercions/target/debug/deps/issue_32278_big_array_of_strings-ecacd88b5919f967", hardlink: Some("/home/r/src/rust/rustc-perf/collector/benchmarks/coercions/target/debug/issue-32278-big-array-of-strings"), flavor: Normal }]
 INFO 2018-08-21T14:24:24Z: cargo::core::compiler::job_queue: start: issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions) => Target(bin: issue-32278-big-array-of-strings)/Profile(dev) => Host
   Compiling issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions)
 INFO 2018-08-21T14:24:25Z: cargo::core::compiler::job_queue: end: issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions) => Target(bin: issue-32278-big-array-of-strings)/Profile(dev) => Host
    Finished dev [unoptimized + debuginfo] target(s) in 0.65s

  File: src/main.rs
  Size: 1950285   	Blocks: 3816       IO Block: 4096   regular file
Device: fd02h/64770d	Inode: 1974246     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/       r)   Gid: ( 1000/       r)
Access: 2018-08-21 16:24:25.467916749 +0200
Modify: 2018-08-21 16:24:25.467916749 +0200
Change: 2018-08-21 16:24:25.467916749 +0200
 Birth: -
 WARN 2018-08-21T14:24:25Z: cargo::util::rustc: failed to calculate rustc fingerprint: probably rustup rustc, but without rustup's env vars
 INFO 2018-08-21T14:24:25Z: cargo::util::rustc: rustc info cache disabled
 INFO 2018-08-21T14:24:25Z: cargo::util::rustc: rustc info cache miss
 INFO 2018-08-21T14:24:25Z: cargo::util::rustc: rustc info cache miss
 INFO 2018-08-21T14:24:25Z: cargo::util::rustc: rustc info cache miss
 INFO 2018-08-21T14:24:25Z: cargo::core::compiler::fingerprint: stale? /home/r/src/rust/rustc-perf/collector/benchmarks/coercions/src/main.rs -- 1534861465.467916749s vs 1534861465.467916749s
 INFO 2018-08-21T14:24:25Z: cargo::core::compiler::fingerprint: fingerprint error for issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions): mtime based components have changed: previously Some(FileTime { seconds: 1534861465, nanos: 467916749 }) now None, paths are ".fingerprint/issue-32278-big-array-of-strings-ecacd88b5919f967/dep-bin-issue_32278_big_array_of_strings-ecacd88b5919f967" and ".fingerprint/issue-32278-big-array-of-strings-ecacd88b5919f967/dep-bin-issue_32278_big_array_of_strings-ecacd88b5919f967"
 INFO 2018-08-21T14:24:25Z: cargo::core::compiler::context::compilation_files: Target filenames: [OutputFile { path: "/home/r/src/rust/rustc-perf/collector/benchmarks/coercions/target/debug/deps/issue_32278_big_array_of_strings-ecacd88b5919f967", hardlink: Some("/home/r/src/rust/rustc-perf/collector/benchmarks/coercions/target/debug/issue-32278-big-array-of-strings"), flavor: Normal }]
 INFO 2018-08-21T14:24:25Z: cargo::core::compiler::job_queue: start: issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions) => Target(bin: issue-32278-big-array-of-strings)/Profile(dev) => Host
   Compiling issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions)
 INFO 2018-08-21T14:24:26Z: cargo::core::compiler::job_queue: end: issue-32278-big-array-of-strings v0.1.0 (file:///home/r/src/rust/rustc-perf/collector/benchmarks/coercions) => Target(bin: issue-32278-big-array-of-strings)/Profile(dev) => Host
    Finished dev [unoptimized + debuginfo] target(s) in 0.70s

The stat output happens right before cargo is called. The LHS of the printed stale? check (that's mtime2) always matches the stat time. But the RHS is NOT the stat time from the previous run! It is something else, and sometimes its timestamp even equals the stat time down to the nanosecond?!?

Ah, it says right in the code what that is: it's the output.
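
Schematically, then, the check compares each input's mtime against the output's mtime (a sketch with hypothetical names; the real logic lives in cargo::core::compiler::fingerprint):

    use filetime::FileTime;
    use std::path::PathBuf;

    // The RHS of the "stale?" log line is the output's mtime; each
    // dependency's mtime (the LHS) is compared against it. With the fix,
    // an input whose mtime equals the output's also counts as stale.
    fn first_stale(output_mtime: FileTime, deps: &[(PathBuf, FileTime)]) -> Option<&PathBuf> {
        deps.iter()
            .find(|(_, dep_mtime)| *dep_mtime >= output_mtime)
            .map(|(path, _)| path)
    }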

@RalfJung (Member, Author)

Oy, this breaks on Apple:

---- build_script::rebuild_only_on_explicit_paths stdout ----
running `/Users/travis/build/rust-lang/cargo/target/debug/cargo build -v`
run without
running `/Users/travis/build/rust-lang/cargo/target/debug/cargo build -v`
run with
running `/Users/travis/build/rust-lang/cargo/target/debug/cargo build -v`
run with2
running `/Users/travis/build/rust-lang/cargo/target/debug/cargo build -v`
thread 'build_script::rebuild_only_on_explicit_paths' panicked at '
Expected: execs
    but: differences:
  0 - |[FRESH] foo v0.5.0 ([..])|
    + |   Compiling foo v0.5.0 (file:///Users/travis/build/rust-lang/cargo/target/cit/t319/foo)|
  1 - |[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]|
    + |     Running `/Users/travis/build/rust-lang/cargo/target/cit/t319/foo/target/debug/build/foo-46e097b20049f547/build-script-build`|
  2 -
    + |     Running `rustc --crate-name foo src/lib.rs --crate-type lib --emit=dep-info,link -C debuginfo=2 -C metadata=c8f169addf9743bd -C extra-filename=-c8f169addf9743bd --out-dir /Users/travis/build/rust-lang/cargo/target/cit/t319/foo/target/debug/deps -L dependency=/Users/travis/build/rust-lang/cargo/target/cit/t319/foo/target/debug/deps`|
  3 -
    + |    Finished dev [unoptimized + debuginfo] target(s) in 0.56s|

Essentially, the old code had a race condition that could lead to spuriously not rebuilding when we should. The new code has a race that could lead to spuriously rebuilding when we should not... we can only have one or the other, and the latter is clearly less buggy.

@RalfJung (Member, Author) commented Aug 21, 2018

I have a problem implementing the test... first I used this to "touch" the file:

        use std::fs::OpenOptions;

        // Opening for write without writing anything does not modify the
        // file, so the mtime stays unchanged.
        let _ = OpenOptions::new()
            .write(true)
            .open(root.join("src/main.rs"))
            .unwrap();

but then the test fails even with the fix, so I guess it doesn't really touch?
So instead I changed it to

        {
            use std::io::Write;

            let mut file = OpenOptions::new()
                .append(true)
                .open(root.join("src/main.rs"))
                .unwrap();
            // Appending a byte changes the content, so the mtime does update.
            file.write(" ".as_bytes()).unwrap();
        }

but now the test passes even without the fix...

EDIT:
I now also tried

        use filetime::{set_file_times, FileTime};
        use std::time::SystemTime;
        let now = FileTime::from_system_time(SystemTime::now());
        set_file_times(&path, now, now).unwrap();

to no avail. The mtime changes, but the bug does not occur.

Maybe it has to do with this being on /tmp, which is a different file system?

EDIT2: Nope, it tests inside the target dir.

@RalfJung (Member, Author)

I managed to make a proper test by calling

use std::process::Command;
Command::new("touch").arg(path.to_str().unwrap()).output().unwrap();

but that won't work on Windows...

@RalfJung (Member, Author)

I looked at strace, and set_file_times does almost the same thing as touch in terms of syscalls -- both use utimensat. They use it slightly differently, but... well.

I give up on adding a test case.

@alexcrichton (Member)

The comment here I think isn't quite right: as you've found, the mtime comparison compares nanoseconds, not just seconds. I believe that some filesystems don't support nanosecond precision, which means that the granularity may be seconds.

I'm not 100% certain on the fix here because it seems like it may cause rebuilds in some situations perhaps?

@RalfJung (Member, Author) commented Aug 21, 2018

My filesystem, however, supports nanosecond precision: stat shows nanosecond timestamps, and the same timestamps occur in cargo's logs. I would hence expect never to see equal timestamps -- and yet I do.

I am suspecting some kind of timestamp caching somewhere to avoid querying the system clock too much.

I'm not 100% certain on the fix here because it seems like it may cause rebuilds in some situations perhaps?

Yes. In case the timestamps are equal, we just cannot know which event happened first. We have two choices:

  • Assume that we should NOT rebuild. That's what happens currently. However, this leads to cases where code that should be rebuilt is not.
  • Assume that we SHOULD rebuild. That's what I am proposing. When in doubt (i.e., when we inherently cannot know if the output file is newer than the input or not), I think it is prudent to err on the side of caution and assume that we have to regenerate the output. Worst case, we did some unnecessary work. In the alternative scenario (i.e., in how cargo works currently), worst case we do not do the work we are supposed to do -- we tell the user "your program is ready", but it is not.
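
In code, the proposal amounts to flipping the comparison so that equality counts as dirty (a schematic sketch with hypothetical names, not the literal cargo source):

    use filetime::FileTime;

    // `output` is the mtime of the generated artifact, `input` the mtime of
    // a source file it depends on.
    fn needs_rebuild(output: FileTime, input: FileTime) -> bool {
        // Old rule: `input > output` -- equal timestamps counted as fresh,
        // silently skipping a rebuild when the edit landed in the same
        // instant the output was written.
        // New rule: on equal timestamps we cannot know which came first,
        // so err on the side of rebuilding.
        input >= output
    }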

@RalfJung (Member, Author) commented Aug 21, 2018

If you look at this assuming we had just 1-second granularity, then I think it is quite clear that we HAVE to rebuild when in doubt. Otherwise someone who compiles and then quickly edits in the same second will have their edits ignored, even if they call cargo build a minute later! cargo is like "last edit and program generation happened in the same second, so of course the program generation happened later". But there are no grounds for this assumption cargo is making.

We are using nanosecond granularity, but that does not change the fact that rebuilding on equal timestamps is (IMO) the only correct strategy. Nanosecond timestamps should mean that we never see equal timestamps, but alas, we do. And, as you said, there are file systems that only support 1-second granularity, and then again I find "rebuild on equal timestamp" to be the only reasonable choice.

@alexcrichton (Member)

I definitely agree that with coarser-than-nanosecond precision this patch makes sense, but what I'm trying to make sense of is that it's basically impossible this patch is necessary if there's nanosecond precision. There's no way that process interactions take less than a nanosecond; that's just practically impossible.

What I'm discovering, though, is that while nanosecond precision is supported, it's not necessarily nanosecond-accurate (is that the right term?). If I write a tight loop that creates a file, gets the mtime, and then deletes the file (printing out the timing of each loop iteration), I get (on Linux):

1534958807.003447057s
elapsed 85.401µs
1534958807.003447057s
elapsed 186.544µs
1534958807.003447057s
elapsed 275.098µs
1534958807.003447057s
elapsed 359.881µs
1534958807.003447057s
elapsed 444.296µs
1534958807.003447057s
elapsed 528.36µs

These operations are thousands of times slower than a nanosecond, yet the timestamp is always showing the same value! Some googling around shows https://stackoverflow.com/questions/14392975/timestamp-accuracy-on-ext4-sub-millsecond which mentions that the kernel caches the current time until the next timer interrupt, which would help explain this behavior.
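
For reference, a loop like the one described (a reconstruction assuming the filetime crate, not the exact code used above):

    use filetime::FileTime;
    use std::fs::{self, File};
    use std::time::Instant;

    fn main() {
        for _ in 0..6 {
            let start = Instant::now();
            // Create a file, read its mtime, delete it, and time the iteration.
            File::create("probe").unwrap();
            let meta = fs::metadata("probe").unwrap();
            let mtime = FileTime::from_last_modification_time(&meta);
            fs::remove_file("probe").unwrap();
            println!("{}.{:09}s", mtime.unix_seconds(), mtime.nanoseconds());
            println!("elapsed {:?}", start.elapsed());
        }
    }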

In light of all this, can the comment be updated here? Perhaps something like:

Note that equal mtimes are considered "stale". For filesystems with not much timestamp precision like 1s this is a conservative approximation to handle the case where a file is modified within the same second after a build finishes. We want to make sure that incremental rebuilds pick that up!

For filesystems with nanosecond precision it's been seen in the wild that the "nanosecond precision" isn't necessarily nanosecond-accurate. It turns out that kernels may cache the current time, so files created at different times can actually list the same nanosecond timestamp. Some digging on #5919 picked up that the kernel caches the current time between timer ticks, which could mean that if a file is updated at most 10ms after a build finishes then Cargo may not pick up the build changes.

All in all, the equality check here is a conservative assumption that, if equal, files were changed just after a previous build finished. It's hoped this doesn't cause too many issues in practice!

@RalfJung (Member, Author)

what I'm trying to make sense of is that it's basically impossible this patch is necessary if there's nanosecond precision. There's no way that process interactions take less than a nanosecond, that's just practically impossible.

That part is surprising to me as well. Thanks for the analysis! I will update the comment.

@RalfJung (Member, Author)

That also explains, btw, why my tests did not work: touch calls utimensat with a null times argument, so the filesystem determines the time (using the kernel's cached clock), while filetime calls it with a previously determined system time, which is then stored exactly as given.
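
Illustrating the difference (a Linux-oriented sketch using the libc crate; not the actual code of touch or filetime):

    use std::ffi::CString;
    use std::ptr;

    fn main() {
        let path = CString::new("src/main.rs").unwrap();

        // What `touch` effectively does: a null `times` pointer asks the
        // kernel to use "now" as it sees it -- a cached value that only
        // advances on timer ticks, hence the repeating timestamps.
        unsafe {
            libc::utimensat(libc::AT_FDCWD, path.as_ptr(), ptr::null(), 0);
        }

        // What `filetime::set_file_times` effectively does: read the clock
        // in userspace and pass explicit timespecs, stored as given.
        let mut now = libc::timespec { tv_sec: 0, tv_nsec: 0 };
        unsafe {
            libc::clock_gettime(libc::CLOCK_REALTIME, &mut now);
            libc::utimensat(libc::AT_FDCWD, path.as_ptr(), [now, now].as_ptr(), 0);
        }
    }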

@RalfJung (Member, Author)

Comment updated.

@alexcrichton (Member)

@bors: r+

@bors (Contributor) commented Aug 22, 2018

📌 Commit b1fbafd has been approved by alexcrichton

@bors (Contributor) commented Aug 22, 2018

⌛ Testing commit b1fbafd with merge 502f0ae...

bors added a commit that referenced this pull request Aug 22, 2018
fix cargo not doing anything when the input and output mtimes are equal

That's a problem as the input may have changed in that same second but after the output got generated!

Fixes #5918
@bors (Contributor) commented Aug 22, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 502f0ae to master...

@bors merged commit b1fbafd into rust-lang:master on Aug 22, 2018
@Eh2406 (Contributor) commented Jan 4, 2019

@RalfJung, I know this is from a while ago, but do you know if there's a way to get filetime to do exactly what touch does? cc #6477 (comment) and #6484 (comment)

@RalfJung (Member, Author) commented Jan 5, 2019

@Eh2406 see #5919 (comment). IIRC touch sets the times argument of utimensat to NULL, while the Rust code determines the current time and uses that. I figured that would be the cause of the difference in behavior.
