
add env var to override sys::Instant::actually_monotonic() for windows and unix #84448

Closed
wants to merge 2 commits into from

Conversation

the8472
Member

@the8472 the8472 commented Apr 22, 2021

The environment variable is read on first use and cached in an atomic.

Various operating systems promise monotonic clocks, and so do hardware and various hypervisors, often with explicit flags such as constant_tsc. If that flag is absent, they already force the timers to be monotonic through atomic operations (just as the standard library does). This tends to work on most systems.

But there's the occasional broken hardware or hypervisor that makes this promise but then doesn't deliver. That's why the standard library doesn't rely on the API guarantees in some cases (e.g. Windows). In other cases (e.g. x86 Linux) it does trust the OS guarantee, and then this gets broken because Rust trusts the OS, which trusts the hypervisor, which trusts the hardware, which is broken.

The result is that either we err on the side of caution and introduce cache contention in an operation that should be very fast and perfectly scalable even on systems that have reliable clocks, or we trust too much and our guarantees get violated on some fraction of systems.

With the environment variable we can offer a way out. We can make a default decision for a particular platform depending on how common broken hardware is and then give users that encounter the opposite case a way out.

Questions:

@rust-highfive
Collaborator

r? @joshtriplett

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 22, 2021
@the8472
Member Author

the8472 commented Apr 22, 2021

@rustbot label T-libs

@rustbot rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Apr 22, 2021
@joshtriplett
Member

I think the general approach is very reasonable.

As funny as I find "twice_a_day", I think this should use more mundane naming. I also think it needs RUST_ in the name.

I would suggest RUST_CLOCK_ASSUME_MONOTONIC=1 and RUST_CLOCK_ASSUME_MONOTONIC=0, with unset meaning the default automatic behavior.

…s and linux

the environment variable is read on first use and cached in an atomic
@the8472
Member Author

the8472 commented Apr 22, 2021

Sure, done.

@cuviper
Member

cuviper commented Apr 22, 2021

Are environment controls considered part of the standard library's stable API?

@joshtriplett
Member

@the8472 To answer your other question:

should this be documented or remain an undocumented bandaid we can offer to users if they run into reliability/performance problems?

I think we should absolutely document this. I also think we should very explicitly say "if you need to use RUST_CLOCK_ASSUME_MONOTONIC=0, please let us know so we can improve things to work more automatically in the future".

@the8472
Member Author

the8472 commented Apr 22, 2021

Ok. We still don't have stability markers for module documentation, so I'll add a "subject to change" note to the prose so it isn't considered insta-stable.

@Mark-Simulacrum
Member

I think the environment variable should perhaps be an opt-in to the "maybe buggy" behavior rather than the reverse. Otherwise I would expect it to be common advice to set it - most code won't notice the performance hit of the mutex, but would likely notice the possibility of panics and want to avoid that.

I think I'd feel differently if we expected this to be limited to non-tier-1 targets or similar, but from what I can tell this is likely to indefinitely be a problem for us without a - somewhat costly - solution. Perhaps that cost can be pushed up into the kernel, I'm not sure; unless we know that the function we're calling to get the timestamp is backed by a similar mutex (whether in kernel or other host environment) it seems like we should just make sure of correctness ourselves.

@the8472
Member Author

the8472 commented Apr 22, 2021

@nagisa
Member

nagisa commented Apr 23, 2021

Transplanting my concerns/comments from Zulip:

I think that people who deploy their software to any broken hardware will be inclined to add something like env::set_var inside their code. I know I would, because to me my code working across all hardware without having to worry about environment details is way more important than the throughput of the timing function call. While setting the environment variable from code is likely to work, the caching behaviour is also prone to be a major stability hazard. In particular, adding calls to read time anywhere in code would become a potentially breaking change. Any code of the form:

fn main() {
    didnt_check_time_before() // but now does
    env::set_var("RUST_CLOCK_ASSUME_MONOTONIC", "0");
    Instant::now() // this is supposed to set up the cache.
}

would observe a change in behaviour once didnt_check_time_before starts calling the time functions. This extends to the point where we'd have to stay mindful of this caveat during initialization before main is called. It could also be something as innocuous as a log statement.


EDIT I doubt that saying "this is meant for debugging and is not covered by stability guarantees" will prevent anybody from relying on this environment variable to make their code work on the sandybridges they may come across, to be honest.

@CDirkx
Contributor

CDirkx commented Apr 23, 2021

Do we have any data on how big of a performance impact the current workaround is?

@the8472
Member Author

the8472 commented Apr 23, 2021

I think the environment variable should perhaps be an opt-in to the "maybe buggy" behavior rather than the reverse. Otherwise I would expect it to be common advice to set it - most code won't notice the performance hit of the mutex, but would likely notice the possibility of panics and want to avoid that.

Down that road lie bad equilibria.

  • it extinguishes any signal we have, on all platforms, about the quality of clocks, making it much more difficult in the future to revert that policy
  • it duplicates efforts on other levels of the system (kernel, hardware, other languages)
  • it imposes a toll on every system to protect against a few broken ones, even as hardware improves
  • it leads to latency spikes (thread holding mutex getting preempted) on what should be a constant-time, low-overhead, contention-free operation

The current approach is a one-way ratchet where we mark some platform as non-monotonic when a user reports an issue and then never undo that because there is no mechanism to gauge the opposite case.

Ideally we would make actually_monotonic() { true } on nightly builds so people would give us more feedback, rather than less.

I think I'd feel differently if we expected this to be limited to non-tier-1 targets or similar, but from what I can tell this is likely to indefinitely be a problem for us without a - somewhat costly - solution. Perhaps that cost can be pushed up into the kernel, I'm not sure; unless we know that the function we're calling to get the timestamp is backed by a similar mutex (whether in kernel or other host environment) it seems like we should just make sure of correctness ourselves.

Yes, the API contract of CLOCK_MONOTONIC is that it is monotonic. Windows makes similar claims and yet we treat Windows as never monotonic. So we should treat this as an OS or hardware bug instead.

The issue is easy to paper over, and we should have an option to do that. But making it visible should also be an option, we shouldn't guarantee to hide platform bugs forever.

I think that people who deploy their software to any broken hardware will be inclined to add something like env::set_var inside their code. I know I would, because to me my code working across all hardware without having to worry about environment details is way more important than the throughput of the timing function call. While setting the environment variable from code is likely to work, the caching behaviour is also prone to be a major stability hazard. In particular, adding calls to read time anywhere in code would become a potentially breaking change. Any code of the form:

fn main() {
    didnt_check_time_before() // but now does
    env::set_var("RUST_CLOCK_ASSUME_MONOTONIC", "0");
    Instant::now() // this is supposed to set up the cache.
}

The problem is with the environment though, so the intent is to give people the option to mark specific environments as unreliable, not to make it part of the code.

EDIT I doubt that saying "this is meant for debugging and is not covered by stability guarantees" will prevent anybody from relying on this environment variable to make their code work on the sandybridges they may come across, to be honest.

Do we have anything like #[cfg(channel = "nightly")]? Stability annotations aren't applicable here since this isn't an API that we want to hide but behavior we would have to toggle.

I don't want to use an unstable function because that makes it more difficult for users of cargo-installed binaries.

I could also remove the documentation part referring to the environment variable and only leave the request to report bugs and we can then direct users to the variable when they have filed a bug.

@the8472
Member Author

the8472 commented Apr 23, 2021

Do we have any data on how big of a performance impact the current workaround is?

That depends on whether you're interested in throughput or worst case latency. It also depends on how many threads you have and how frequently they call Instant::now().

rdtsc itself is perfectly scalable; it always executes with about the same instruction throughput no matter how many cores are using it. The workaround, on the other hand, is the opposite of scalable: it is serializing.
It uses a Mutex, which means that if one thread is in the critical section and gets preempted, another thread can be stalled for however long it takes the scheduler to wake up the first thread, let it exit the critical section, and then wake up the second. In the uncontended case the mutex never leaves userspace and only requires two uncontended atomic operations to do its job.

The kernel can do a better workaround than perfect serialization because rdtsc actually returns a 64-bit value (instead of the 128-bit timespec), which can be monotonized with a load-max-cmpxchg sequence that can bail out early. This allows multiple threads to make progress, while anything working with a mutex or 128-bit values allows only one thread at a time to make progress.
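
The load-max-cmpxchg idea can be sketched as follows. This is a minimal illustration, not the kernel's actual code; the `LAST` static and the `monotonize` function are hypothetical names for this sketch:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Latest raw counter value observed by any thread (hypothetical global).
static LAST: AtomicU64 = AtomicU64::new(0);

/// Clamp a raw 64-bit counter reading so values handed out never go backwards.
fn monotonize(raw: u64) -> u64 {
    let mut seen = LAST.load(Ordering::Relaxed);
    loop {
        // Early bail-out: another thread already observed a value at least
        // as new, so we can return it without writing anything.
        if raw <= seen {
            return seen;
        }
        match LAST.compare_exchange_weak(seen, raw, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return raw,
            // Someone else advanced LAST concurrently; retry against their value.
            Err(newer) => seen = newer,
        }
    }
}
```

Unlike a mutex, a thread that loses the race here still makes progress, and a thread reading an already-superseded value never writes at all.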

My optimization in #83093 does not change the general picture for Unix systems; it only reduces the constant overhead, not the asymptotics. On Windows we can do better since it exposes a 64-bit counter value.

@workingjubilee workingjubilee added the A-time Area: Time label Apr 25, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 9, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 31, 2021
@nagisa
Member

nagisa commented Jun 21, 2021

The problem is with the environment though, so the intent is to give people the option to mark specific environments as unreliable, not to make it part of the code.

Addressing this particular comment now because I recently heard a relevant analogy (VMs are cattle, not pets) from my coworker with regards to cloud instances. Any organization dealing with more than a couple of instances in the cloud will want to ensure instances have no code specific to the particularities of the instances their applications run on. For that reason all sorts of details end up in the application code instead. This environment variable would very much end up there too, most likely set unconditionally.

I can see the same rationale applying to the applications that run on the user machines.


Do we have anything like #[cfg(channel = "nightly")]?

I don't believe so.

@the8472
Member Author

the8472 commented Jun 22, 2021

The problem is with the environment though, so the intent is to give people the option to mark specific environments as unreliable, not to make it part of the code.

Addressing this particular comment now because I recently heard a relevant analogy (VMs are cattle, not pets) from my coworker with regards to cloud instances. Any organization dealing with more than a couple of instances in the cloud will want to ensure instances have no code specific to address particularities of the specific instances their applications run on. For that reason all sorts of details end up in the application code instead. This environment variable would very much too, most likely unconditionally.

I do understand the desire, but I do not want to actively support it because it leads to bad places as already outlined in my previous comment. It's similar to protocol ossification or permanent technical debt that nobody wants to clean up.
It's the result of seeking an easy short-term fix without considering the long term. This is not a fix, but a bandaid and a means to relax the default to allow more systems to benefit from fast timers because those who have broken systems can opt into the workaround. Bandaids have to come off at some point, as people replace their hardware or update firmware or hypervisors. That's why it shouldn't be baked into code.

If anything this makes me think we need a less stable mechanism to make this available. Maybe call it RUST_1_53_CLOCK_ASSUME_MONOTONIC, deriving the string from the compiler version or edition or something? Then people have to reapply it occasionally, which provides incentives to get things fixed.

@crlf0710 crlf0710 added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 9, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 25, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 15, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 27, 2021
@joshtriplett joshtriplett added needs-fcp This change is insta-stable, so needs a completed FCP to proceed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 4, 2021
@the8472
Member Author

the8472 commented Oct 15, 2021

Since controlling this via environment variables seems contentious, here's another approach:

  1. change Instant::{elapsed, duration_since} to saturating
  2. deprecate Instant::saturating_duration_since
  3. remove all the backslide protection code

Instead of panicking, backslides would now saturate to zero. The downside is that duration_since would no longer catch the programming error where the start and end time of an interval are swapped; the developer either has to sanity-check the output or switch to checked_duration_since to catch it.
Another downside would be that backslides become observable via checked_duration_since, especially on systems that don't have a clock that guarantees monotonicity.
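
The proposed semantics can be emulated on stable today; `saturating_since` below is a hypothetical helper (not part of std) showing what option 1 would make duration_since do:

```rust
use std::time::{Duration, Instant};

// Hypothetical helper: what a saturating `duration_since` would return.
// A backslide (or swapped arguments) yields zero instead of panicking.
fn saturating_since(later: Instant, earlier: Instant) -> Duration {
    later.checked_duration_since(earlier).unwrap_or(Duration::ZERO)
}
```

With swapped arguments the result is simply Duration::ZERO, which is exactly the "programming error no longer caught" downside mentioned above.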

@cuviper
Member

cuviper commented Oct 15, 2021

I'm in favor of saturating (#88652 (comment)).

@joshtriplett
Member

I would be in favor of saturating by default as well.

@bors
Contributor

bors commented Oct 17, 2021

☔ The latest upstream changes (presumably #88652) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 17, 2021
Comment on lines +19 to +30
        let new = match env::var("RUST_CLOCK_ASSUME_MONOTONIC").as_deref() {
            Ok("1") => InstantReliability::AssumeMonotonic,
            Ok("0") => InstantReliability::AssumeBroken,
            Ok(_) => {
                eprintln!("unsupported value in RUST_CLOCK_ASSUME_MONOTONIC; using default");
                InstantReliability::Default
            }
            _ => InstantReliability::Default,
        };
        INSTANT_RELIABILITY.store(new as u8, Ordering::Relaxed);
        new
    } else {
Contributor

Sorry, just passing by - should this be a nested #[cold] function with the outer function marked #[inline]? Just to reduce the best-case cost of actually_monotonic

Contributor

although maybe this isn't relevant at all anymore? If so nevermind
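
The pattern the reviewer suggests is a common one: keep the hot path to a single relaxed load, and push the env lookup into a separate cold function. A sketch under the assumption that the cached state fits in an AtomicU8; the numeric encoding and function names here are hypothetical, not the PR's actual code:

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// 0 = not yet initialized; other values cache the parsed setting.
static INSTANT_RELIABILITY: AtomicU8 = AtomicU8::new(0);

const ASSUME_MONOTONIC: u8 = 1;
const ASSUME_BROKEN: u8 = 2;
const DEFAULT: u8 = 3;

// Hot path: one relaxed load and a branch, small enough to inline.
#[inline]
fn instant_reliability() -> u8 {
    match INSTANT_RELIABILITY.load(Ordering::Relaxed) {
        0 => init_instant_reliability(),
        cached => cached,
    }
}

// Cold path: performs the env lookup once and caches the result.
#[cold]
fn init_instant_reliability() -> u8 {
    let new = match std::env::var("RUST_CLOCK_ASSUME_MONOTONIC").as_deref() {
        Ok("1") => ASSUME_MONOTONIC,
        Ok("0") => ASSUME_BROKEN,
        _ => DEFAULT,
    };
    INSTANT_RELIABILITY.store(new, Ordering::Relaxed);
    new
}
```

Marking the init function #[cold] keeps it out of the inlined fast path, which is what the reviewer means by reducing the best-case cost of actually_monotonic.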

@camelid camelid added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 10, 2021
@the8472
Member Author

the8472 commented Jan 7, 2022

Closing in favor of #89926

@the8472 the8472 closed this Jan 7, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 13, 2022
…mulacrum

make `Instant::{duration_since, elapsed, sub}` saturating and remove workarounds

This removes all mutex/atomic-based workarounds for non-monotonic clocks and makes the previously panicking methods saturating instead. Additionally `saturating_duration_since` becomes deprecated since `duration_since` now fills that role.

Effectively this moves the fixup from `Instant` construction to the comparisons.

This has some observable effects, especially on platforms without monotonic clocks:

* Incorrectly ordered Instant comparisons no longer panic in release mode. This could hide some programming errors, but since debug mode still panics, tests can still catch them.
* `checked_duration_since` will now return `None` in more cases. Previously it only happened when one compared instants obtained in the wrong order or manually created ones. Now it also does on backslides.
* non-monotonic intervals will not be transitive, i.e. `b.duration_since(a) + c.duration_since(b) != c.duration_since(a)`
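
The non-transitivity point can be seen with made-up 64-bit readings and saturating subtraction; the numbers below are hypothetical, with `b` taken after `a` but backslid below it:

```rust
/// Saturating difference of two raw 64-bit readings: what the saturating
/// `duration_since` computes when the clock slides backwards.
fn sat(later: u64, earlier: u64) -> u64 {
    later.saturating_sub(earlier)
}

// Hypothetical readings (ns): a = 100, a backslide to b = 90, then c = 110.
// sat(b, a) = 0 and sat(c, b) = 20, but sat(c, a) = 10, so
// sat(b, a) + sat(c, b) != sat(c, a): the sub-intervals do not sum to the
// whole interval.
```

The saturated leg absorbs the backslide, so summing adjacent intervals overcounts relative to measuring the whole span at once.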

The upsides are reduced complexity and lower overhead of `Instant::now`.

## Motivation

Currently we must choose between two poisons. One is high worst-case latency and jitter of `Instant::now()` due to explicit synchronization; see rust-lang#83093 for benchmarks, the worst-case overhead is > 100x. The other is sporadic panics on specific, rare combinations of CPU/hypervisor/operating system due to platform bugs.

Use-cases where low-overhead, fine-grained timestamps are needed - such as syscall tracing, performance profiles or sensor data acquisition (drone flight controllers were mentioned in a libs meeting) in multi-threaded programs - are negatively impacted by the synchronization.

The panics are user-visible (program crashes), hard to reproduce and can be triggered by any dependency that might be using Instants for any reason.

A solution that is fast _and_ doesn't panic is desirable.

----

closes rust-lang#84448
closes rust-lang#86470
Labels
A-time Area: Time needs-fcp This change is insta-stable, so needs a completed FCP to proceed. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.