Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an interest cache for logs emitted through the log crate #1636

Merged
merged 6 commits into from
Oct 18, 2021

Conversation

koute
Copy link

@koute koute commented Oct 13, 2021

Motivation

We use tracing as our logger in substrate. We've noticed that as soon as any trace log is enabled (even one which doesn't exists) the whole logging machinery starts to take a lot of time, even if nothing at all is actually printed!

In one of our quick-and-dirty reproduction benchmarks (JIT-ing a WASM program) we saw the total real runtime rise from around ~1.3s to ~7s just by adding a trace log filter which doesn't match anything. (Depending on the hardware and on how many threads are simultaneously logging this figure varies pretty widely, but it's always a very significant drop.)

After looking into this problem I've found that the culprit of the slowdown were trace! and debug! logs sprinkled quite liberally in some of the more hot codepaths. When there are no trace-level filters defined on the logger it can basically reject those inert trace! and debug! logs purely based on the current maximum logging level (which is cheap!), but as soon as you define any trace filter the current maximum logging changes, and then every trace! and debug! log has to go through the whole filtering machinery before it can be rejected. While this is cheap if you only do it once, it starts to become very expensive when you do it a lot, especially when you're running multiple threads and enable log reloading. (This is related to #1632.)

Solution

I've added an opt-in per-thread LRU cache which tries to cache whenever the logger is actually interested in a given target + level pair for every log emitted through the log crate.

I've also added a benchmark very roughly replicating the situation from our code; here's the performance without the cache: (cargo bench)

[838.67 ns 846.51 ns 854.04 ns]

And here's the performance with the cache: (cargo bench --features interest-cache)

[25.322 ns 25.556 ns 25.820 ns]

As you can see the per-call cost was cut down to less than ~3%.

@koute koute requested review from hawkw and a team as code owners October 13, 2021 13:09
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you for working on it --- caching interests for log callsites is something I've wanted to be able to do for a while, but haven't had the time to work on.

I'd like to give this PR a more thorough reading soon. However, the first thing I noticed is that I believe we should be able to implement this without having to make a change in the tracing-core crate. If it's at all possible to avoid the tracing-core change, this would be preferable. I left a comment describing how I think we can do this.

Comment on lines 136 to 139
#[doc(hidden)]
pub fn _interest_cache_epoch() -> usize {
EPOCH.load(Ordering::SeqCst)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of adding this in tracing-core, since it's a new doc(hidden) API that's used only by tracing-log.

I think there's a way to avoid having to change tracing-core, and detect the number of times the interest cache has been reloaded purely in tracing-log. We can do this by adding a new dummy type implementing the tracing_core::Callsite trait, and registering it with the subscriber the first time a log event is recorded. Every time the interest cache is rebuilt, the dummy callsite's [Callsite::set_interest] method will be called again. In that method, we can increment the epoch counter.

This should allow us to determine the interest cache's current epoch solely in the tracing-log crate, without adding doc(hidden) code in tracing-core.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Please let me know if there's anything else you'd like me to change.

tracing-log/src/interest_cache.rs Show resolved Hide resolved
Comment on lines 180 to 185
#[cfg(feature = "interest-cache")]
mod interest_cache;

#[cfg(feature = "interest-cache")]
#[cfg_attr(docsrs, doc(cfg(feature = "interest-cache")))]
pub use crate::interest_cache::InterestCacheConfig;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the interest cache code also uses std APIs that won't work with #![no_std] (such as thread_local! and Mutex), it should probably also require the std feature flag to be enabled. Otherwise, I believe the code just won't compile:

Suggested change
#[cfg(feature = "interest-cache")]
mod interest_cache;
#[cfg(feature = "interest-cache")]
#[cfg_attr(docsrs, doc(cfg(feature = "interest-cache")))]
pub use crate::interest_cache::InterestCacheConfig;
#[cfg(all(feature = "interest-cache", feature = "std"))]
mod interest_cache;
#[cfg(all(feature = "interest-cache", feature = "std"))]
#[cfg_attr(docsrs, doc(cfg(all(feature = "interest-cache", feature = "std"))))]
pub use crate::interest_cache::InterestCacheConfig;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... this is a fair point, however please note that the crate currently doesn't compile on no_std targets anyway due to the dependencies:

$ cargo check --target=aarch64-unknown-none
    Checking lazy_static v1.4.0
    Checking cfg-if v1.0.0
   Compiling log v0.4.14
error[E0463]: can't find crate for `std`
 --> /home/kou/.cargo/registry/src/github.com-1ecc6299db9ec823/lazy_static-1.4.0/src/inline_lazy.rs:9:1
  |
9 | extern crate std;
  | ^^^^^^^^^^^^^^^^^ can't find crate
  |
  = note: the `aarch64-unknown-none` target may not support the standard library
  = help: consider building the standard library from source with `cargo build -Zbuild-std`

error: aborting due to previous error

(And yes, lazy_static does technically support no_std, but it needs an extra feature to be enabled, so it works unlike everything else in the ecosystem where the no_std needs an extra [usually default] feature to be disabled. So for this to compile the user would have to disable default features in tracing-log and explicitly include lazy_static in their deps and enable the lazy_static's spin_no_std feature even if they don't use lazy_static themselves.)

And since the interest-cache also pulls in extra dependencies (ahash and lru) even if we'll do a #[cfg(all(feature = "interest-cache", feature = "std"))] it will still not compile. Ideally we'd also only pull in those extra deps when both std and interest-cache are enabled, but unfortunately cargo doesn't directly support that.

So I guess this is a good idea as far as documenting that it requires std (through the cfg_attr), but from practical point of view the compilation will still fail on a dependency if only interest-cache will be enabled without the std.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this is out of scope of this PR, but just a side note: maybe it'd be a good idea to remove the lazy_static dependency altogether in lib.rs. A simple atomics-based spinlock is short and easy to write, and could be used to replace it. Or we could wait for a few Rust releases (as panicking in const fns was recently stabilized on nightly) and make the *_FIELDS all const initialized. Or add a private API to tracing-core to const initialize them now (although since that's not very elegant you'd probably not want to do that?).

Copy link
Member

@hawkw hawkw Oct 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... this is a fair point, however please note that the crate currently doesn't compile on no_std targets anyway due to the dependencies:

hmm, are you compiling tracing-log with default-features = false? the log crate's std feature is enabled with default features enabled?

oh, never mind, i misread your comment. yeah, i think that we should probably just add the cfg_attr attribute to document that the interest-cache feature also requires std, but we don't need to actually add the feature flag requirements.

tracing-log/src/log_tracer.rs Outdated Show resolved Hide resolved
tracing-log/src/log_tracer.rs Outdated Show resolved Hide resolved
tracing-log/src/log_tracer.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
@koute
Copy link
Author

koute commented Oct 15, 2021

@hawkw Thank you for the review!

I think I've aligned the code to all of your comments. I've also tweaked how we cache the interest to make it less likely for there to be collisions (basically to make it even less likely that the cache will be spuriously updated).

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this looks good to me --- the additional collision resistance seems good and makes me a bit more confident that we'll probably not be caching wrong results very often.

I had a few minor documentation suggestions, and some last thoughts --- let me know what you think? But, overall, this looks about ready to merge.

Thanks @koute for all your hard work on this PR!

Comment on lines +31 to +32
/// It should be set to the lowest verbosity level for which the majority
/// of the logs in your application are usually *disabled*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be worth noting explicitly that this should not be the max level (as passed to log::set_max_level or returned by tracing::Subscriber::max_level_hint), since if it is set to the max enabled level, the cache will never be used...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I'm not exactly sure how to best phrase this.... but, if we say that "this should not be equal or higher than the max level" or something along those lines then it seems like it suggests to the user that they should change this value based on the the maximum level, but what would be the point of them doing that? It should be completely harmless even if the cache's minimum verbosity is equal or higher than the global max level - yes, the cache never gets used due to the high verbosity logs being filtered out beforehand, but that seems pretty natural to me.

Considering how it works I think the value given here shouldn't be tweaked at runtime, but essentially hardcoded based on the application's logging patterns and the default maximum level, which is why it's set to Debug by default since usually only Info logs are printed out-of-box, and usually there are a lot of debug! and trace! logs since people are a lot more liberal with them precisely because they're disabled by default.

tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Outdated Show resolved Hide resolved
tracing-log/src/interest_cache.rs Show resolved Hide resolved
tracing-log/src/interest_cache.rs Show resolved Hide resolved
Comment on lines +222 to +224
// Realistically we should never land here, unless someone is using a non-static
// target string with the same length and level, or is very lucky and found a hash
// collision for the cache's key.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this maybe be a debug assertion? we definitely don't want to panic in release mode, but we could maybe have

Suggested change
// Realistically we should never land here, unless someone is using a non-static
// target string with the same length and level, or is very lucky and found a hash
// collision for the cache's key.
// Realistically we should never land here, unless someone is using a non-static
// target string with the same length and level, or is very lucky and found a hash
// collision for the cache's key.
debug_assert!(
false,
"an interest cache key collision was detected! if you are using non-`'static` \
target strings for `log` records, interest caching should probably not be used."
);

or something like that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "realistically we should never land here" I mostly meant that in normal circumstances with normal usage patterns this should never happen, but in theory it can happen (and the test which I've added in the last commit does in fact trigger this case) and it is not an error; at most this will just result in extra cache churn and call the filter again. So I don't think adding an assertion here would be correct, since this isn't actually an invariant which has to be true.

We could put out a warning somehow, but I don't think that's worth it due to how rare this should be in practice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, that makes sense to me. thanks!

/// [target]: log::Metadata::target
#[cfg(all(feature = "interest-cache", feature = "std"))]
#[cfg_attr(docsrs, doc(cfg(all(feature = "interest-cache", feature = "std"))))]
pub fn with_interest_cache(mut self, config: Option<crate::InterestCacheConfig>) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somewhat on the fence over whether it's necessary for this to take an Option to also allow it to disable interest caching...it seems pretty unlikely for anyone to take a builder with interest caching enabled and need to explicitly disable it, because it's off by default. so, if it's ever enabled, it's because the user called this method previously.

so, this API might be slightly more ergonomic if we just took an InterestCacheConfig so the user didn't have to wrap it in an Option...but, on the other hand, having a with_interest_cache and a separate without_interest_cache method for disabling it seems worse. so, maybe this is the best approach!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... you're definitely right that it's unlikely someone would call it with None since it's disabled by default anyway, so this only complicates the API surface for not very much gain.

How about we remove the Option and just treat with_lru_cache_size(0) as it being disabled? That seems pretty natural to me since logically setting the size of the cache to zero should disable it. (This might be a little worse for discoverability than an Option, but we can always document it. Seems like a worthwhile tradeoff.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just treat with_lru_cache_size(0) as it being disabled

Why not remove the possibility to disable it at all? If it is disabled by default, why should you be able to pass something that just means "disable" again?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just treat with_lru_cache_size(0) as it being disabled

Why not remove the possibility to disable it at all? If it is disabled by default, why should you be able to pass something that just means "disable" again?

Yeah, that's essentially what I'm suggesting by just officially treating the with_lru_cache_size(0) as being "disabled", since that's the natural consequence of configuring a cache which is zero sized (that is - a cache which can't hold anything can't actually cache anything, so it's essentially disabled).

Comment on lines 181 to 185
#[cfg(all(feature = "interest-cache", feature = "std"))]
mod interest_cache;

#[cfg(all(feature = "interest-cache", feature = "std"))]
#[cfg_attr(docsrs, doc(cfg(all(feature = "interest-cache", feature = "std"))))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it occurs to me just now that this should probably also require the log-tracer feature, since it's only ever used by the LogTracer type?

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, this looks great to me, let's move forwards with it. we can always improve the docs more in follow-ups!

Comment on lines +32 to +34
}

impl InterestCacheConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: it would be fine to put these in a single impl block:

Suggested change
}
impl InterestCacheConfig {

@hawkw hawkw merged commit 4e5f0f0 into tokio-rs:v0.1.x Oct 18, 2021
hawkw added a commit that referenced this pull request Apr 21, 2022
# 0.1.3 (April 21st, 2022)

### Added

- **log-tracer**: Added `LogTracer::with_interest_cache` to enable a
 limited form of per-record `Interest` caching for `log` records
 ([#1636])

### Changed

- Updated minimum supported Rust version (MSRV) to Rust 1.49.0 ([#1913])

### Fixed

- **log-tracer**: Fixed `LogTracer` not honoring `tracing` max level
  filters ([#1543])
- Broken links in documentation ([#2068], [#2077])

Thanks to @Millione, @teozkr, @koute, @Folyd, and @ben0x539 for
contributing to this release!

[#1636]: https://github.com/tokio-rs/tracing/pulls/1636
[#1913]: https://github.com/tokio-rs/tracing/pulls/1913
[#1543]: https://github.com/tokio-rs/tracing/pulls/1543
[#2068]: https://github.com/tokio-rs/tracing/pulls/2068
[#2077]: https://github.com/tokio-rs/tracing/pulls/2077
hawkw added a commit that referenced this pull request Apr 21, 2022
# 0.1.3 (April 21st, 2022)

### Added

- **log-tracer**: Added `LogTracer::with_interest_cache` to enable a
 limited form of per-record `Interest` caching for `log` records
 ([#1636])

### Changed

- Updated minimum supported Rust version (MSRV) to Rust 1.49.0 ([#1913])

### Fixed

- **log-tracer**: Fixed `LogTracer` not honoring `tracing` max level
  filters ([#1543])
- Broken links in documentation ([#2068], [#2077])

Thanks to @Millione, @teozkr, @koute, @Folyd, and @ben0x539 for
contributing to this release!

[#1636]: https://github.com/tokio-rs/tracing/pulls/1636
[#1913]: https://github.com/tokio-rs/tracing/pulls/1913
[#1543]: https://github.com/tokio-rs/tracing/pulls/1543
[#2068]: https://github.com/tokio-rs/tracing/pulls/2068
[#2077]: https://github.com/tokio-rs/tracing/pulls/2077

Fixes #1884
Fixes #1664
kaffarell pushed a commit to kaffarell/tracing that referenced this pull request May 22, 2024
…okio-rs#1636)

## Motivation

We use `tracing` as our logger in
[`substrate`](https://github.com/paritytech/substrate). We've noticed
that as soon as *any* `trace` log is enabled (even one which doesn't
exists) the whole logging machinery starts to take a lot of time, even
if nothing at all is actually printed!

In one of our quick-and-dirty reproduction benchmarks (JIT-ing a WASM
program) we saw the total real runtime rise from around ~1.3s to ~7s
just by adding a `trace` log filter which doesn't match anything.
(Depending on the hardware and on how many threads are simultaneously
logging this figure varies pretty widely, but it's always a very
significant drop.)

After looking into this problem I've found that the culprit of the
slowdown were `trace!` and `debug!` logs sprinkled quite liberally in
some of the more hot codepaths. When there are no `trace`-level filters
defined on the logger it can basically reject those inert `trace!` and
`debug!` logs purely based on the current maximum logging level (which
is cheap!), but as soon as you define *any* trace filter the current
maximum logging changes, and then every `trace!` and `debug!` log has to
go through the whole filtering machinery before it can be rejected.
While this is cheap if you only do it once, it starts to become very
expensive when you do it a lot, especially when you're running multiple
threads and enable log reloading. (This is related to
tokio-rs#1632.)

## Solution

I've added an opt-in per-thread LRU cache which tries to cache whenever
the logger is actually interested in a given `target` + `level` pair for
every log emitted through the `log` crate.

I've also added a benchmark very roughly replicating the situation from
our code; here's the performance *without* the cache: (`cargo bench`)

```
[838.67 ns 846.51 ns 854.04 ns]
```

And here's the performance *with* the cache: (`cargo bench --features
interest-cache`)

```
[25.322 ns 25.556 ns 25.820 ns]
```

As you can see the per-call cost was cut down to less than ~3%.
kaffarell pushed a commit to kaffarell/tracing that referenced this pull request May 22, 2024
# 0.1.3 (April 21st, 2022)

### Added

- **log-tracer**: Added `LogTracer::with_interest_cache` to enable a
 limited form of per-record `Interest` caching for `log` records
 ([tokio-rs#1636])

### Changed

- Updated minimum supported Rust version (MSRV) to Rust 1.49.0 ([tokio-rs#1913])

### Fixed

- **log-tracer**: Fixed `LogTracer` not honoring `tracing` max level
  filters ([tokio-rs#1543])
- Broken links in documentation ([tokio-rs#2068], [tokio-rs#2077])

Thanks to @Millione, @teozkr, @koute, @Folyd, and @ben0x539 for
contributing to this release!

[tokio-rs#1636]: https://github.com/tokio-rs/tracing/pulls/1636
[tokio-rs#1913]: https://github.com/tokio-rs/tracing/pulls/1913
[tokio-rs#1543]: https://github.com/tokio-rs/tracing/pulls/1543
[tokio-rs#2068]: https://github.com/tokio-rs/tracing/pulls/2068
[tokio-rs#2077]: https://github.com/tokio-rs/tracing/pulls/2077

Fixes tokio-rs#1884
Fixes tokio-rs#1664
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants