SpyMetricSink and Mutex poisoning #124

parasyte · 2021-03-12T01:13:50Z

We ran into a situation where running unit tests with SpyMetricSink/BufferedSpyMetricsSink would poison the mutex when assertions failed during tests. This leads to spooky action at a distance with these types unwrapping the mutex lock result and panicking in a different code path.

Using parking_lot::Mutex can avoid poisoning.

The text was updated successfully, but these errors were encountered:

56quarters · 2021-03-12T02:52:20Z

Oof, yeah I ran into that when trying to write tests for the sinks themselves. Thanks for reporting it. Since the Mutex is part of the public API of the sinks I'd like to try to avoid changing it or adding a dependency on another crate. I'll spend some time thinking about this.

Do you have a minimal test case to reproduce this? I'm wondering if this can be worked around with some copying of the contents of the buffer and extra documentation to avoid holding the lock while any assertions are made.

parasyte · 2021-03-12T03:25:11Z

Yes, it can also be avoided with copying. One of the things I wanted to do initially was to avoid copying, and the mutex should (in theory) provide that capability. Another design could use a channel (copying).

We can always change the code on our side that holds a MutexGuard to return a boolean instead of asserting while the lock is held. So we are not blocked or anything. This was more to raise awareness and begin a discussion. Panics can be tricky, since they are not always immediately obvious and can happen at any stack depth.

I'm sure I could come up with a minimal test case for this...

56quarters · 2021-03-12T19:44:21Z

Another design could use a channel (copying)

Ooh, that's a neat idea for changing the sink. It would be a breaking change but it seems like a much cleaner API and it can use crossbeam_channel which is already a dependency.

Roughly something like:

let (rx, sink) = SpyMetricSink::new();
let client = StatsdClient::from_sink("prefix", sink);
// etc...

let metric1 = rx.recv().unwrap();
let metric2 = rx.recv().unwrap();

assert_eq!("prefix.my.metric:4|c", metric1);
assert_eq!("prefix.other.metric:42|c", metric2);

WDYT?

parasyte · 2021-03-12T22:23:39Z

It needs a contingency plan for back pressure (what happens when the reader isn't consuming messages fast enough?) but yeah, channels are a great way to separate concerns like this.

This is just a FWIW, but the tokio::sync::broadcast channel handles back pressure by dropping messages and the receiver gets an Error value indicating that fact. I don't know what the best design is, here. But UDP is already lossy and channel pressure would be another way to lose metrics without blocking the critical path. (If that makes sense.)

56quarters · 2021-03-12T22:54:23Z

👍 The channels from crossbeam_channel allow you to set a max size after which writes will return an error. Easy enough to let callers set the max size of the buffer used by the channel (same as QueueingMetricSink does).

Rewrite `SpyMetricSink` and `BufferedSpyMetricSink` to use channels to share metrics emited with callers instead of a shared `Write` impl protected by a `Mutex`. This avoids poisoning the mutex if a test assertion fails while the mutex is held (which is common for something meant to be used for testing). Examples of how tests can be written against the new spy sinks can be found in the cadence/examples/spy-sink.rs example as well as the integration tests for cadence-macros. Fixes #124

Rewrite `SpyMetricSink` and `BufferedSpyMetricSink` to use channels to share metrics emitted with callers instead of a shared `Write` impl protected by a `Mutex`. This avoids poisoning the mutex if a test assertion fails while the mutex is held (which is common for something meant to be used for testing). Examples of how tests can be written against the new spy sinks can be found in the cadence/examples/spy-sink.rs example as well as the integration tests for cadence-macros. Fixes #124

56quarters mentioned this issue Mar 17, 2021

Rewrite spy metric sinks to use channels instead of shared writers #128

Merged

56quarters closed this as completed in #128 Mar 18, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SpyMetricSink and Mutex poisoning #124

SpyMetricSink and Mutex poisoning #124

parasyte commented Mar 12, 2021

56quarters commented Mar 12, 2021 •

edited

Loading

parasyte commented Mar 12, 2021

56quarters commented Mar 12, 2021 •

edited

Loading

parasyte commented Mar 12, 2021

56quarters commented Mar 12, 2021 •

edited

Loading

SpyMetricSink and Mutex poisoning #124

SpyMetricSink and Mutex poisoning #124

Comments

parasyte commented Mar 12, 2021

56quarters commented Mar 12, 2021 • edited Loading

parasyte commented Mar 12, 2021

56quarters commented Mar 12, 2021 • edited Loading

parasyte commented Mar 12, 2021

56quarters commented Mar 12, 2021 • edited Loading

56quarters commented Mar 12, 2021 •

edited

Loading

56quarters commented Mar 12, 2021 •

edited

Loading

56quarters commented Mar 12, 2021 •

edited

Loading