Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SpyMetricSink and Mutex poisoning #124

Closed
parasyte opened this issue Mar 12, 2021 · 5 comments · Fixed by #128
Closed

SpyMetricSink and Mutex poisoning #124

parasyte opened this issue Mar 12, 2021 · 5 comments · Fixed by #128

Comments

@parasyte
Copy link

We ran into a situation where running unit tests with SpyMetricSink/BufferedSpyMetricsSink would poison the mutex when assertions failed during tests. This leads to spooky action at a distance with these types unwrapping the mutex lock result and panicking in a different code path.

Using parking_lot::Mutex can avoid poisoning.

@56quarters
Copy link
Owner

56quarters commented Mar 12, 2021

Oof, yeah I ran into that when trying to write tests for the sinks themselves. Thanks for reporting it. Since the Mutex is part of the public API of the sinks I'd like to try to avoid changing it or adding a dependency on another crate. I'll spend some time thinking about this.

Do you have a minimal test case to reproduce this? I'm wondering if this can be worked around with some copying of the contents of the buffer and extra documentation to avoid holding the lock while any assertions are made.

@parasyte
Copy link
Author

Yes, it can also be avoided with copying. One of the things I wanted to do initially was to avoid copying, and the mutex should (in theory) provide that capability. Another design could use a channel (copying).

We can always change the code on our side that holds a MutexGuard to return a boolean instead of asserting while the lock is held. So we are not blocked or anything. This was more to raise awareness and begin a discussion. Panics can be tricky, since they are not always immediately obvious and can happen at any stack depth.

I'm sure I could come up with a minimal test case for this...

@56quarters
Copy link
Owner

56quarters commented Mar 12, 2021

Another design could use a channel (copying)

Ooh, that's a neat idea for changing the sink. It would be a breaking change but it seems like a much cleaner API and it can use crossbeam_channel which is already a dependency.

Roughly something like:

let (rx, sink) = SpyMetricSink::new();
let client = StatsdClient::from_sink("prefix", sink);
// etc...

let metric1 = rx.recv().unwrap();
let metric2 = rx.recv().unwrap();

assert_eq!("prefix.my.metric:4|c", metric1);
assert_eq!("prefix.other.metric:42|c", metric2);

WDYT?

@parasyte
Copy link
Author

It needs a contingency plan for back pressure (what happens when the reader isn't consuming messages fast enough?) but yeah, channels are a great way to separate concerns like this.

This is just a FWIW, but the tokio::sync::broadcast channel handles back pressure by dropping messages and the receiver gets an Error value indicating that fact. I don't know what the best design is, here. But UDP is already lossy and channel pressure would be another way to lose metrics without blocking the critical path. (If that makes sense.)

@56quarters
Copy link
Owner

56quarters commented Mar 12, 2021

👍 The channels from crossbeam_channel allow you to set a max size after which writes will return an error. Easy enough to let callers set the max size of the buffer used by the channel (same as QueueingMetricSink does).

56quarters added a commit that referenced this issue Mar 16, 2021
Rewrite `SpyMetricSink` and `BufferedSpyMetricSink` to use channels to
share metrics emited with callers instead of a shared `Write` impl
protected by a `Mutex`. This avoids poisoning the mutex if a test
assertion fails while the mutex is held (which is common for something
meant to be used for testing).

Examples of how tests can be written against the new spy sinks can
be found in the cadence/examples/spy-sink.rs example as well as the
integration tests for cadence-macros.

Fixes #124
56quarters added a commit that referenced this issue Mar 17, 2021
Rewrite `SpyMetricSink` and `BufferedSpyMetricSink` to use channels to
share metrics emited with callers instead of a shared `Write` impl
protected by a `Mutex`. This avoids poisoning the mutex if a test
assertion fails while the mutex is held (which is common for something
meant to be used for testing).

Examples of how tests can be written against the new spy sinks can
be found in the cadence/examples/spy-sink.rs example as well as the
integration tests for cadence-macros.

Fixes #124
56quarters added a commit that referenced this issue Mar 17, 2021
Rewrite `SpyMetricSink` and `BufferedSpyMetricSink` to use channels to
share metrics emited with callers instead of a shared `Write` impl
protected by a `Mutex`. This avoids poisoning the mutex if a test
assertion fails while the mutex is held (which is common for something
meant to be used for testing).

Examples of how tests can be written against the new spy sinks can
be found in the cadence/examples/spy-sink.rs example as well as the
integration tests for cadence-macros.

Fixes #124
56quarters added a commit that referenced this issue Mar 17, 2021
Rewrite `SpyMetricSink` and `BufferedSpyMetricSink` to use channels to
share metrics emitted with callers instead of a shared `Write` impl
protected by a `Mutex`. This avoids poisoning the mutex if a test
assertion fails while the mutex is held (which is common for something
meant to be used for testing).

Examples of how tests can be written against the new spy sinks can
be found in the cadence/examples/spy-sink.rs example as well as the
integration tests for cadence-macros.

Fixes #124
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants