Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add matcher combinators #118

Merged
merged 7 commits into from
Oct 16, 2023
Merged

Add matcher combinators #118

merged 7 commits into from
Oct 16, 2023

Conversation

9999years
Copy link
Member

@9999years 9999years commented Oct 2, 2023

These matcher combinators will let us express more complex logging concepts, like:

  • Match these two events in any order.
  • Match either of these two events.
  • Match either of these two events, as long as this other event doesn't match first.

There's also a system of checkpoints where log events are read into the default checkpoint. At any time, you can create another checkpoint with GhciWatch::checkpoint. Then, log events will be read into the new checkpoint. Later, you can assert that messages were logged in a particular checkpoint or range of checkpoints.

@9999years 9999years mentioned this pull request Oct 2, 2023
@9999years
Copy link
Member Author

9999years commented Oct 2, 2023

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@9999years 9999years force-pushed the rebeccat/matcher branch 2 times, most recently from a645084 to 5a79976 Compare October 3, 2023 20:17
test-harness/src/checkpoint.rs Outdated Show resolved Hide resolved
/// Checkpoints can be constructed with [`crate::GhcidNg::first_checkpoint`],
/// [`crate::GhcidNg::current_checkpoint`], and [`crate::GhcidNg::checkpoint`].
#[derive(Debug, Clone, Copy)]
pub struct Checkpoint(pub(crate) usize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally feel like this Checkpoint "newtype" doesn't buy you much over just using usize directly and the Checkpoint type adds a lot of complexity here. Most of the CheckpointIndex impls don't really do anything other than coerce Checkpoint to usize

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bounds-checked index, so it's guaranteed to be safe to use. If we used a usize the user could pass in garbage and cause the test to panic. Probably not that bad, but we do get a safety guarantee out of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually, the thing it does give us is a SliceIndex implementation with slice output for single checkpoints. This lets us treat single checkpoints like 1 and ranges like 1..3 or 2.. the same. The standard SliceIndex implementations instead return a T for a usize and a [T] for ranges of usize.

So we'd need to keep the CheckpointIndex trait around, even if it was just implemented for usize and ranges of usize.

@9999years 9999years force-pushed the rebeccat/matcher branch 4 times, most recently from 749d13b to 9767a25 Compare October 6, 2023 17:30
evanrelf
evanrelf previously approved these changes Oct 16, 2023
Copy link
Member

@evanrelf evanrelf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some thoughts, but nothing sticks out as incorrect or anything, so not blocking a merge.

Comment on lines +332 to +337
// Otherwise, wait for a log message.
match tokio::time::timeout(timeout_duration, async {
loop {
match self.tracing_reader.next_event().await {
Err(err) => {
return Err(err);
}
Ok(event) => {
if matcher.matches(&event) {
return Ok(event);
} else if let Some(negative_matcher) = &negative_matcher {
if negative_matcher.matches(&event) {
return Err(miette!("Found a log event matching {negative_matcher}"));
}
}
}
let event = self.read_event().await?;
if matcher.matches(event)? {
return Ok(event.clone());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(struggled to select the exact region I want in GitHub, sorry)

This block reads really strangely to me. I don't think the Ok(Ok(event)) or Ok(Err(err)) match arms are reachable. You'll either return early from the whole function, or you'll get Err(_) from the timeout. Otherwise you're just looping infinitely.

I would either:

  1. Change the current match the_block to if the_block.is_err()
  2. break from the loop with the matching event value instead of returning from the function

I don't think your current code is incorrect, it's just weird lol.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is a horrible quirk of Rust's async/await compiling down into state machines: an async block like the one used here is like a separate function, so the return on line 337 only returns from the async block passed to tokio::time::timeout.

  • The Ok(Ok(event)) branch is reached from the async block return on line 337.
  • The Ok(Err(_)) branch is reached from the try (?) expressions on lines 335-336.
  • The Err(_) branch is reached if the timeout expires.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense! I knew about the async/await state machine stuff, but I didn't realize return was "scoped" to an async block. TIL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of evil because async is the only block scoped this way. There's been a proposal for try {} blocks that behave similarly since at least 2016 but there's some issues with type inference that have kept it from getting merged.

test-harness/src/ghciwatch.rs Outdated Show resolved Hide resolved
Comment on lines +64 to +70
assert!(!matcher.matches(&event).unwrap());
assert!(matcher
.matches(&Event {
message: "doggy".to_owned(),
..event
})
.unwrap());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels strange to me that matches is mutating the matcher. I feel like consuming events and mutating state should be separate from checking whether the matcher is satisfied yet.

Something like this (dumb names aside):

matcher.consume(&event).unwrap();
assert!(!matcher.is_satisfied());

matcher.consume(&Event { message: "doggy".to_owned(), ..event }).unwrap();
assert!(matcher.is_satisfied());

Where consume takes a &mut self and is_satisfied takes a &self.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. My thought process is that the only thing that can update whether the matcher is satisfied is feeding it an event. Splitting it up like this would also mean storing some extra state, like FusedMatcher does.

This is a parallel to the Iterator::next method:

Returns None when iteration is finished. Individual iterator implementations may choose to resume iteration, and so calling next() again may or may not eventually start returning Some(Item) again at some point.

To compensate for this, there's a FusedIterator trait:

An iterator that always continues to yield None when exhausted.

Calling next on a fused iterator that has returned None once is guaranteed to return None again. This trait should be implemented by all iterators that behave this way because it allows optimizing Iterator::fuse().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm thinking of matchers like parsers in Haskell, where given some input, I get back a parse result and maybe some leftovers or whatever.

More of a pure function from some input to a result. And then the idea of incremental input consumption would just be an optimization or ergonomic convenience, where the thing remembers things you've given it before because it has internal mutable state, so you don't need to give it everything at once.

But I'm not that familiar with parsers in imperative languages, and I'm struggling to see the connection to Rust's iterators here (mayyybe it feels kinda like a peekable iterator?), so I might just be totally off base here 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling to see the connection to Rust's iterators here (mayyybe it feels kinda like a peekable iterator?)

You're correct, Peekable is similar, except instead of storing if the iterator has ever returned None, it stores the iterator's next item.

For some Matchers (like the BaseMatcher), the matching is stateless and pure; if we wanted to have separate consume and is_satisfied methods, BaseMatcher would need to store an is_satisfied: boolean field to remember if it has already matched. FusedMatcher behaves like this, storing an extra matched: boolean field and returning Ok(true) from Matcher::matches unconditionally if matched is true.

Similarly, Iterator's next() method mutates the iterator's state to (possibly) provide a next element. Because Iterator doesn't have an is_finished method, returning None from next() once doesn't mean it will never return Some again (e.g. for an Iterator reading lines from a file, while some other process writes data to the file). So std provides a FusedIterator that stores an extra bit of data: has the underlying iterator returned None yet? If it has, Iterator::next always returns None, and otherwise it calls the underlying method.

It's the same principle, adding an extra bit of state to make another guarantee on the API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underlying reason for Peekable is the same, too -- sometimes it's easy to check if there's a next element in advance, sometimes it's not, but it's always easy to take an unpeekable iterator, call next on it, and memoize the return value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, I see how it's similar to Iterator now, thanks for explaining.

Comment on lines +8 to +11
/// A matcher which may or may not contain a matcher.
///
/// If it does not contain a matcher, it never matches.
pub struct OptionMatcher<M>(Option<M>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you instead impl<M: Matcher> Matcher for Option<M> or something like that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not without writing my own Display replacement/wrapper trait at least:

error[E0277]: `std::option::Option<M>` doesn't implement `std::fmt::Display`
  --> test-harness/src/matcher/option_matcher.rs:13:30
   |
13 | impl<M: Matcher> Matcher for Option<M> {}
   |                              ^^^^^^^^^ `std::option::Option<M>` cannot be formatted with the default formatter
   |
   = help: the trait `std::fmt::Display` is not implemented for `std::option::Option<M>`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `Matcher`
  --> test-harness/src/matcher/mod.rs:30:20
   |
30 | pub trait Matcher: Display {
   |                    ^^^^^^^ required by this bound in `Matcher`

@9999years 9999years merged commit 8cc6a70 into main Oct 16, 2023
28 checks passed
@9999years 9999years deleted the rebeccat/matcher branch October 16, 2023 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants