Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing fix #128

Merged
merged 11 commits into from
Jan 31, 2024
12 changes: 12 additions & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,17 @@ pub struct Config {
/// may miss bugs
/// 2. [`lazy_static` values are dropped](mod@crate::lazy_static) at the end of an execution
pub silence_warnings: bool,

/// Whether to call the `Span::record()` method to update the step count (`i`) of the `Span`
sarsko marked this conversation as resolved.
Show resolved Hide resolved
/// containing the `TaskId` and the current step count for the given `TaskId`.
/// The reason this is a config option is that the most popular tracing `Subscriber`s, ie
/// `tracing_subscriber::fmt`, appends to the span on calls to `record()` (instead of
/// overwriting), which results in traces which are hard to read if the task is scheduled more
/// than a few times.
/// Thus: set `record_steps_in_span` to `true` if you want this behaviour, or if you are using
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this behavior you mean append behavior right? Could you make that clear?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: do we really need to provide this option? What other format subscribers are there for which we'd want to set this to true?

Copy link
Contributor Author

@sarsko sarsko Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this behavior you mean append behavior right? Could you make that clear?

By this_behaviour I mean that we call record() on the step_span. In tracing_subscriber::fmt this maps to appends. In The Nice Subscriber this maps to overwrites.

Also: do we really need to provide this option? What other format subscribers are there for which we'd want to set this to true?

Having the step logged is super useful. I think this will be set to true for all my tests. I'll make a tracing_subscriber::fmt::Layer which handles this.

/// a `Subscriber` which overwrites on calls to `record()` and want to display the current step
/// count.
pub record_steps_in_span: bool,
}

impl Config {
Expand All @@ -231,6 +242,7 @@ impl Config {
max_steps: MaxSteps::FailAfter(1_000_000),
max_time: None,
silence_warnings: false,
record_steps_in_span: false,
}
}
}
Expand Down
91 changes: 62 additions & 29 deletions src/runtime/execution.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ use std::future::Future;
use std::panic;
use std::rc::Rc;
use std::sync::Arc;
use tracing::trace;
use tracing::{trace, Span};

use super::task::Tag;

Expand Down Expand Up @@ -144,7 +144,42 @@ impl Execution {
// Run a single step of the chosen task.
let ret = match next_step {
NextStep::Task(continuation) => {
panic::catch_unwind(panic::AssertUnwindSafe(|| continuation.borrow_mut().resume()))
// Enter the Task's span
ExecutionState::with(|state| {
tracing::dispatcher::get_default(|subscriber| {
let current_span_id = tracing::Span::current().id();
if let Some(span_id) = current_span_id.as_ref() {
subscriber.exit(span_id);
}

if let Some(span_id) = state.current().span.id().as_ref() {
subscriber.enter(span_id)
}

if state.config.record_steps_in_span {
state.current().step_span.record("i", state.current_schedule.len());
}
});
});

let result = panic::catch_unwind(panic::AssertUnwindSafe(|| continuation.borrow_mut().resume()));

// Leave the Task's span and store which span it exited in order to restore it the next time the Task is run
ExecutionState::with(|state| {
tracing::dispatcher::get_default(|subscriber| {
let current_span = tracing::Span::current();
if let Some(span_id) = current_span.id().as_ref() {
subscriber.exit(span_id);
}
state.current_mut().span = current_span;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me wonder if the field should be renamed from span to last_span ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?
Not super important what it is named, but why?
The span hasn't changed, we have just swapped out of the task. The span is the Span of the task, just as the continuation is its continuation.


if let Some(span_id) = state.span.id().as_ref() {
sarsko marked this conversation as resolved.
Show resolved Hide resolved
subscriber.enter(span_id)
}
});
});

result
}
NextStep::Failure(msg, schedule) => {
// Because we're creating the panic here, we don't need `persist_failure` to print
Expand All @@ -171,6 +206,7 @@ impl Execution {
Ok(panic_msg) => Box::new(format!("{}\noriginal panic: {}", message, panic_msg)),
Err(panic) => panic,
};

panic::resume_unwind(payload);
}
}
Expand Down Expand Up @@ -203,6 +239,9 @@ pub(crate) struct ExecutionState {

#[cfg(debug_assertions)]
has_cleaned_up: bool,

// The `Span` which the `ExecutionState` was created under. Will be the parent of all `Task` `Span`s
pub(crate) span: Span,
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
Expand Down Expand Up @@ -240,6 +279,7 @@ impl ExecutionState {
current_schedule: initial_schedule,
#[cfg(debug_assertions)]
has_cleaned_up: false,
span: tracing::Span::current(),
sarsko marked this conversation as resolved.
Show resolved Hide resolved
}
}

Expand Down Expand Up @@ -287,11 +327,8 @@ impl ExecutionState {
{
Self::with(|state| {
let schedule_len = state.current_schedule.len();
let parent_id = state.span.id();
sarsko marked this conversation as resolved.
Show resolved Hide resolved

let parent_span = state
.try_current()
.map(|t| t.span.clone())
.unwrap_or_else(tracing::Span::current);
let task_id = TaskId(state.tasks.len());
let tag = state.get_tag_or_default_for_current_task();
let clock = state.increment_clock_mut(); // Increment the parent's clock
Expand All @@ -303,9 +340,10 @@ impl ExecutionState {
task_id,
name,
clock.clone(),
parent_span,
parent_id,
schedule_len,
tag,
state.try_current().map(|t| t.id()),
);

state.tasks.push(task);
Expand All @@ -323,6 +361,7 @@ impl ExecutionState {
F: FnOnce() + Send + 'static,
{
Self::with(|state| {
let parent_id = state.span.id();
let task_id = TaskId(state.tasks.len());
let tag = state.get_tag_or_default_for_current_task();
let clock = if let Some(ref mut clock) = initial_clock {
Expand All @@ -336,11 +375,17 @@ impl ExecutionState {

let schedule_len = state.current_schedule.len();

let parent_span = state
.try_current()
.map(|t| t.span.clone())
.unwrap_or_else(tracing::Span::current);
let task = Task::from_closure(f, stack_size, task_id, name, clock, parent_span, schedule_len, tag);
let task = Task::from_closure(
f,
stack_size,
task_id,
name,
clock,
parent_id,
schedule_len,
tag,
state.try_current().map(|t| t.id()),
);
state.tasks.push(task);
task_id
})
Expand Down Expand Up @@ -593,31 +638,19 @@ impl ExecutionState {
}
}

trace!(?runnable, next_task=?self.next_task);
self.span.in_scope(|| trace!(?runnable, next_task=?self.next_task));

Ok(())
}

/// Set the next task as the current task, and update our tracing span
/// Set the next task as the current task
fn advance_to_next_task(&mut self) {
debug_assert_ne!(self.next_task, ScheduledTask::None);
let previous_task = self.current_task;
self.current_task = self.next_task.take();

tracing::dispatcher::get_default(|subscriber| {
if let ScheduledTask::Some(previous) = previous_task {
if let Some(span_id) = self.get(previous).span.id() {
subscriber.exit(&span_id)
}
}

if let ScheduledTask::Some(tid) = self.current_task {
if let Some(span_id) = self.get(tid).span.id() {
subscriber.enter(&span_id);
}
self.current_schedule.push_task(tid);
}
});
if let ScheduledTask::Some(tid) = self.current_task {
self.current_schedule.push_task(tid);
}
}

// Sets the `tag` field of the current task.
Expand Down
32 changes: 32 additions & 0 deletions src/runtime/runner.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,37 @@ use std::thread;
use std::time::Instant;
use tracing::{span, Level};

// A helper struct which on `drop` exits all current spans, then enters the span which was entered when it was constructed.
sarsko marked this conversation as resolved.
Show resolved Hide resolved
// The reason this exist is to solve the "span-stacking" issue which occurs when there is a panic inside `run` which is
sarsko marked this conversation as resolved.
Show resolved Hide resolved
// then caught by `panic::catch_unwind()` (such as when Shuttle is run inside proptest).
// In other words: it enables correct spans when doing proptest minimization.
struct ResetSpanOnDrop {
span_id: Option<tracing::span::Id>,
}

impl ResetSpanOnDrop {
#[must_use]
sarsko marked this conversation as resolved.
Show resolved Hide resolved
fn new() -> Self {
Self {
span_id: tracing::Span::current().id(),
}
}
}

impl Drop for ResetSpanOnDrop {
// Exits all current spans, then enters the span which was entered when `self` was constructed.
fn drop(&mut self) {
tracing::dispatcher::get_default(|subscriber| {
while let Some(span_id) = tracing::Span::current().id().as_ref() {
subscriber.exit(span_id);
}
if let Some(span_id) = self.span_id.as_ref() {
subscriber.enter(span_id);
}
});
}
}

/// A `Runner` is the entry-point for testing concurrent code.
///
/// It takes as input a function to test and a `Scheduler` to run it under. It then executes that
Expand Down Expand Up @@ -43,6 +74,7 @@ impl<S: Scheduler + 'static> Runner<S> {
where
F: Fn() + Send + Sync + 'static,
{
let _span_drop_guard = ResetSpanOnDrop::new();
// Share continuations across executions to avoid reallocating them
// TODO it would be a lot nicer if this were a more generic "context" thing that we passed
// TODO around explicitly rather than being a thread local
Expand Down
42 changes: 31 additions & 11 deletions src/runtime/task/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ use std::future::Future;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Waker};
use tracing::{event, field, info_span, Level, Span};

pub(crate) mod clock;
pub(crate) mod waker;
Expand Down Expand Up @@ -89,7 +90,11 @@ pub(crate) struct Task {
name: Option<String>,

local_storage: StorageMap,
pub(super) span: tracing::Span,

// The `Span` containing the `Task`s `id` and the current step count (if step count recording is enabled)
sarsko marked this conversation as resolved.
Show resolved Hide resolved
pub(super) step_span: Span,
// The current `Span` of the `Task`. We have to have it by ownership in order for the `Span` to not get dropped while the task is switched out.
pub(super) span: Span,

// Arbitrarily settable tag which is inherited from the parent.
tag: Option<Arc<dyn Tag>>,
Expand All @@ -104,9 +109,10 @@ impl Task {
id: TaskId,
name: Option<String>,
clock: VectorClock,
parent_span: tracing::Span,
parent_id: Option<tracing::span::Id>,
schedule_len: usize,
tag: Option<Arc<dyn Tag>>,
current_task: Option<TaskId>,
) -> Self
where
F: FnOnce() + Send + 'static,
Expand All @@ -117,11 +123,8 @@ impl Task {
let waker = make_waker(id);
let continuation = Rc::new(RefCell::new(continuation));

let span = if name == Some("main-thread".to_string()) {
parent_span
} else {
tracing::info_span!(parent: parent_span.id(), "step", i = schedule_len, task = id.0)
};
let step_span = info_span!(parent: parent_id.clone(), "step", task = id.0, i = field::Empty);
let span = step_span.clone();

let mut task = Self {
id,
Expand All @@ -134,6 +137,7 @@ impl Task {
detached: false,
park_state: ParkState::default(),
name,
step_span,
span,
local_storage: StorageMap::new(),
tag: None,
Expand All @@ -143,6 +147,9 @@ impl Task {
task.set_tag(tag);
}

info_span!(parent: parent_id, "new_task", parent = ?current_task, i = schedule_len)
.in_scope(|| event!(Level::INFO, "created task: {:?}", task.id));

task
}

Expand All @@ -153,14 +160,25 @@ impl Task {
id: TaskId,
name: Option<String>,
clock: VectorClock,
parent_span: tracing::Span,
parent_id: Option<tracing::span::Id>,
sarsko marked this conversation as resolved.
Show resolved Hide resolved
schedule_len: usize,
tag: Option<Arc<dyn Tag>>,
current_task: Option<TaskId>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be parent_task_id

) -> Self
where
F: FnOnce() + Send + 'static,
{
Self::new(f, stack_size, id, name, clock, parent_span, schedule_len, tag)
Self::new(
f,
stack_size,
id,
name,
clock,
parent_id,
schedule_len,
tag,
current_task,
)
}

#[allow(clippy::too_many_arguments)]
Expand All @@ -170,9 +188,10 @@ impl Task {
id: TaskId,
name: Option<String>,
clock: VectorClock,
parent_span: tracing::Span,
parent_id: Option<tracing::span::Id>,
sarsko marked this conversation as resolved.
Show resolved Hide resolved
schedule_len: usize,
tag: Option<Arc<dyn Tag>>,
current_task: Option<TaskId>,
) -> Self
where
F: Future<Output = ()> + Send + 'static,
Expand All @@ -192,9 +211,10 @@ impl Task {
id,
name,
clock,
parent_span,
parent_id,
schedule_len,
tag,
current_task,
)
}

Expand Down
1 change: 1 addition & 0 deletions tests/basic/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -19,4 +19,5 @@ mod shrink;
mod tag;
mod thread;
mod timeout;
mod tracing;
mod uncontrolled_nondeterminism;
Loading