Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

undo action error type could be more explicit [incomplete] #283

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions CHANGELOG.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,16 @@

https://github.com/oxidecomputer/steno/compare/v0.4.0\...HEAD[Full list of commits]

=== Breaking changes

* The error type for undo actions is now called `UndoActionPermanentError`. It used to be simply `anyhow::Error`. Steno's behavior when undo actions return an error has _not_ changed. This explicit type was added to better communicate that if you return an error from an undo action, the saga will become stuck. For convenience, there are `From` impls to produce an `UndoActionPermanentError` from an `anyhow::Error` or an `ActionError`.
+
**What you need to do:** For any function implementing an undo action:
+
** If the return type was `Result<(), anyhow::Error>`, change it to `Result<(), UndoActionError>`. If the return type was `UndoResult`, then you don't need to change the return type.
*** If you were directly returning `Err(some_anyhow_error)`, you can instead return `Err(UndoActionPermanentError::from(some_anyhow_error))`.
*** If you were returning an error using `foo()?` that was _not_ an `anyhow::Error` or an `ActionError`, you can generally use `foo().map_err(anyhow::Error::from).map_err(UndoActionPermanentError::from)`. (Your error type must have been convertible to `anyhow::Error` for it to compile before. Now you're explicitly converting it to that, and then to `UndoActionPermanentError`.)

== 0.4.0 (released 2023-05-25)

https://github.com/oxidecomputer/steno/compare/v0.3.1\...v0.4.0[Full list of commits]
Expand Down
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[package]
name = "steno"
version = "0.4.1-dev"
version = "0.5.0-dev"
edition = "2021"
license = "Apache-2.0"
repository = "https://github.com/oxidecomputer/steno"
Expand Down
9 changes: 5 additions & 4 deletions examples/trip.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ use steno::SagaName;
use steno::SagaResultErr;
use steno::SagaType;
use steno::SecClient;
use steno::UndoActionPermanentError;
use uuid::Uuid;

// This is where we're going: this program will collect payment and book a whole
Expand Down Expand Up @@ -283,7 +284,7 @@ async fn saga_charge_card(

async fn saga_refund_card(
action_context: ActionContext<TripSaga>,
) -> Result<(), anyhow::Error> {
) -> Result<(), UndoActionPermanentError> {
// Fetch the payment confirmation. The undo function is only ever invoked
// after the action function has succeeded. This node is called "payment",
// so we fetch our own action's output by looking up the data for "payment".
Expand All @@ -306,7 +307,7 @@ async fn saga_book_hotel(

async fn saga_cancel_hotel(
action_context: ActionContext<TripSaga>,
) -> Result<(), anyhow::Error> {
) -> Result<(), UndoActionPermanentError> {
// ...
let trip_context = action_context.user_data();
let confirmation: HotelReservation = action_context.lookup("hotel")?;
Expand All @@ -327,7 +328,7 @@ async fn saga_book_flight(

async fn saga_cancel_flight(
action_context: ActionContext<TripSaga>,
) -> Result<(), anyhow::Error> {
) -> Result<(), UndoActionPermanentError> {
// ...
let trip_context = action_context.user_data();
let confirmation: FlightReservation = action_context.lookup("flight")?;
Expand All @@ -348,7 +349,7 @@ async fn saga_book_car(

async fn saga_cancel_car(
action_context: ActionContext<TripSaga>,
) -> Result<(), anyhow::Error> {
) -> Result<(), UndoActionPermanentError> {
// ...
let trip_context = action_context.user_data();
let confirmation: CarReservation = action_context.lookup("car")?;
Expand Down
2 changes: 1 addition & 1 deletion src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ pub use dag::SagaDag;
pub use dag::SagaId;
pub use dag::SagaName;
pub use saga_action_error::ActionError;
pub use saga_action_error::UndoActionError;
pub use saga_action_error::UndoActionPermanentError;
pub use saga_action_func::new_action_noop_undo;
pub use saga_action_func::ActionFunc;
pub use saga_action_func::ActionFuncResult;
Expand Down
19 changes: 15 additions & 4 deletions src/saga_action_error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -155,8 +155,19 @@ impl ActionError {
/// It should be expected that human intervention will be required to repair the
/// result of an undo action that has failed.
#[derive(Clone, Debug, Deserialize, Error, JsonSchema, Serialize)]
pub enum UndoActionError {
/// Undo action failed due to a consumer-specific error
#[error("undo action failed permanently: {source_error:#}")]
PermanentFailure { source_error: serde_json::Value },
#[error("undo action failed permanently: {message}")]
pub struct UndoActionPermanentError {
message: String,
}

impl From<anyhow::Error> for UndoActionPermanentError {
fn from(value: anyhow::Error) -> Self {
UndoActionPermanentError { message: format!("{:#}", value) }
}
}

impl From<ActionError> for UndoActionPermanentError {
fn from(value: ActionError) -> Self {
UndoActionPermanentError::from(anyhow::Error::from(value))
}
}
10 changes: 5 additions & 5 deletions src/saga_action_generic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
use crate::saga_action_error::ActionError;
use crate::saga_exec::ActionContext;
use crate::ActionName;
use crate::UndoActionPermanentError;
use futures::future::BoxFuture;
use serde::de::DeserializeOwned;
use serde::Serialize;
Expand Down Expand Up @@ -62,10 +63,7 @@ impl<T: Debug + DeserializeOwned + Serialize + Send + Sync + 'static> ActionData
pub type ActionResult = Result<Arc<serde_json::Value>, ActionError>;

/// Result of a saga undo action
// TODO-design what should the error type here be? Maybe something that can
// encompass "general framework error"? This might put the saga into a "needs
// attention" state?
pub type UndoResult = Result<(), anyhow::Error>;
pub type UndoResult = Result<(), UndoActionPermanentError>;

/// Building blocks of sagas
///
Expand Down Expand Up @@ -167,7 +165,9 @@ impl<UserType: SagaType> Action<UserType> for ActionInjectError {
fn undo_it(&self, _: ActionContext<UserType>) -> BoxFuture<'_, UndoResult> {
// We should never undo an action that failed. But this same impl is
// plugged into a saga when an "undo action" error is injected.
Box::pin(futures::future::err(anyhow::anyhow!("error injected")))
Box::pin(futures::future::err(UndoActionPermanentError::from(
anyhow::anyhow!("error injected"),
)))
}

fn name(&self) -> ActionName {
Expand Down
24 changes: 8 additions & 16 deletions src/saga_exec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,6 @@ use crate::dag::InternalNode;
use crate::dag::NodeName;
use crate::rust_features::ExpectNone;
use crate::saga_action_error::ActionError;
use crate::saga_action_error::UndoActionError;
use crate::saga_action_generic::Action;
use crate::saga_action_generic::ActionConstant;
use crate::saga_action_generic::ActionData;
Expand All @@ -24,6 +23,7 @@ use crate::SagaLog;
use crate::SagaNodeEvent;
use crate::SagaNodeId;
use crate::SagaType;
use crate::UndoActionPermanentError;
use anyhow::anyhow;
use anyhow::ensure;
use anyhow::Context;
Expand All @@ -41,7 +41,6 @@ use petgraph::Direction;
use petgraph::Graph;
use petgraph::Incoming;
use petgraph::Outgoing;
use serde_json::json;
use std::collections::BTreeMap;
use std::collections::BTreeSet;
use std::convert::TryFrom;
Expand All @@ -58,7 +57,7 @@ use tokio::task::JoinHandle;
struct SgnsDone(Arc<serde_json::Value>);
struct SgnsFailed(ActionError);
struct SgnsUndone(UndoMode);
struct SgnsUndoFailed(UndoActionError);
struct SgnsUndoFailed(UndoActionPermanentError);

struct SagaNode<S: SagaNodeStateType> {
node_id: NodeIndex,
Expand Down Expand Up @@ -1272,21 +1271,14 @@ impl<UserType: SagaType> SagaExecutor<UserType> {
let action = &task_params.action;
let undo_error = futures::stream::iter(0..count)
.map(Ok::<u32, _>)
.try_for_each(|i| async move {
action
.undo_it(make_action_context())
.await
.with_context(|| format!("undo action attempt {}", i + 1))
.try_for_each(|_| async move {
action.undo_it(make_action_context()).await
})
.await;

if let Err(error) = undo_error {
let node = Box::new(SagaNode {
node_id,
state: SgnsUndoFailed(UndoActionError::PermanentFailure {
source_error: json!({ "message": format!("{:#}", error) }),
}),
});
let node =
Box::new(SagaNode { node_id, state: SgnsUndoFailed(error) });
SagaExecutor::finish_task(task_params, node).await;
} else {
let node = Box::new(SagaNode {
Expand Down Expand Up @@ -1513,7 +1505,7 @@ struct SagaExecLiveState {
/// Errors produced by failed actions.
node_errors: BTreeMap<NodeIndex, ActionError>,
/// Errors produced by failed undo actions.
undo_errors: BTreeMap<NodeIndex, UndoActionError>,
undo_errors: BTreeMap<NodeIndex, UndoActionPermanentError>,

/// Persistent state
sglog: SagaLog,
Expand Down Expand Up @@ -1741,7 +1733,7 @@ pub struct SagaResultErr {
/// details about the action failure
pub error_source: ActionError,
/// if an undo action also failed, details about that failure
pub undo_failure: Option<(NodeName, UndoActionError)>,
pub undo_failure: Option<(NodeName, UndoActionPermanentError)>,
}

/// Summarizes in-progress execution state of a saga
Expand Down
6 changes: 3 additions & 3 deletions src/saga_log.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
//! Persistent state for sagas

use crate::saga_action_error::ActionError;
use crate::saga_action_error::UndoActionError;
use crate::saga_action_error::UndoActionPermanentError;
use crate::SagaId;
use anyhow::anyhow;
use anyhow::Context;
Expand Down Expand Up @@ -86,7 +86,7 @@ pub enum SagaNodeEventType {
/// The undo action has finished
UndoFinished,
/// The undo action has failed
UndoFailed(UndoActionError),
UndoFailed(UndoActionPermanentError),
}

impl fmt::Display for SagaNodeEventType {
Expand Down Expand Up @@ -133,7 +133,7 @@ pub enum SagaNodeLoadStatus {
/// The undo action has finished successfully
UndoFinished,
/// The undo action has failed
UndoFailed(UndoActionError),
UndoFailed(UndoActionPermanentError),
}

impl SagaNodeLoadStatus {
Expand Down
6 changes: 3 additions & 3 deletions src/sec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1450,7 +1450,7 @@ mod test {
use super::*;
use crate::{
ActionContext, ActionError, ActionFunc, DagBuilder, Node, SagaId,
SagaName,
SagaName, UndoActionPermanentError,
};
use serde::{Deserialize, Serialize};
use slog::Drain;
Expand Down Expand Up @@ -1526,7 +1526,7 @@ mod test {
}
async fn undo_n1(
ctx: ActionContext<TestSaga>,
) -> Result<(), anyhow::Error> {
) -> Result<(), UndoActionPermanentError> {
ctx.user_data().call("undo_n1");
Ok(())
}
Expand All @@ -1539,7 +1539,7 @@ mod test {
}
async fn undo_n2(
ctx: ActionContext<TestSaga>,
) -> Result<(), anyhow::Error> {
) -> Result<(), UndoActionPermanentError> {
ctx.user_data().call("undo_n2");
Ok(())
}
Expand Down
4 changes: 1 addition & 3 deletions tests/test_smoke_run_recover_stuck_done.out
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,4 @@ failed at node: "instance_boot"
failed with error: error injected
FOLLOWED BY UNDO ACTION FAILURE
failed at node: "instance_id"
failed with error: undo action failed permanently: {
"message": "undo action attempt 1: error injected"
}
failed with error: undo action failed permanently: error injected
4 changes: 1 addition & 3 deletions tests/test_smoke_run_stuck.out
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,4 @@ failed at node: "instance_boot"
failed with error: error injected
FOLLOWED BY UNDO ACTION FAILURE
failed at node: "instance_id"
failed with error: undo action failed permanently: {
"message": "undo action attempt 1: error injected"
}
failed with error: undo action failed permanently: error injected
Loading