From 2fd4c209f1e05da946d51dd1908893663350aac3 Mon Sep 17 00:00:00 2001 From: SergeyMenshykh Date: Mon, 26 Jun 2023 22:38:25 +0100 Subject: [PATCH 1/5] First version of ADR for exception handling in SK. --- docs/decisions/0004-error-hadling.md | 59 ++++++++++++++++++++++++++++ 1 file changed, 59 insertions(+) create mode 100644 docs/decisions/0004-error-hadling.md diff --git a/docs/decisions/0004-error-hadling.md b/docs/decisions/0004-error-hadling.md new file mode 100644 index 000000000000..d62adff95b08 --- /dev/null +++ b/docs/decisions/0004-error-hadling.md @@ -0,0 +1,59 @@ +--- +# These are optional elements. Feel free to remove any of them. +status: proposed +date: 2023-06-23 +deciders: shawncal +consulted: stephentoub +informed: +--- +# Error handling improvements + +## Disclaimer +This ADR describes problems and their solutions for improving the error handling aspect of SK. It does not address logging, resiliency, or observability aspects. + +## Context and Problem Statement + +Currently, there are several aspects of error handling in SK that can be enhanced to simplify SK code and SK client code, while also ensuring consistency and maintainability: + +- **Exception propagation**. SK has a few public methods, like Kernel.RunAsync and SKFunction.InvokeAsync, that handle exceptions in a non-standard way. Instead of throwing exceptions, they catch and store them within the SKContext. This deviates from the standard error handling approach in .NET, which expects a method to either execute successfully if its contract is fulfilled or throw an exception if the contract is violated. Consequently, when working with the .NET version of the SK SDK, it becomes challenging to determine whether a method executed successfully or failed without analyzing specific properties of the SKContext instance. This can lead to a frustrating experience for developers using the .NET SK SDK. + +- **Improper exception usage**. Some SK components use custom SK exceptions instead of standard .NET exceptions to indicate invalid arguments, configuration issues, and so on. This deviates from the standard approach for error handling in .NET and may frustrate SK client code developers. + +- **Exception hierarchy inconsistency**. Half of the custom SK exceptions are derived from SKException, while the other half are directly derived from Exception. This inconsistency in the exception hierarchy does not contribute to a cohesive exception model. + +- **Missing original exception details**. Certain SK exceptions do not preserve the original failure or exception details and do not expose them through their properties. This omission prevents SK client code from understanding the problem and handling it properly. + +- **No consolidated way for exception handling**. In some cases, SK has an exception type per component implementation, making it impossible for SK client code to handle them in a consolidated manner. Instead of having a single catch block, SK client code needs to include a catch block for each component implementation. Moreover, SK client code needs to be updated every time a new component implementation is added or removed. + +## Solution + +To address the identified issues, the following steps can be taken: + +- **Exception propagation**: To enable SK exceptions to propagate to SK client code, all SK components that currently do not do so should be modified to rethrow the exception instead of suppressing it. This allows SK client code to use try/catch blocks for handling and catching exceptions. + +- **Improper exception usage**: Modify SK code to throw .NET standard exceptions, such as ArgumentOutOfRangeException or ArgumentNullException, when class argument values are not provided or are invalid, instead of throwing custom SK exceptions. Analyze SK exception usage to identify other potential areas where standard .NET exceptions can be used instead. + +- **Exception hierarchy inconsistency**: Refactor SK custom exceptions that are not derived from SKException to derive from SKException. This ensures consistency in the SK exception hierarchy. + +- **Missing original exception details**: Identify all cases where the original exception or failure is not preserved either as an inner exception or as a property/message of the rethrown exception, and address them. For example, AI service connectors that use HttClient should preserve the original status code by either setting it as an inner exception or including it as part of the rethrown exception's property. + +- **No consolidated way for exception handling**: + - [Memory connectors]: Introduce a new exception called VectorDbStorageException, along with derived exceptions like UpsertRecordException and RemoveRecordException, if additional details need to be captured and propagated to SK client code. Refactor all SK memory component implementations to use these exceptions instead of implementation-specific memory storage exceptions. + - [AI service connectors]: Restrict the usage of AIException to AI service connectors only and replace its usage in other SK components with more relevant exceptions. For cases where AIException is thrown, use HttpOperationException describved below as an inner exception to allow SK client code to access the original status code and exception/error details, if necessary. + - [HttpOperationException]: Create a new exception called HttpOperationException, which includes a StatusCode property, and implement the necessary logic to map the exception from HttpStatusCode, HttpRequestException, or Azure.RequestFailedException. Update existing SK code that interacts with the HTTP stack to throw HttpOperationException in case of a failed HTTP request and assign the original exception as its inner exception. + +- **Other**: + - Remove any code that wraps unhandled exceptions into AIException or any other SK exception solely for the purpose of wrapping. In most cases, this code does not provide useful information to help SK client code handle the exception, apart from a generic and uninformative "Something went wrong" message. Regardless of whether the unhandled exception is wrapped or not, SK client code cannot handle or recover from it without prior knowledge of the exception and proper exception handling logic. + +## Questions and consideration + +- Should the exception hierarchy follow the SK components hierarchy? For example, should there be a PlanningException for planners and a MemoryStorageException for memory storage connectors? What are the benefits for SK client code? Can the SK client code operate on the exception? + In order to handle an exception successfully, the client code needs to know as many details as possible about the failed operation, such as exception type, status code, and some other operation related properties. This suggests that specific exception types with strongly typed properties should be preferred over generic/common exception types. If that's the case, what is the purpose of having base generic exceptions? + On the other hand, having base exception types per component or set of related operations can be convenient from a code reusability perspective because derived types can inherit common component exception properties relevant to that component. +- Should all custom SK exceptions be derived from the SKException type? Currently, SKException derives from the Exception type and does not add any additional information. What value can SK client code get from catching these exceptions compared to catching them as Exception types? Should SKException be used in all cases when SK needs to indicate that something went wrong without providing details through strongly typed properties, and neither custom nor standard exception types exist that could be used for that purpose? +- Should we keep the IsCriticalException functionality that SK currently has to avoid handling several system/critical exceptions such as OutOfMemoryException and StackOverflowException? Some of them cannot be intercepted. +- An idea has been proposed to add SK wide error codes to exceptions in order to identify and locate the source of the exception. The question is, is it necessary? Aren't the exception type, message, values of its properties, and stack trace enough to precisely identify the exception and locate its source? + +## Decision Outcome + +TBD From 63682ad0eea5c8155ac4a650b1a0b57aaa50f556 Mon Sep 17 00:00:00 2001 From: SergeyMenshykh Date: Mon, 26 Jun 2023 22:53:07 +0100 Subject: [PATCH 2/5] Fix for typo in the file name --- docs/decisions/{0004-error-hadling.md => 0004-error-handling.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/decisions/{0004-error-hadling.md => 0004-error-handling.md} (100%) diff --git a/docs/decisions/0004-error-hadling.md b/docs/decisions/0004-error-handling.md similarity index 100% rename from docs/decisions/0004-error-hadling.md rename to docs/decisions/0004-error-handling.md From c77644f40f07f573baa41ebdfa0812793c5dc651 Mon Sep 17 00:00:00 2001 From: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com> Date: Tue, 27 Jun 2023 12:58:51 +0100 Subject: [PATCH 3/5] Update docs/decisions/0004-error-handling.md Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com> --- docs/decisions/0004-error-handling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/decisions/0004-error-handling.md b/docs/decisions/0004-error-handling.md index d62adff95b08..24eaf3b1f029 100644 --- a/docs/decisions/0004-error-handling.md +++ b/docs/decisions/0004-error-handling.md @@ -35,7 +35,7 @@ To address the identified issues, the following steps can be taken: - **Exception hierarchy inconsistency**: Refactor SK custom exceptions that are not derived from SKException to derive from SKException. This ensures consistency in the SK exception hierarchy. -- **Missing original exception details**: Identify all cases where the original exception or failure is not preserved either as an inner exception or as a property/message of the rethrown exception, and address them. For example, AI service connectors that use HttClient should preserve the original status code by either setting it as an inner exception or including it as part of the rethrown exception's property. +- **Missing original exception details**: Identify all cases where the original exception or failure is not preserved either as an inner exception or as a property/message of the rethrown exception, and address them. For example, AI service connectors that use HttpClient should preserve the original status code by either setting it as an inner exception or including it as part of the rethrown exception's property. - **No consolidated way for exception handling**: - [Memory connectors]: Introduce a new exception called VectorDbStorageException, along with derived exceptions like UpsertRecordException and RemoveRecordException, if additional details need to be captured and propagated to SK client code. Refactor all SK memory component implementations to use these exceptions instead of implementation-specific memory storage exceptions. From e66f46fe5fd73fcb9011b891d2dfce39f19dfbd3 Mon Sep 17 00:00:00 2001 From: SergeyMenshykh Date: Tue, 27 Jun 2023 13:05:24 +0100 Subject: [PATCH 4/5] addressing comments --- docs/decisions/0004-error-handling.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/docs/decisions/0004-error-handling.md b/docs/decisions/0004-error-handling.md index d62adff95b08..91e2e4afd848 100644 --- a/docs/decisions/0004-error-handling.md +++ b/docs/decisions/0004-error-handling.md @@ -35,25 +35,23 @@ To address the identified issues, the following steps can be taken: - **Exception hierarchy inconsistency**: Refactor SK custom exceptions that are not derived from SKException to derive from SKException. This ensures consistency in the SK exception hierarchy. -- **Missing original exception details**: Identify all cases where the original exception or failure is not preserved either as an inner exception or as a property/message of the rethrown exception, and address them. For example, AI service connectors that use HttClient should preserve the original status code by either setting it as an inner exception or including it as part of the rethrown exception's property. +- **Missing original exception details**: Identify all cases where the original exception is not preserved as an inner exception of the rethrown exception, and address them. +For example, AI service connectors that use HttClient must preserve the original exception by assigning it as an inner exception of the rethrown exception, and retain the original status code by setting it as the StatusCode property of the rethrown exception. By having the original exception as the inner exception of the rethrown one, SK client code can access any new information or details added to the original exception, should it undergo changes (such as the addition of new properties), before SK decides to incorporate this new information into the rethrown exception. - **No consolidated way for exception handling**: - - [Memory connectors]: Introduce a new exception called VectorDbStorageException, along with derived exceptions like UpsertRecordException and RemoveRecordException, if additional details need to be captured and propagated to SK client code. Refactor all SK memory component implementations to use these exceptions instead of implementation-specific memory storage exceptions. - - [AI service connectors]: Restrict the usage of AIException to AI service connectors only and replace its usage in other SK components with more relevant exceptions. For cases where AIException is thrown, use HttpOperationException describved below as an inner exception to allow SK client code to access the original status code and exception/error details, if necessary. + - [Memory connectors]: Introduce a new exception called MemoryStoreException, along with derived exceptions like UpsertRecordException and RemoveRecordException, if additional details need to be captured and propagated to SK client code. Refactor all SK memory component implementations to use these exceptions instead of implementation-specific memory storage exceptions. + - [AI service connectors]: Restrict the usage of AIException to AI service connectors only and replace its usage in other SK components with more relevant exceptions. For cases where AIException is thrown, use HttpOperationException described below as an inner exception to allow SK client code to access the original status code and exception/error details, if necessary. - [HttpOperationException]: Create a new exception called HttpOperationException, which includes a StatusCode property, and implement the necessary logic to map the exception from HttpStatusCode, HttpRequestException, or Azure.RequestFailedException. Update existing SK code that interacts with the HTTP stack to throw HttpOperationException in case of a failed HTTP request and assign the original exception as its inner exception. - **Other**: - Remove any code that wraps unhandled exceptions into AIException or any other SK exception solely for the purpose of wrapping. In most cases, this code does not provide useful information to help SK client code handle the exception, apart from a generic and uninformative "Something went wrong" message. Regardless of whether the unhandled exception is wrapped or not, SK client code cannot handle or recover from it without prior knowledge of the exception and proper exception handling logic. + - Simplify the SK critical exception handling functionality by modifying the IsCriticalException extension method to exclude handling of StackOverflowException and OutOfMemoryException exceptions. This is because the former exception is not thrown, so the calling code won't be executed, while the latter exception doesn't necessarily prevent the execution of recovery code. -## Questions and consideration +## Questions and points to consider - Should the exception hierarchy follow the SK components hierarchy? For example, should there be a PlanningException for planners and a MemoryStorageException for memory storage connectors? What are the benefits for SK client code? Can the SK client code operate on the exception? In order to handle an exception successfully, the client code needs to know as many details as possible about the failed operation, such as exception type, status code, and some other operation related properties. This suggests that specific exception types with strongly typed properties should be preferred over generic/common exception types. If that's the case, what is the purpose of having base generic exceptions? On the other hand, having base exception types per component or set of related operations can be convenient from a code reusability perspective because derived types can inherit common component exception properties relevant to that component. - Should all custom SK exceptions be derived from the SKException type? Currently, SKException derives from the Exception type and does not add any additional information. What value can SK client code get from catching these exceptions compared to catching them as Exception types? Should SKException be used in all cases when SK needs to indicate that something went wrong without providing details through strongly typed properties, and neither custom nor standard exception types exist that could be used for that purpose? -- Should we keep the IsCriticalException functionality that SK currently has to avoid handling several system/critical exceptions such as OutOfMemoryException and StackOverflowException? Some of them cannot be intercepted. +- ~~Should we keep the IsCriticalException functionality that SK currently has to avoid handling several system/critical exceptions such as OutOfMemoryException and StackOverflowException? Some of them cannot be intercepted.~~ - An idea has been proposed to add SK wide error codes to exceptions in order to identify and locate the source of the exception. The question is, is it necessary? Aren't the exception type, message, values of its properties, and stack trace enough to precisely identify the exception and locate its source? - -## Decision Outcome - -TBD From 96d4b2c44360ba8deb36511b3750c7fe1b53c4f6 Mon Sep 17 00:00:00 2001 From: SergeyMenshykh Date: Thu, 29 Jun 2023 21:29:26 +0100 Subject: [PATCH 5/5] addressing feedback from the latest "SK error handling" meeting. --- docs/decisions/0004-error-handling.md | 46 ++++++++++----------------- 1 file changed, 16 insertions(+), 30 deletions(-) diff --git a/docs/decisions/0004-error-handling.md b/docs/decisions/0004-error-handling.md index c173b6f309b2..4f98f03f7003 100644 --- a/docs/decisions/0004-error-handling.md +++ b/docs/decisions/0004-error-handling.md @@ -19,39 +19,25 @@ Currently, there are several aspects of error handling in SK that can be enhance - **Improper exception usage**. Some SK components use custom SK exceptions instead of standard .NET exceptions to indicate invalid arguments, configuration issues, and so on. This deviates from the standard approach for error handling in .NET and may frustrate SK client code developers. -- **Exception hierarchy inconsistency**. Half of the custom SK exceptions are derived from SKException, while the other half are directly derived from Exception. This inconsistency in the exception hierarchy does not contribute to a cohesive exception model. +- **Exception hierarchy**. Half of the custom SK exceptions are derived from SKException, while the other half are directly derived from Exception. This inconsistency in the exception hierarchy does not contribute to a cohesive exception model. -- **Missing original exception details**. Certain SK exceptions do not preserve the original failure or exception details and do not expose them through their properties. This omission prevents SK client code from understanding the problem and handling it properly. - -- **No consolidated way for exception handling**. In some cases, SK has an exception type per component implementation, making it impossible for SK client code to handle them in a consolidated manner. Instead of having a single catch block, SK client code needs to include a catch block for each component implementation. Moreover, SK client code needs to be updated every time a new component implementation is added or removed. - -## Solution - -To address the identified issues, the following steps can be taken: +- **Unnecessary and verbose exceptions** A few SK components, such as the Kernel or Planner, have exceptions at their level, namely PlanningException or KernelException, that are not truly necessary and can be easily replaced by SKException and a few of its derivatives. SK clients might become dependent on them, making it challenging to remove them later if SK needs to discontinue their usage. Additionally, SK has an exception type for each SK memory connector - PineconeMemoryException, QdrantMemoryException that does not add any additional information and only differs by name while having the same member signatures. This makes it impossible for SK client code to handle them in a consolidated manner. Instead of having a single catch block, SK client code needs to include a catch block for each component implementation. Moreover, SK client code needs to be updated every time a new component implementation is added or removed. -- **Exception propagation**: To enable SK exceptions to propagate to SK client code, all SK components that currently do not do so should be modified to rethrow the exception instead of suppressing it. This allows SK client code to use try/catch blocks for handling and catching exceptions. - -- **Improper exception usage**: Modify SK code to throw .NET standard exceptions, such as ArgumentOutOfRangeException or ArgumentNullException, when class argument values are not provided or are invalid, instead of throwing custom SK exceptions. Analyze SK exception usage to identify other potential areas where standard .NET exceptions can be used instead. - -- **Exception hierarchy inconsistency**: Refactor SK custom exceptions that are not derived from SKException to derive from SKException. This ensures consistency in the SK exception hierarchy. - -- **Missing original exception details**: Identify all cases where the original exception is not preserved as an inner exception of the rethrown exception, and address them. -For example, AI service connectors that use HttpClient must preserve the original exception by assigning it as an inner exception of the rethrown exception, and retain the original status code by setting it as the StatusCode property of the rethrown exception. By having the original exception as the inner exception of the rethrown one, SK client code can access any new information or details added to the original exception, should it undergo changes (such as the addition of new properties), before SK decides to incorporate this new information into the rethrown exception. +- **Missing original exception details**. Certain SK exceptions do not preserve the original failure or exception details and do not expose them through their properties. This omission prevents SK client code from understanding the problem and handling it properly. -- **No consolidated way for exception handling**: - - [Memory connectors]: Introduce a new exception called MemoryStoreException, along with derived exceptions like UpsertRecordException and RemoveRecordException, if additional details need to be captured and propagated to SK client code. Refactor all SK memory component implementations to use these exceptions instead of implementation-specific memory storage exceptions. - - [AI service connectors]: Restrict the usage of AIException to AI service connectors only and replace its usage in other SK components with more relevant exceptions. For cases where AIException is thrown, use HttpOperationException described below as an inner exception to allow SK client code to access the original status code and exception/error details, if necessary. - - [HttpOperationException]: Create a new exception called HttpOperationException, which includes a StatusCode property, and implement the necessary logic to map the exception from HttpStatusCode, HttpRequestException, or Azure.RequestFailedException. Update existing SK code that interacts with the HTTP stack to throw HttpOperationException in case of a failed HTTP request and assign the original exception as its inner exception. +## Decision Drivers -- **Other**: - - Remove any code that wraps unhandled exceptions into AIException or any other SK exception solely for the purpose of wrapping. In most cases, this code does not provide useful information to help SK client code handle the exception, apart from a generic and uninformative "Something went wrong" message. Regardless of whether the unhandled exception is wrapped or not, SK client code cannot handle or recover from it without prior knowledge of the exception and proper exception handling logic. - - Simplify the SK critical exception handling functionality by modifying the IsCriticalException extension method to exclude handling of StackOverflowException and OutOfMemoryException exceptions. This is because the former exception is not thrown, so the calling code won't be executed, while the latter exception doesn't necessarily prevent the execution of recovery code. +- Exceptions should be propagated to the SK client code instead of being stored in the SKContext. This adjustment will bring SK error handling in line with the .NET approach. +- The SK exception hierarchy should be designed following the principle of "less is more." It is easier to add new exceptions later, but removing them can be challenging. +- .NET standard exception types should be preferred over SK custom ones because they are easily recognizable, do not require any maintenance, can cover common error scenarios, and provide meaningful and standardized error messages. +- Exceptions should not be wrapped in SK exceptions when passing them up to a caller, unless it helps in constructing actionable logic for either SK or SK client code. -## Questions and points to consider +## Considered Options -- Should the exception hierarchy follow the SK components hierarchy? For example, should there be a PlanningException for planners and a MemoryStorageException for memory storage connectors? What are the benefits for SK client code? Can the SK client code operate on the exception? - In order to handle an exception successfully, the client code needs to know as many details as possible about the failed operation, such as exception type, status code, and some other operation related properties. This suggests that specific exception types with strongly typed properties should be preferred over generic/common exception types. If that's the case, what is the purpose of having base generic exceptions? - On the other hand, having base exception types per component or set of related operations can be convenient from a code reusability perspective because derived types can inherit common component exception properties relevant to that component. -- Should all custom SK exceptions be derived from the SKException type? Currently, SKException derives from the Exception type and does not add any additional information. What value can SK client code get from catching these exceptions compared to catching them as Exception types? Should SKException be used in all cases when SK needs to indicate that something went wrong without providing details through strongly typed properties, and neither custom nor standard exception types exist that could be used for that purpose? -- ~~Should we keep the IsCriticalException functionality that SK currently has to avoid handling several system/critical exceptions such as OutOfMemoryException and StackOverflowException? Some of them cannot be intercepted.~~ -- An idea has been proposed to add SK wide error codes to exceptions in order to identify and locate the source of the exception. The question is, is it necessary? Aren't the exception type, message, values of its properties, and stack trace enough to precisely identify the exception and locate its source? +- Simplify existing SK exception hierarchy by removing all custom exceptions types except the SKException one and any other type that is actionable. Use SKException type instead of the removed ones unless more details need to be conveyed in which case create a derived specific exception. +- Modify SK code to throw .NET standard exceptions, such as ArgumentOutOfRangeException or ArgumentNullException, when class argument values are not provided or are invalid, instead of throwing custom SK exceptions. Analyze SK exception usage to identify and fix other potential areas where standard .NET exceptions can be used instead. +- Remove any code that wraps unhandled exceptions into AIException or any other SK exception solely for the purpose of wrapping. In most cases, this code does not provide useful information to action on it, apart from a generic and uninformative "Something went wrong" message. +- Identify all cases where the original exception is not preserved as an inner exception of the rethrown SK exception, and address them. +- Create a new exception HttpOperationException, which includes a StatusCode property, and implement the necessary logic to map the exception from HttpStatusCode, HttpRequestException, or Azure.RequestFailedException. Update existing SK code that interacts with the HTTP stack to throw HttpOperationException in case of a failed HTTP request and assign the original exception as its inner exception. +- Modify all SK components that currently store exceptions to SK context to rethrow them instead. +- Simplify the SK critical exception handling functionality by modifying the IsCriticalException extension method to exclude handling of StackOverflowException and OutOfMemoryException exceptions. This is because the former exception is not thrown, so the calling code won't be executed, while the latter exception doesn't necessarily prevent the execution of recovery code. \ No newline at end of file