-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recreate or wrap exceptions thrown by async transport implementations to preserve caller stack traces #656
Recreate or wrap exceptions thrown by async transport implementations to preserve caller stack traces #656
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cursory review for now ...
- What is the the user experience change with this?
- Anything breaking in a semver sense?
- Let's also update some user guides on exceptions/error handling as part of this PR?
assertTrue(e.getMessage().contains("transient setting [no_idea_what_you_are_talking_about], not recognized")); | ||
} catch (Exception e) { | ||
OpenSearchException openSearchException; | ||
if (e instanceof OpenSearchException) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't expect conditionals in tests, is this not predictable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's predictable but now varies per implementation.
This test is run against the RestClient transport and the Apache HTTP 5 transports. The RestClient transport throws OpenSearchException directly and since its transport is synchronous, this exception's stack trace contains a reference to the caller. The HC5 transport is always asynchronous and while the underlying logic will throw an OpenSearchException, that logic is executing on a pooled thread. That OpenSearchException then has to be wrapped in order to preserve a reference to the caller in the thrown exception's stack trace. I cannot construct an OpenSearchException wrapping an OpenSearchException (which would be suspicious anyways), so this now has to catch any OpenSearchException and IOException. I can modify the catch blocks to be multicatches.
The correct way to model this is to parameterize each implementation of this abstract class by the type of exception it should be expected to throw and use the assertThrows
pattern here instead of try-catch
.
This modeling issue combined with #656 (comment) are two overarching pieces of test tech debt I think should be handled separately from this PR, but would enable more comprehensive transport-agnostic integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation. Open GitHub issues for these leftovers so we don't forget / someone else can help.
import org.apache.logging.log4j.core.util.Throwables; | ||
import org.junit.Test; | ||
|
||
public abstract class AbstractAsyncStracktraceIT extends OpenSearchJavaClientTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be able to mimic the existing pattern for RestClient and Apache HTTP5 integration tests and write one abstract test class that both Apache HTTP5 and AWS SDK V2 clients can use. I don't have it configured correctly here. When the CI suite runs these tests the AWS test is executing when it shouldn't be.
Altogether it looks like there's no great universal parent class that achieves exactly what I'm looking for but OpenSearchJavaClientTestCase
seems to be the closest thing to it.
assertTrue(e.getMessage().contains("transient setting [no_idea_what_you_are_talking_about], not recognized")); | ||
} catch (Exception e) { | ||
OpenSearchException openSearchException; | ||
if (e instanceof OpenSearchException) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation. Open GitHub issues for these leftovers so we don't forget / someone else can help.
@dblock I don't think we should be doing this change for 2 reasons:
|
While it's generally acceptable to let RuntimeExceptions propagate, it's expected that the stacktrace will include your caller. Would it be acceptable to re-wrap with a different runtime exception here? Perhaps OpenSearchRuntimeException?
Is this not permitted in 3.0.0? Since it's unreleased, such a breaking change would be permitted by semver. |
Let's take an example of the
Sure, for 3.x we are fine, we won't be backporting it which could be a maintenance issue (we will have to support 2 different behaviours across 2.x and 3.x branches). |
That one single place is on the code path of every call OpenSearchClient makes when using async transport. A In some cases the OpenSearchClient uses async transport for synchronous calls: for example, when using the Apache HC5 transport. This also happens when making synchronous calls using the AWS async client which we reuse across synchronous and asynchronous calls to enable connection reuse. Notably we have no synchronous transport we can use in tests (see #631) and this makes it much harder to debug new code. I think the root of the problem is really that OpenSearchClient can be built with async transport when it only offers synchronous APIs. In contrast AWS clients offered by the v2 Java SDK are completely split between sync and async variants and you cannot construct one with a transport of the other kind. Clients built with the other kind of transport compromise their behavior: an async client built with sync transport must wrap synchronous calls with Futures or in Executors (RestClientTransport does the first). A sync client built with async transport cannot offer the same API for thrown exceptions since it must unwrap exceptions which will have obfuscated stack traces or throw their wrapping completion/execution exceptions (HC5/AwsSdkAsync do the first). The solution is to either accept a compromise or to disable this kind of "square peg round hole" construction. I think that obfuscating caller traces is a bad compromise so I filed an issue to achieve consensus before attempting to change the behavior. I didn't dump too many hours into this, so if the consensus isn't there, I do have the functioning workaround I mentioned. I don't think it's a good pattern to require from clients of a library, especially if you're free to change behavior in a new major version.
Good to hear. We're upgrading from the RestHighLevelClient (1.3.2) and we won't need a backport. |
Thanks a lot for sharing your thoughts. One of the features that we're working on is opensearch-project/OpenSearch#6750, that would also include client instrumentation for sure. This is long shot for sure but to highlight the future options.
I hear you here, I think we should have put more thoughts into API design (we largely followed the |
Thanks @andrewparmet @reta. I think my questions remain, we should work backwards from the developer experience that we want. Keeping an open mind about the solution! |
@andrewparmet my apologies for the delay, so I have looked into 2 options here:
I believe we should keep the current implementation unchanged and recommend people to use the |
They are already divergent in the sense that one preserves the trace of its caller and the other does not even given equivalent server-side errors. Option 2 is the only way to be fully consistent. If HC5 offers synchronous transport then perhaps it shares a configuration type with the async transport. |
Oh that I missed out, my apologies for that, that is an argument towards making it right, looking into divergence now, thank you.
It offers but does not share anything, those are separate implementations |
Boo, but that's probably the right way to do it from their side. My ultimate position here then is to match the AWS SDK - async clients should take async transports and sync clients should take sync transports. Then, if you want to do what I'm doing and share a connection pool, do the wrapping and unwrapping of async calls yourself. Everybody gets exactly the API they want - all the way down to the way stacktraces look. |
Totally :)
I agree, the part which is difficult to translate seamlessly is @andrewparmet mind if I push a change to your pull request? TLDR; replicating the RestClient approach, essentially exactly the way you are trying to fix it here but with a bit more complicated logic. Thank you. |
Yeah this API would probably need to be renamed to distinguish between sync and async APIs, e.g. ApacheHttpClient5AsyncTransportBuilder. Feel free to hijack this PR - the magic button should be enabled. |
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
@dblock @reta Sorry for the delay here. This PR is now up to date with the latest idea on exception extraction implemented in the AWS transport. It includes two tests - one for the AWS transport and one for the HC5 transport - that lightly verify that stack trace propagation works. Ideally there would be one abstract test that can cover both transports, but I didn't see an easy way to do that given the current setup of the test suite. Let me know if there's another approach you'd prefer. It does look like this may be easier to do once Java 8 support is dropped and AWS integration tests are moved into the Java 11 test source set so code can be shared. Due to the new logic introduced in 03b214f, it does not completely cover all potential cases. In particular I think this strategy is slightly brittle as new exceptions may (will) be introduced that are not correctly captured. For example, this happened with OpenSearchClientException between the last time we touched this PR and now. |
Signed-off-by: Andrew Parmet <andrew@parmet.com>
Signed-off-by: Andrew Parmet <andrew@parmet.com>
2ccb4b9
to
b4f54d6
Compare
java-client/src/main/java/org/opensearch/client/transport/aws/AwsSdk2Transport.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Andrew Parmet <andrew@parmet.com>
ef9cd5a
to
f7814cd
Compare
Signed-off-by: Andrew Parmet <andrew@parmet.com>
java-client/src/main/java/org/opensearch/client/transport/aws/AwsSdk2Transport.java
Outdated
Show resolved
Hide resolved
Thanks a lot @andrewparmet , LGTM, does it solve your problem in an implementation shape like that? Thank you |
It does - we'll be able to trace our calls at the very least to the rethrown variations at the bottom of the extraction functions. |
@dblock do you mind to check it out? thank you! |
Looks great, merging. @andrewparmet Thanks for hanging in here with us. I think the project could use some documentation along the "handling errors" lines that could explain the various scenarios that you've fixed and how one should deal with them. Maybe you want to write something simple to start? |
… to preserve caller stack traces (#656) * wrap exceptions thrown by async transport implementations in IOException Signed-off-by: Andrew Parmet <andrew@parmet.com> * license headers Signed-off-by: Andrew Parmet <andrew@parmet.com> * changelog Signed-off-by: Andrew Parmet <andrew@parmet.com> * tolerate null causes Signed-off-by: Andrew Parmet <andrew@parmet.com> * fix tests Signed-off-by: Andrew Parmet <andrew@parmet.com> * remove print Signed-off-by: Andrew Parmet <andrew@parmet.com> * one more license header Signed-off-by: Andrew Parmet <andrew@parmet.com> * fix last non-aws test Signed-off-by: Andrew Parmet <andrew@parmet.com> * use multicatch to restrict caught exceptions Signed-off-by: Andrew Parmet <andrew@parmet.com> * Replicate the RestClient exception wrapping for async invocation flow Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * replicate hc5 exception extraction strategy in aws transport Signed-off-by: Andrew Parmet <andrew@parmet.com> * move other tests Signed-off-by: Andrew Parmet <andrew@parmet.com> * lint Signed-off-by: Andrew Parmet <andrew@parmet.com> * delete aws test for now; add support for OpenSearchClientException Signed-off-by: Andrew Parmet <andrew@parmet.com> * reintroduce an aws test, sadly, not extending the abstract case Signed-off-by: Andrew Parmet <andrew@parmet.com> * java 8 Signed-off-by: Andrew Parmet <andrew@parmet.com> * replicate ISE Signed-off-by: Andrew Parmet <andrew@parmet.com> * poke Signed-off-by: Andrew Parmet <andrew@parmet.com> * handle some netty-specific channel exceptions Signed-off-by: Andrew Parmet <andrew@parmet.com> * poke Signed-off-by: Andrew Parmet <andrew@parmet.com> * nevermind netty Signed-off-by: Andrew Parmet <andrew@parmet.com> * io before rt Signed-off-by: Andrew Parmet <andrew@parmet.com> * no hc5 Signed-off-by: Andrew Parmet <andrew@parmet.com> --------- Signed-off-by: Andrew Parmet <andrew@parmet.com> Signed-off-by: Andriy Redko <andriy.redko@aiven.io> Co-authored-by: Andriy Redko <andriy.redko@aiven.io> (cherry picked from commit db50b7d) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
… to preserve caller stack traces (#656) (#1006) * wrap exceptions thrown by async transport implementations in IOException * license headers * changelog * tolerate null causes * fix tests * remove print * one more license header * fix last non-aws test * use multicatch to restrict caught exceptions * Replicate the RestClient exception wrapping for async invocation flow * replicate hc5 exception extraction strategy in aws transport * move other tests * lint * delete aws test for now; add support for OpenSearchClientException * reintroduce an aws test, sadly, not extending the abstract case * java 8 * replicate ISE * poke * handle some netty-specific channel exceptions * poke * nevermind netty * io before rt * no hc5 --------- (cherry picked from commit db50b7d) Signed-off-by: Andrew Parmet <andrew@parmet.com> Signed-off-by: Andriy Redko <andriy.redko@aiven.io> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Andriy Redko <andriy.redko@aiven.io>
Description
Ensures that callers of synchronous methods will see their method name in the stracktrace of an exception thrown by an asynchronous transport.
Issues Resolved
Fixes #653.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.