Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce simpler withSmartTags in place of the TaggedLogger #3366

Conversation

afkelsall
Copy link
Contributor

The original implementation of TaggedLogger became heavy, stateful and opinionated. We've since reconsidered this approach and instead simplified it greatly to just a function call withSmartTags doing away with all the stateful builders etc.

For this draft PR I have left the tests as similar as possible by introducing a TestTaggedLogger at the bottom allowing them to remain largely unchanged. I wanted to do this to prove the new approach satisfied those tests.

I'll convert these tests over to withSmartTags before merging this PR.

Copy link
Contributor Author

@afkelsall afkelsall Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not actually a new file (renamed and git thinks it's new). The tests remain mostly the same as they were but added a nested inner class to show the different approaches in action.

Comment on lines +37 to +39
fun wasTriggeredBy(throwable: Throwable): Boolean {
return triggeringThrowable == throwable
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to handle the case where an original exception was caught and wrapped up in a new exception? We could do that by recursively checking against the .cause

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great call out!

// caught and handled by this `withSmartTags`, it should clean up the unused and unneeded context.
SmartTagsThreadLocalHandler.clear()
}
} catch (th: Throwable) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching Throwable always raises an eyebrow for me, as it includes things like OutOfMemoryError and ThreadDeath that need careful treatment. It might be ok here since this block is just doing a little bit of work and then rethrowing anyway, but unless you're sure you need to handle those scenarios, I would suggest catching Exception instead

Copy link
Contributor Author

@afkelsall afkelsall Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this catching Exception rather than throwable if that's better here. To explain how I got this though I was originally looking through the WebAction Exception interceptor code which was catching Throwable and I did the same thing for this.

I'd be interested to understand the reasoning for using Throwable in this situation and whether that same logic holds up for where I'm using it as well.

Arguably with the interceptor code, there's potentially more risk around OOM and ThreadDeath exceptions because it's trying to respond to the client as well as log in these situations.

Comment on lines +186 to +187
// This will be cleared upon logging of the exception within misk or if the thrown exception
// is handled by a higher level `withSmartTags`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if someone uses WithSmartTags in a place that never pops them automatically (say, in a Kafka consumer)? Could tags leak from one exceptional call to another if they share a thread?

Copy link
Contributor Author

@afkelsall afkelsall Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The situation where stale tags are left in ThreadLocal storage is not straight forward to cover all bases but I think it's as robust as it needs to be. The scenarios I've thought of:

  1. The only way to retrieve the tags and read them is destructive through SmartTagsThreadLocalHandler.popThreadLocalSmartTags(). This is what misk calls as well as any other process wanting to integrate with this functionality. This clears the storage. The data class use to read it is private to the hander object.
  2. If a withSmartTags() block exits naturally, the ThreadLocal storage is cleared straight away. This covers the scenario where calls to withSmartTags are nested and the inner one threw an exception which the outer one caught and handled.
  3. If an inner withSmartTags() nesting throws and the outer one catches it, but then later throws something else, the addOrClearTags taking the thrown exception and will work out that it should overwrite the existing ThreadLocal tags with its own because what's in there was no set by the current exception.
  4. If for any reason the tags in ThreadLocal are somehow stale, given they are based on the exception thrown, it will simply overwrite what's there if it needs to using the same logic as above in (3).

The scenario you've highlighted will very much leave stale ThreadLocal tags lying around which isn't ideal but is pretty low risk. The possible scenario I can think of:

  • Kafka consumer uses withSmartTags blocks and throws exception setting ThreadLocal
  • Never pops the tags to log them with the exception (not built - yet, but could be any process that's not integrated)
  • The same thread is then used for a WebAction incoming event (are the thread pools shared between different things like WebActions, Kafka consumers, sqs consumers and tasks?)
  • The incoming event on the WebAction enters the try/catch block with the popThreadLocalSmartTags in the catch but throws an exception BEFORE the first withSmartTags block which would have overwritten or cleared the ThreadLocal when throwing or finishing
  • The exception message is logged with the stale tags

This should be unlikely, especially if misk threadpools aren't shared.

Thinking over this, any try/catch block integrated with this functionality could clear ThreadLocal before entering the called code for example:

  override fun intercept(chain: NetworkChain) {
    SmartTagsThreadLocalHandler.clear() // <<<<<<<<<<<-------------- Clear tags before starting
    try {
      chain.proceed(chain.httpCall)
    } catch (th: Throwable) {
      try {
        val mdcTags = SmartTagsThreadLocalHandler.popThreadLocalSmartTags()
        // logs tags

This would ensure that if there are any tags in ThreadLocal when processing an exception, they should not be stale. Would need to do this any time a process uses this functionality. Misk only has two, one for WebActions and one for the SqsJobConsumer.

* Log MDC context: [processValue: "PV_123", contextToken: "contextTokenValue"] Log message: "unexpected error dispatching to ServiceAction" // This log would not normally include the MDC context
* ```
*/
fun <T> withSmartTags(vararg tags: Tag, f: () -> T): T {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mark this as @ExperimentalMiskApi

*/
fun <T> withSmartTags(vararg tags: Tag, f: () -> T): T {
// Establish MDC, saving prior MDC
val priorMDC = tags.map { (k, v) ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val priorMDC = tags.map { (k, v) ->
val priorMDC = tags.map { (key, value) ->

Generally for better readability, a proper word is preferred for idiomatic Kotlin.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case you could also just leave it as impliciy it and then the callsites are equivalently readable it.key to it.value.toString()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep no problem. This is the code what was originally present prior to my changes which I've moved around but not modified :)
I agree with the changes though and am happy to push them through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type from MDC is a Pair<String, String> however which means this would read as it.first as key and it.second as value. I'm going to explicitly name them key and value here.

* Log MDC context: [processValue: "PV_123", contextToken: "contextTokenValue"] Log message: "unexpected error dispatching to ServiceAction" // This log would not normally include the MDC context
* ```
*/
fun <T> withSmartTags(vararg tags: Tag, f: () -> T): T {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fun <T> withSmartTags(vararg tags: Tag, f: () -> T): T {
fun <T> withSmartTags(vararg tags: Tag, block: () -> T): T {

Preferred for better readability.

svc-squareup-copybara pushed a commit that referenced this pull request Aug 12, 2024
opinionated. We've reconsidered this approach and
simplified it to just a function call `withSmartTags` doing away
with the stateful builders etc.

COPYBARA_INTEGRATE_REVIEW=#3366 from
GitOrigin-RevId: 508c16fd4046947e33c3e282533841978ef2d1a8
tgregory-block pushed a commit that referenced this pull request Aug 20, 2024
opinionated. We've reconsidered this approach and
simplified it to just a function call `withSmartTags` doing away
with the stateful builders etc.

COPYBARA_INTEGRATE_REVIEW=#3366 from
GitOrigin-RevId: 508c16fd4046947e33c3e282533841978ef2d1a8
@adrw
Copy link
Collaborator

adrw commented Sep 25, 2024

Closing since this was merged in 8cffeac

@adrw adrw closed this Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants