Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cats Effect 3 tracing docs #2128

Merged
merged 3 commits into from
Jul 19, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
267 changes: 267 additions & 0 deletions docs/tracing.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,267 @@
---
id: tracing
title: Tracing
---

## Introduction

Tracing is an advanced feature of `IO` that offers insights into the execution
graph of a fiber. This unlocks a lot of power for developers in the realm of
debugging and introspection, not only in local development environments, but
also in critical production settings.

A notable pain point of working with asynchronous code on the JVM is that stack
traces no longer provide valuable context of the execution path that a program
takes. This limitation is even more pronounced with Scala's `Future`
(pre- 2.13), where an asynchronous boundary is inserted after each operation.
As an inherently asynchronous data type, `IO` suffers a similar problem, but
even the stack trace of an `IO` program executed completely synchronously is
poluted with the implementation methods of the `IO` run-loop, trace information
which is unnecessary and confusing for the end-user application.

`IO` solves this problem by collecting a stack trace at various `IO` operations
that a fiber executes during its lifetime, and knitting them together to produce
a more coherent view of the fiber's execution path.

However, fiber tracing isn't limited to just collecting and assembling
meaningful stack traces. In fact, the first release of tracing support for the
Cats Effect 3 release comes with the following features:
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved

1. **Asynchronous stack tracing**. This is essentially what is described above,
where stack frames are collected across asynchronous boundaries for a given
fiber.
2. **Constructor/combinator inference**. Stack traces can be walked to determine
what specific `IO` constructor or combinator was actually called by user code.
For example, `void` and `as` are combinators that are derived from `map`, and
should appear in the fiber trace rather than `map`. The same goes for `IO`
constructors such as `async`, which is a complex composition of many lower level
implementation details, none of which are present in the final stack trace, as
they are not relevant to the user.
3. **JVM and GC safe trace information cache**. The trace information caching
mechanism of Cats Effect 3 has been completely rewritten to take advantage of
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved
the `java.lang.ClassValue` platform APIs, a JVM-native caching solution with
several appealing properties:
- Class-loader safety -- for the first time ever, long-running Cats Effect
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved
applications fully respect dynamic class-loading environments, promptly
releasing references to unloaded classes when instructed by the JVM runtime,
thus helping with GC.
- Lock-free caching data structures natively optimized by the JIT compiler
for maximum performance. During the development of the cached asynchronous
stack tracing feature for Cats Effect 3, it was measured that the switch to
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved
the JVM platform APIs resulted in a 5-10% (application dependent) performance
improvement compared to the previous tracing cache implementation.
4. **Improvements to the full stack tracing performance**. With Cats Effect 3
being a complete code rewrite of the project as a whole, we were able to revisit
and improve on specific details of the previous tracing implementation.
The development cycle of the asynchronous tracing feature took slightly longer
than initially planned, mostly because it was preceded by a significant
experimentation period. During this experimentation period, we were able to
drastically optimize the process of stack tracing information collection, which
now operates much more closely to the JVM potential. This resulted in up to 5x
improvements compared to the previous collection mechanism. As impressive as
these numbers are, the reality is that real-world applications with cached
asynchronous stack tracing enabled (which is the default mode of operation) will
not see meaningful performance improvements from this change, simply because the
collection of traces before caching them is only a tiny chunk of the lifecycle
of a real long-running process. However, we feel that this is still a worthwhile
achievement, as uncached (full) stack tracing is much more performant, which
will result in a noticeably improved developer experience when debugging `IO`
programs.
5. **Augmented exceptions**. Exceptions arising in user code are caught by the
Cats Effect runtime, stripped of unnecessary internal implementation details and
augmented using asynchronous stack tracing information gathered at runtime, such
as information about traced `IO` constructors and combinators, before finally
being presented to the user.

### General note on stack tracing performance

As a note of caution, fiber tracing generally introduces overhead to
applications in the form of higher CPU usage, memory and GC pressure. Always
remember to performance test your specific application with tracing enabled
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved
before deploying it to a production environment!

## Asynchronous stack tracing

### Configuration
The stack tracing mode of an application is configured by the system property
`cats.effect.stackTracingMode`. There are 3 stack tracing modes: `none`,
`cached` and `full`. These values are case-insensitive. The default tracing mode
is `cached`, which uses a global cache to avoid expensive recomputation of stack
tracing information.

To prevent unbounded memory usage, stack tracing information for a given fiber
is accumulated in an internal buffer as execution proceeds. If more traces are
collected than the buffer can retain, the oldest trace information will be
overwritten. The default size of the buffer is 16. This value can be configured
via the system property `cats.effect.traceBufferLogSize`. Keep in mind that the
provided value is a logarithm of the actual size of the buffer (i.e. the actual
size of the tracing information buffer is
2<sup>`cats.effect.traceBufferLogSize`</sup>).

For example, to enable full stack tracing and a trace buffer of size 1024,
specify the following system properties:
```
-Dcats.effect.stackTracingMode=full -Dcats.effect.traceBufferLogSize=10
```

## Stack tracing modes

### `none`
No tracing is instrumented by the program and thus incurs no performance impact.
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved

### `cached`
When cached stack tracing is enabled, a stack trace is captured and cached for
each `map`, `flatMap`, `async`, etc. call in a program.

_Caution_: The stack trace cache is indexed by the lambda class reference, so
cached tracing may produce inaccurate fiber traces in these circumstances:
1. Monad transformer composition
2. When a named function is supplied at multiple call-sites

The performance impact of cached stack tracing was measured to be less than 30%
in synthetic benchmarks for tightly looping `IO` programs consisting of a
combination of both synchronous and asynchronous combinators (in Cats Effect 3,
all `IO` programs are asynchronous in nature). The actual performance impact of
cached tracing will be much lower for any program that performs any sort of I/O.
Again, we strongly recommend benchmarking applications that make use of tracing,
in order to avoid surprises and to get a realistic picture of the specific
impact on your application.

This is the recommended stack tracing mode to run in most production
applications and is enabled by default.
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved

### `full`
As with cached stack tracing, when full stack tracing is enabled, a stack trace
is captured for most `IO` combinators and constructors.

The difference comes in the fact that stack traces are collected on
*every invocation*, so naturally, most programs will experience a significant
performance hit. This mode is mainly useful for debugging in development
environments.

## Enhanced exceptions
The stack trace of an exception caught by the Cats Effect 3 `IO` runtime looks
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved
similar to the following:
```
Exception in thread "main" java.lang.Throwable: Boom!
at Example$.$anonfun$program$5(Main.scala:23)
at cats.effect.IO.$anonfun$ifM$1(IO.scala:409)
at cats.effect.IOFiber.runLoop(IOFiber.scala:383)
at cats.effect.IOFiber.execR(IOFiber.scala:1126)
at cats.effect.IOFiber.run(IOFiber.scala:125)
at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:359)
```

It includes stack frames that are part of the `IO` runtime, which are generally
not of interest to application developers. When asynchronous stack tracing is
enabled, the `IO` runtime is capable of augmenting the stack traces of caught
exceptions to include frames from the asynchronous stack traces. For example,
the augmented version of the stack trace from above looks like the following:
```
Exception in thread "main" java.lang.Throwable: Boom!
at Example$.$anonfun$program$5(Main.scala:25)
at apply @ Example$.$anonfun$program$3(Main.scala:24)
at ifM @ Example$.$anonfun$program$3(Main.scala:25)
at map @ Example$.$anonfun$program$3(Main.scala:24)
at apply @ Example$.$anonfun$program$1(Main.scala:23)
at flatMap @ Example$.$anonfun$program$1(Main.scala:23)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at apply @ Example$.fib(Main.scala:13)
at flatMap @ Example$.fib(Main.scala:13)
at flatMap @ Example$.program(Main.scala:22)
at run$ @ Example$.run(Main.scala:10)
```

Note that the relevant stack frames from the call-site of the user code are
preserved, but all `IO` runtime related stack frames are replaced with async
stack frame traces.

The example above shows that a lot of information can be retained even for
deeply nested recursive `IO` programs like the one on this page, even with a
fairly small buffer. The example was generated using the following stack tracing
configuration:
```
-Dcats.effect.stackTracingMode=full -Dcats.effect.traceBufferLogSize=6
```

The enhanced exceptions feature is controlled by the system property
`cats.effect.enhancedExceptions`. It is enabled by default.

It can be disabled with the following configuration:
```
-Dcats.effect.enhancedExceptions=false
```

### Complete code
Here is the code snippet that was used to generate the above examples:
```scala
vasilmkd marked this conversation as resolved.
Show resolved Hide resolved
// Pass the following system property to your JVM:
// -Dcats.effect.stackTracingMode=full
// -Dcats.effect.traceBufferLogSize=6

import cats.effect.{ExitCode, IO, IOApp}
import cats.syntax.all._

import scala.util.Random

object Example extends IOApp.Simple {

def fib(n: Int, a: Long = 0, b: Long = 1): IO[Long] =
IO(a + b).flatMap { b2 =>
if (n > 0)
fib(n - 1, b, b2)
else
IO.pure(b2)
}

def program: IO[Unit] =
for {
x <- fib(20)
_ <- IO(println(s"The 20th fibonacci number is $x"))
_ <- IO(Random.nextBoolean())
.ifM(IO.raiseError(new Throwable("Boom!")), IO.unit)
} yield ()

override def run: IO[Unit] =
program
}
```