Releases: typelevel/cats-effect

v3.0.0-RC1 (pre-release)

14 Feb 00:02

Cats Effect 3.0.0-RC1 is the first release candidate for 3.0.0. It is expected that one or two more release candidates will be required prior to the final release of 3.0.0, but none are specifically planned at this time. If everything goes perfectly and the ecosystem does not uncover problems, the next release will be 3.0.0 itself.

To that end, we will be attempting to maintain backward compatibility within the 3.0.0-RC lineage leading up to 3.0.0 final. This is not a guarantee and we are not enabling MiMa checks, but the intention is that any further release candidates will be binary compatible with RC1 so as to ease the burden on the ecosystem. No further changes are expected to the API, only additions.

What follows are the changes from M5 to RC1. For a more complete summary of the changes generally included in Cats Effect 3.0.0, please see the M1, M2, M3, M4, and M5 release notes.

Major Changes

Async Instance for Resource

One of the original motivating use-cases behind some of the early thinking around Cats Effect 3 was the following: assume you have two Resources and race them such that the winner is produced and remains open while the loser is cleanly finalized. As it turns out, this semantic is not possible on Cats Effect 2 (without manual bespoke implementation work using allocated), and this fact is honestly quite limiting.

While it is not entirely difficult to implement Resource#race as a special-cased thing, the truly general solution to this limitation, and all similar limitations, is to define Async[Resource[F, *]] given Async[F], which in turn also means defining mechanisms for concurrency in terms of Resource itself. This is a surprisingly tricky problem, since defining the meaning of Fiber in the presence of dynamically-scoped finalizers is not trivial.

The original proposals for Cats Effect 3 included a typeclass, Region, which was intended to address this issue. In particular, Region defined what it meant to form a dynamic monadic region, in much the same way as Bracket defined what it meant to form a static monadic region. Ultimately, it was decided to remove Region for several reasons, the first being that it introduced a considerable amount of complexity into the hierarchy and stood in the way of a number of usability goals.

But even more than usability (which is important!), it was convincingly argued that Region was in fact redundant: every Region can be encoded in terms of an underlying Bracket, and Bracket in turn can be encoded in terms of a Region which is opened and then immediately closed. Thus, Region offered no theoretical expressive power beyond Bracket, which in turn could then be made non-primitive (due to the removal of Region), paving the way for the current uncancelable/onCase/handleErrorWith encoding which exists in the system today.

This argument, though, implied something very strong: that it was possible to implement the entire typeclass hierarchy for any dynamic monadic region. Now that Async has been implemented for Resource, we have much stronger evidence that this is the case.
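
To make the motivating use case concrete, here is a rough sketch (not taken from the library; connect is a hypothetical resource constructor) of racing two Resources via the new instance. The Resource[IO, *] notation assumes the kind-projector plugin, as used elsewhere in these notes.

import cats.effect.{Async, IO, Resource}
import cats.syntax.all._

// A hypothetical resource; in practice this might be a network connection, file handle, etc.
def connect(name: String): Resource[IO, String] =
  Resource.make(IO(println(s"open $name")).as(name))(_ => IO(println(s"close $name")))

// Race the two resources: the loser's finalizer runs as soon as the race is decided,
// while the winner remains open for the remainder of the outer scope.
val winner: Resource[IO, Either[String, String]] =
  Async[Resource[IO, *]].race(connect("primary"), connect("secondary"))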

The exact semantics of Async[Resource] are subtle in a few ways that are worth calling out:

  • Any scopes that are opened with an async block are maintained after the continuation of the async. This may be a little unintuitive, but it's the safest default given that resources allocated during the registration of an asynchronous action may be critical for the validity of the results once produced.
  • When a resource is started, its finalizers become part of the outer scope. This is to ensure that finalization is run even when the fiber is not joined. There are arguably at least three reasonable semantics which could have been chosen in this area, and in the end, the tie-breaker was that original motivating use case: racing two Resources and closing the loser.
  • onCancel is very different from Resource.make in that it inserts a finalization boundary which is not extended by flatMap. However, despite this it can still be rendered within Resource through the use of applyFull and allocated, which allow for the encoding of arbitrary scope boundary semantics. (credit to @RaasAhsan for this insight)

As a final note, it is worth calling out the fact that the memoize function, which is available on any Concurrent, does not behave on Resource in the way you might intuitively expect. In particular, if you use the inner Resource which arises from the memoization, the finalizers will be run and the memoized value (which can be re-accessed) may then be invalidated. This is an unavoidable consequence of the lack of linearity in Scala's type system. Ultimately, it is use itself which is unsafe; in its absence, memoize actually behaves as you would expect. use is the operation which cannot be appropriately alias-constrained due to these limitations in Scala, and thus it is to be expected that there are some outsized consequences which cannot be definitively prevented.

Configurable Root Cancelation Scope in MonadCancel

One of the interesting things about cancelation in Cats Effect is that it is a hint. This is somewhat fundamental to the model, which attempts to preserve both safety and preemption, but it is nevertheless interesting and it has some major impacts on the nature of the APIs we can implement. One such impact is the fact that MonadCancel[F].canceled produces an F[Unit], not an F[Nothing], which might be more intuitive. The reason for this is the fact that even self-cancelation is not guaranteed to be respected, since we might be nested within an uncancelable block. Similarly, Fiber#cancel is not guaranteed to be immediately respected, only eventually respected if the fiber is not permanently blocked within an uncancelable region.

One perhaps-surprising implication of this is that it is technically possible to validly implement MonadCancel without implementing any cancelation support! This is because it is always safe to ignore the cancelation hint and pretend to be permanently uncancelable. There is still some expressiveness implied by MonadCancel which is more powerful than just MonadError (namely, forceR), but this is relatively narrow in scope.

In this release, we made the decision to allow this "optionality" of MonadCancel to be more directly reflected in the runtime, in the form of the MonadCancel[F].rootCancelScope value. This had a number of helpful consequences, the most notable of which is that Sync is able to extend MonadCancel without forcing implementations such as SyncIO, which has no fibers, to implement cancelation. This in turn is very helpful because it allows users and frameworks to write code which is safe in the presence of cancelation, but which does not assume cancelability.

Splitting these semantics apart generalizes MonadCancel just enough that these kinds of cases are possible, and we avoid ugly pathologies like F[_]: MonadCancel: Sync, which was not uncommon in many corners of the ecosystem prior to this change.
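
As a rough illustration (a sketch, not library code), downstream code can now branch on whether cancelation is even possible for a given effect before committing to cancelation-aware machinery:

import cats.effect.kernel.{CancelScope, MonadCancel}

// IO reports a cancelable root scope, while SyncIO reports an uncancelable one.
def describeCancelation[F[_], E](implicit F: MonadCancel[F, E]): String =
  F.rootCancelScope match {
    case CancelScope.Cancelable   => "fibers of F may observe cancelation"
    case CancelScope.Uncancelable => "F never observes cancelation"
  }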

Added the Unique Typeclass

At first glance, Unique appears to be a relatively ad-hoc class to add into the hierarchy. It defines a single method, unique, which produces an F[Unique.Token], where Hash[Unique.Token] is defined. It has only one law, which defines that all sequencings of unique must produce differentiably unique tokens. In other words, arbitrary unique identity.

Again, this seems like a relatively arbitrary thing to add into the hierarchy until you dig into it a little more deeply. For starters, uniquely differentiable values imply some relatively strong things about the effect type, F. It must not memoize, and if the values are differentiable using the JVM's own inherent notion of identity (which is the fastest method for achieving this property in general), it must have some notion of lazy evaluation. These are relatively stringent properties, and they are in fact much stronger than even Defer.

What is also interesting is the fact that Spawn already implies that F must have these properties, in particular because Fiber instances are unique by definition: starting an effect multiple times results in independent Fibers, all of which have independent cancel and join methods, and so on. Spawn certainly implies a number of things that are stronger than just the properties necessary to represent uniqueness, but it's at least something in that direction.

This, combined with the fact that uniqueness is a property of the JVM and JavaScript runtimes, is relatively strong evidence that this is something that, like Clock, should be reflected in the abstractions. Unique fills that niche, and in so doing, enables the kinds of patterns that were already necessary within libraries like Vault and Fs2, now natively within the Cats Effect runtime calculus.
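
For illustration, a minimal sketch of the law in action (assuming the usual cats syntax imports):

import cats.effect.IO
import cats.effect.kernel.Unique
import cats.syntax.all._

// Sequencing unique twice must produce tokens which are distinguishable
// under the Hash (and thus Eq) instance defined for Unique.Token.
val tokensAreDistinct: IO[Boolean] =
  (Unique[IO].unique, Unique[IO].unique).mapN(_ =!= _)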

User-Facing Pull Requests

Read more

v3.0.0-M5 (pre-release)

22 Dec 18:11

Cats Effect 3.0.0-M5 is the fifth milestone release in the 3.0.0 line. We are expecting to follow this milestone with subsequent ones as necessary to further refine functionality and features, prior to some number of release candidates leading to a final release. The designation "M5" is meant to indicate our relative confidence in the functionality and API of the library, but it by no means indicates completeness or final compatibility. Neither binary compatibility nor source compatibility with anything are assured prior to 3.0.0, though major breakage is unlikely at this point.

What follows are the changes from M4 to M5. For a more complete summary of the changes generally included in Cats Effect 3.0.0, please see the M1, M2, M3, and M4 release notes.

Major Changes

Supervisor

In practice, it is very common to need more control over fiber lifecycles than the basic combination of start, join, and cancel. While these are nice primitives, it is very definitely worth building higher level functionality on top which can allow for greater power and expressiveness. background is a very simple example of such a combinator, but we're beginning to explore the space of even richer abstractions.

Supervisor is one such tool. The target scenario here is when it is necessary to spawn a fiber within some scope which is managed and controlled separately. This is useful for two things. First, Supervisor guarantees that when it is canceled, all supervised fibers will also be canceled. Second, any fibers started by the Supervisor will be located within the origin scope of the Supervisor itself, which gives them access to the then-current ExecutionContext as well as any other lexically scoped elements of the effect (such as if Kleisli is in use). One example of where this is useful is in a server receiving incoming client connections and needing to spawn fibers to manage those connections.
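
A minimal sketch of the pattern (the specific effects are illustrative only) looks like the following:

import scala.concurrent.duration._

import cats.effect.IO
import cats.effect.std.Supervisor

// The supervised fiber is tied to the Supervisor's scope rather than to the fiber
// which started it, and it is canceled when that scope closes.
val program: IO[Unit] =
  Supervisor[IO].use { supervisor =>
    for {
      _ <- supervisor.supervise(IO.sleep(10.minutes) >> IO(println("never printed")))
      _ <- IO(println("main work runs while the background fiber is supervised"))
    } yield ()
  } // leaving use cancels any still-running supervised fibers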

Resource#makeFull

The make constructor is responsible for creating a Resource given an effect which allocates a resource of type A, along with a function which disposes of that resource when the scope is exited. This is analogous to the bracket function. With CE3, bracket has been generalized considerably and is now a derived combinator implemented in terms of uncancelable and onCancel. This has given rise to richer combinators such as bracketFull, which passes a Poll to its acquisition effect.

The addition of the Poll to resource acquisition allows the composition of richer resource semantics which internally require semantic blocking. The classic example of this is Semaphore acquisition, which is logically a resource that must be released when finished, but where the acquisition of that resource may require blocking (if no permits are available). Safely interruptible resource acquisition is exactly the problem that Poll was designed to solve. With the addition of the makeFull constructor on Resource, we now have the ability to define such interruptible acquisition resources as regions rather than simply brackets. For example, acquiring a Semaphore permit as a Resource is now an operation which can be defined in user-space:

import cats.effect.{IO, Resource}
import cats.effect.std.Semaphore

Resource.eval(Semaphore[IO](1)) flatMap { s =>
  // ...
  // the Poll makes the semantically-blocking acquire interruptible
  Resource.makeFull[IO, Unit](poll => poll(s.acquire))(_ => s.release)  // => Resource[IO, Unit]
}

When the inner Resource in the above is closed, the Semaphore permit will be released. If the permit is unavailable when the acquire is run, the fiber will semantically block, but that blocking will be interruptible and will not prevent cancellation.

The one downside to this change is the requisite addition of a tighter set of constraints on mapK. Whereas previously this function was unconstrained on both F and G (the input and output effects), the implementation must now internally use uncancelable from the MonadCancel typeclass, meaning that it is now constrained on MonadCancel[F, _] and MonadCancel[G, _].

Eliminated Effect Covariance on Resource

One subtler change introduced in this milestone is a revision to the Resource type signature. It is now defined as the following:

sealed abstract class Resource[F[_], +A]

Previously, the F[_] was defined as +F[_], which made it somewhat easier to interoperate with effect-polymorphic interfaces that leverage subtyping. Unfortunately, this encoding appears to be generally impossible to implement without unsound operations and casting, which is quite a bad sign. It has also historically surfaced serious bugs in Scala 2, even leading to unsound downstream encodings.

For these and several other reasons, the decision was made to eliminate the variance on the effect itself. The A remains (rightly) covariant and does not cause any of the same issues, due to the transitive Functor[F] constraint on interpretation.

Ongoing Scheduler Performance Improvements

Not satisfied with a roughly 2x - 5x performance improvement on fiber scheduling overhead under contention, Vasil has been working hard on improving the IO fiber scheduling implementation even further. This is an active area of research, where we are currently experimenting not just with performance enhancements but also with algorithms which can detect bugs related to blocking on the compute pool and appropriately surface those problems to users, making it easier to squash hard-to-detect issues before they hit production.

This is all ongoing work, but in the meantime the performance has continued to inch forward. At the present time, in our benchmarks, the IO work-stealing scheduler is almost exactly 11x faster under contention than a naive scheduler written against a fixed thread pool (i.e. the CE2 scheduler). Needless to say, we're looking forward to seeing this in production applications!

Pull Requests

You're all amazing, thank you!

v2.3.1

18 Dec 18:04

This is the ninth major release in the Cats Effect 2.x lineage. It is fully binary compatible with all 2.x.y releases.

The Scala 3 release train continues onward! The primary new feature in this release is simply a cross-publication for Scala 3.0.0-M3. Support for 3.0.0-M1 has been dropped.

User-Facing Pull Requests

Special thanks to each and every one of you!

v3.0.0-M4 (pre-release)

27 Nov 23:06

Cats Effect 3.0.0-M4 is the fourth milestone release in the 3.0.0 line. We are expecting to follow this milestone with subsequent ones as necessary to further refine functionality and features, prior to some number of release candidates leading to a final release. The designation "M4" is meant to indicate our relative confidence in the functionality and API of the library, but it by no means indicates completeness or final compatibility. Neither binary compatibility nor source compatibility with anything are assured prior to 3.0.0, though major breakage is unlikely at this point.

What follows are the changes from M3 to M4. For a more complete summary of the changes generally included in Cats Effect 3.0.0, please see the M1, M2, and M3 release notes.

Major Changes

Support for ARM

With the release of Apple's M1 desktop processor (based on the ARM architecture), as well as the continued push towards the Amazon Graviton architecture within AWS, ARM support has become very much a non-optional feature of any major runtime. Unfortunately, despite running on the JVM, Cats Effect is sufficiently low-level that it doesn't just get this "for free".

In particular, Cats Effect takes advantage of a number of memory-related tricks to drastically improve performance within the implementation of IO. Unfortunately, x86_64 and ARM64 have significantly different memory models, with x86 providing much stricter guarantees than ARM. As it turned out, Cats Effect 3 was accidentally exploiting these stricter guarantees, meaning that programs written using IO which were run on the ARM platform would sometimes nondeterministically deadlock!

As you can imagine, this was a particularly insane bug to track down. Originally identified by @vasilmkd, it resulted in very long nights agonizing over various EC2 instances, as well as a lot of backchannel discussions with experts within the industry to try to narrow down exactly what was going on. There's a long and interesting story here which will eventually become a conference talk and maybe a series of blog posts.

Long story short… Very, very special thanks to @RaasAhsan, who spent a long night devoting his full attention to minimizing the issue from "the entire IO implementation" all the way down to just ~80 lines of Java and two threads. For a nondeterministic memory-related bug which is also CPU architecture-specific, this has to stand as one of the most impressive bug minimization efforts I've seen in my entire career.

Once the bug was minimized, @simonis and @mo-beck very graciously and thoroughly explained exactly what was happening under the surface, complete with snippets of assembly and discussions of various semantic guarantees. In the end, the fix was a single line change swapping compareAndSwap for getAndSet, a brilliant conception for which the credit is entirely owed to @viktorklang.

At the end of the day, the code is now faster (on both x86 and ARM!) and deterministic on all compliant ARM JVMs.

(as an aside, GitHub Actions really needs to hurry up and add support for ephemeral self-hosted runners so that we can add ARM jobs to our CI matrix)

Clock No Longer Extends Applicative

This was a rather annoying foot-gun in the API which reared its ugly head in code like this:

def foo[F[_]: Monad: Clock](f: F[Int]) =
  f.map(_ + 2)    // error: map is ambiguous between Monad and Clock

The above would not compile due to ambiguities between Monad (which provides map) and Clock (which also provided map). This is a similar situation to Monad and Traverse within the Cats library, though in this case, it is possible to provide a resolution.

Clock no longer extends Applicative, meaning that it no longer implicitly materializes an Applicative instance whenever it is in scope. However, it still contains an Applicative[F] instance within its definition, meaning that it requires Applicative without materializing it. This is the same trick used by Cats MTL, originally explored by @aloiscochard as part of the Scato project.

In practice, this should have relatively little impact on user code aside from removing a source of ambiguity and frustration.
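
For example, the following (a sketch of the new encoding) now compiles without ambiguity, since Monad supplies map and flatMap while Clock supplies only the time-related operations:

import cats.Monad
import cats.effect.kernel.Clock
import cats.syntax.all._

import scala.concurrent.duration.FiniteDuration

// Measure how long an effect takes using only Monad and Clock constraints.
def timed[F[_]: Monad: Clock, A](fa: F[A]): F[(FiniteDuration, A)] =
  for {
    start <- Clock[F].monotonic
    a     <- fa
    end   <- Clock[F].monotonic
  } yield (end - start, a)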

Standard Library Intensifies...

This release saw a significant acceleration of standard library features. In particular:

  • CountDownLatch
  • Deque
  • CyclicBarrier
  • parTraverseN/parSequenceN

Much of this work was done by the tireless @TimWSpence, who is also responsible for the comprehensive support for inductive monad transformer instances up and down the hierarchy! We are continuing to improve and enhance the std module leading up to the 3.0 release, and we expect to continue adding enhancements even after 3.0 is finalized.
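
As a small taste of the new structures, here is a sketch using CountDownLatch (the names and effects are illustrative only):

import cats.effect.IO
import cats.effect.std.CountDownLatch
import cats.syntax.all._

// Three workers each count the latch down; the main fiber proceeds only once
// all of them have done so.
val program: IO[Unit] =
  for {
    latch <- CountDownLatch[IO](3)
    _     <- List("a", "b", "c").parTraverse(n => IO(println(s"worker $n done")) >> latch.release)
    _     <- latch.await
    _     <- IO(println("all workers finished"))
  } yield ()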

Pull Requests

You're all amazing, thank you!

v2.3.0

27 Nov 21:55

This is the eighth major release in the Cats Effect 2.x lineage. It is fully binary compatible with all 2.x.y releases.

Unlike Cats Effect 2.2.0, which was a massive release with a ton of changes and improvements, Cats Effect 2.3.0 is a relatively modest release with only a few changes. As we continue to move towards Cats Effect 3.0, we are shifting efforts on Cats Effect 2 towards long term stability and maintenance, with more of the innovation pushed into the newer release lineage.

With that said, this is the first major Cats Effect 2 release which supports Scala 3! More specifically, this release has been published for both Scala 3.0.0-M1 and M2. We will continue to track the latest two Scala 3 milestones and release candidates until such time as Scala 3.0.0 final is released.

Notable New Features

Scala 3!

The most exciting new feature in this release is first-class support for Scala 3! It has long been possible to use Cats Effect 2 from projects compiled against Scala 3 (and Dotty before it) using the withDottyCompat utility. However, this would very quickly run against tricky corner cases, particularly with things like Resource. Thanks to the hard work of @vasilmkd, @bplommer, and @mpilquist, Cats Effect 2.3.0 now fully supports Scala 3 for all use-cases, including ScalaJS.

Furthermore, it is expected that source compatibility on downstream code has been preserved between Scala 2.13 and Scala 3.0, meaning that migrating to Scala 3 should be as easy as swapping the scalaVersion in your build definition. Cats Effect 3 has been building on Dotty for about six months now, and we're very excited to bring this functionality to the Cats Effect 2 lineage.

Print Non-Daemon Threads on Exit

This probably seems like a relatively trivial feature, but it actually solves a really annoying problem that was otherwise difficult to track down. The issue arises when exiting an IOApp which, at some point, spawned Threads in user code that were never marked with setDaemon(true). If these spawned threads are still running when IOApp exits, the result is that the exit hangs while waiting for the threads to terminate. Particularly if you don't know what you're looking at, this can be quite an annoying problem to track down.

In an attempt to help out with this (unfortunately relatively common) scenario, IOApp will now detect active non-daemon threads when it reaches its exit point and print them to stderr, giving you a chance to track down the offenders and correct the situation, rather than just silently hanging.

User-Facing Pull Requests

Special thanks to each and every one of you! Also extra-special thanks to @bplommer who did a ton of work on Resource, both in Cats Effect 2 and on the Cats Effect 3 development branch, to improve the safety and maintainability of the internal representation.

v3.0.0-M3 (pre-release)

08 Nov 22:52

Cats Effect 3.0.0-M3 is the third milestone release in the 3.0.0 line. We are expecting to follow this milestone with subsequent ones as necessary to further refine functionality and features, prior to some number of release candidates leading to a final release. The designation "M3" is meant to indicate our relative confidence in the functionality and API of the library, but it by no means indicates completeness or final compatibility. Neither binary compatibility nor source compatibility with anything are assured prior to 3.0.0, though major breakage is unlikely at this point.

What follows are the changes from M2 to M3. For a more complete summary of the changes generally included in Cats Effect 3.0.0, please see the M1 and M2 release notes.

Major Changes

Error Trapping

One of the "skeletons in the closet" for nearly all thread pools is fatal error trapping. There are a whole series of exceptions which can be raised by the JVM which represent varying levels of serious problems (most famously, OutOfMemoryError). These exceptions are correctly ignored by IO and similar runtime systems, but unfortunately that doesn't really solve the problem for the user.

Consider a scenario wherein an IO-using application runs out of memory. When this happens, the error will be raised by the VM, which in turn will result in the carrier thread being killed and the active fiber deadlocking. Unfortunately, without further intervention, that's all that will happen. Other fibers can continue executing, and the VM may even try to recover from the error (which is actually unsound).

Similar things happen with link errors, stack overflows, and even thread interruption (which frequently occurs when using sbt run). In all of these cases, the carrier thread is gone, at least one fiber is deadlocked, and the rest of the application has no idea. If you're lucky the fatal exception will have its stack trace printed, but it's easy to miss this in the logs, meaning that zombie processes are actually quite a common consequence of this situation.

Unfortunately, this is an incredibly difficult problem to solve because fatal errors, by definition, cannot be reliably handled. The VM is often in an undefined state and even basic things like allocation or virtual dispatch may not behave normally. Thus, the runtime needs to achieve a full shutdown and propagation of the error across threads, but without actually doing anything fancy.

The solution we've chosen here is to take the following steps any time a fatal error is trapped:

  1. Print the stack trace
  2. Trigger a full shutdown of the runtime thread pool
  3. Asynchronously propagate the error back to the original unsafeRun... call site

In most cases, this means that the error will be relocated back to the original unsafeRun call site, which in turn will mean that errors will shut down the containing application and avoid the zombie scenario. In the worst case, the entire application will be halted and the error will at least be printed.

This entirely resolves the "zombie process" problem, even in the face of an OutOfMemoryError.

Respecialized racePair

racePair can be implemented entirely in terms of Deferred and Fiber, meaning that it does not need to be a primitive in the IO algebra. This is actually relatively significant since, as it turns out, the implementation in terms of Deferred and Fiber is faster than the "primitive" as it had been implemented in IO itself. This in turn increases the performance of Parallel and related operations.
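
As a sketch of the idea (not the actual implementation in IO), racePair can be expressed with nothing more than start, Deferred, and guaranteeCase:

import cats.effect.IO
import cats.effect.kernel.{Deferred, Fiber, Outcome}
import cats.syntax.all._

type RacePairResult[A, B] =
  Either[(Outcome[IO, Throwable, A], Fiber[IO, Throwable, B]),
         (Fiber[IO, Throwable, A], Outcome[IO, Throwable, B])]

// Whichever fiber completes first wins; the other is handed back to the caller.
def racePairSketch[A, B](fa: IO[A], fb: IO[B]): IO[RacePairResult[A, B]] =
  IO.uncancelable { poll =>
    for {
      done   <- Deferred[IO, Either[Outcome[IO, Throwable, A], Outcome[IO, Throwable, B]]]
      fiberA <- fa.guaranteeCase(oc => done.complete(Left(oc)).void).start
      fiberB <- fb.guaranteeCase(oc => done.complete(Right(oc)).void).start
      winner <- poll(done.get).onCancel(fiberA.cancel >> fiberB.cancel)
    } yield winner.bimap((_, fiberB), (fiberA, _))
  }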

Pull Requests

You're all amazing, thank you!

v2.3.0-M1 (pre-release)

07 Nov 23:03

This is the first milestone release in the 2.3.x lineage, and the very first version of Cats Effect 2 to be published for Scala 3! Specifically, this release contains cross-builds for Dotty 0.27.0-RC1 and Scala 3.0.0-M1. We will continue tracking the latest two Scala 3 milestones until final upstream release. This is in addition to similar efforts on the Cats Effect 3 front. Exciting times!

Additional changes in this release:

Also a special shout-out to @vasilmkd for his tireless work on infrastructural issues during this release timeline!

Thank you everyone for all your hard work!

v3.0.0-M2 (pre-release)

23 Oct 03:50

Cats Effect 3.0.0-M2 is the second milestone release in the 3.0.0 line. We are expecting to follow this milestone with subsequent ones as necessary to further refine functionality and features, prior to some number of release candidates leading to a final release. The designation "M2" is meant to indicate our relative confidence in the functionality and API of the library, but it by no means indicates completeness or final compatibility. Neither binary compatibility nor source compatibility with anything are assured prior to 3.0.0, though major breakage is unlikely at this point.

What follows are the changes from M1 to M2. For a more complete summary of the changes generally included in Cats Effect 3.0.0, please see the M1 release notes.

Major Changes

The most significant changes in this release are undoubtedly the introduction of auto-yielding semantics in fiber evaluation and the final removal of UnsafeRun and related introduction of Dispatcher.

Auto-Yielding

This feature is probably best demonstrated with an example test case:

"reliably cancel infinite IO.unit(s)" in real {
  IO.unit.foreverM.start.flatMap(f => IO.sleep(50.millis) >> f.cancel).as(ok)
}

This test constructs a fiber which loops forever doing absolutely nothing. Specifically, IO.unit.foreverM. This is the functional equivalent of while (true) {}. It then starts this fiber, sleeps for 50 milliseconds, then cancels it.

Unfortunately, this test will hang forever on any single-threaded runtime! The reason for this is the fact that one fiber is hogging the only available thread in such a runtime, and so the sleep is never able to wake up and cancel it. Exemplifying this is the fact that this exact unit test had to be disabled on JavaScript until this change was introduced.

When one fiber is able to hog a thread, preventing other fibers from getting their turn, it is known as a violation of fairness. Most applications written in the Cats Effect ecosystem have experienced this problem in a minor way at some point or another, though it isn't commonly observed in production due to the fact that most Cats Effect usage involves a lot of incidental yielding. A yield, codified by the IO.cede operation, is when the current fiber gives up its carrier thread so that other fibers can take their turn before the current fiber resumes again. However, when fibers, intentionally or otherwise, fail to consistently yield their thread back to the scheduler, they can starve other fibers of resources. The service-level consequence of this problem (when it manifests) is a significantly increased latency jitter metric: responses are still produced very quickly once they get started, but some responses take a significant length of time before they even begin executing.

Auto-yielding resolves this issue entirely. This semantic has long been resisted within Cats Effect's IO due to the fact that auto-yielding fundamentally sacrifices throughput (intuitively, straight-line performance of any single fiber) for fairness (reliable latency). This is often a desirable tradeoff to make, but seldom one which can be automatically determined by the runtime system. A poorly-timed automatic yield can result in significant performance costs for an application which are hard to detect and almost impossible to resolve.

However, the introduction of the work-stealing scheduler in Cats Effect 3 opens up the possibility of automatic yields without any performance penalty! Simply put, when a fiber yields back to the scheduler in Cats Effect 3, it only incurs a cost if some other fiber was already waiting on that specific thread. In other words, in all cases where the automatic yield was unnecessary, the performance penalty of the automatic yield is non-existent: no memory barriers are crossed, and no threads are switched, thus all cache lines are preserved and the fiber continues as if nothing happened. If, however, the automatic yield does find another fiber awaiting access to the thread, then the yield is doing exactly what it was meant to do: the other fiber genuinely needed to take a turn. Thus, we have all of the benefits of automatic yielding with none of the drawbacks.
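
For reference, here is a sketch of the manual cooperative-yielding pattern that auto-yielding now makes unnecessary in most cases:

import cats.effect.IO
import cats.syntax.all._

// Explicitly cede every 1024 iterations of a CPU-bound loop so that other
// fibers on the same thread get a chance to run.
def countTo(n: Long): IO[Long] = {
  def go(i: Long): IO[Long] =
    if (i >= n) IO.pure(i)
    else IO.cede.whenA(i % 1024 == 0) >> IO.defer(go(i + 1))
  go(0L)
}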

Dispatcher

Cats Effect 2 included a pair of typeclasses, ConcurrentEffect and Effect, which were designed to abstract over the notion of running an effect. This was necessary due to the existence of frameworks like Netty, which requires API users to submit tasks which the framework (Netty) itself will run as side-effects at some later point. Other common frameworks in this mould include Play, Akka, Java NIO, and many more.

This scenario is remarkably common, even in higher-level frameworks such as Http4s. Unfortunately, Cats Effect 3 made this considerably more complicated by improving the ergonomics around the async operation. Since the async operation now automatically shifts back to the compute thread pool, that thread pool must be passed to the IO running functions (e.g. unsafeRunSync()) in the form of a runtime parameter. In the case of Cats Effect 3, this is represented by the IORuntime type (and automatically managed for you if you use IOApp), but Monix and ZIO both have their own variation of this idea. Unfortunately (or perhaps, fortunately), this also means that typeclasses like ConcurrentEffect are no longer possible.

In 3.0.0-M1, Cats Effect relied upon the UnsafeRun typeclass to "abstract" over this use-case. This was deeply unpleasant and had a lot of unfortunate consequences on API design, but there didn't seem to be a better approach.

Dispatcher is a better approach. For any situation where you're using Cats Effect code to manage some sort of impure framework which itself needs to run other Cats Effect actions as side-effects, Dispatcher provides a high-performance and fully generic solution which only requires an Async constraint to operate. This allows frameworks such as Fs2, Http4s, streamz, and many others to loosen their constraints dramatically, which opens up the door to significantly more powerful abstraction and higher levels of control for users.

Critically, Dispatcher does not provide a mechanism for running an effect "at the end of the world." For such situations, you still need to use IOApp (or whatever similar functionality is provided for your effect type of choice). What Dispatcher does do is ensure that only one "end of the world" is ever required, which greatly improves the ergonomics, flexibility, and performance of the framework and ecosystem.
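
A sketch of the intended usage pattern (registerCallback stands in for a hypothetical impure framework API and is not part of Cats Effect):

import cats.effect.IO
import cats.effect.std.Dispatcher

// Hypothetical framework API: it stores the handler and later invokes it on
// the framework's own threads, outside of any IO context.
def registerCallback(handler: String => Unit): Unit = ()

val program: IO[Nothing] =
  Dispatcher[IO].use { dispatcher =>
    IO(registerCallback { event =>
      // called on a framework thread: submit the effect and return immediately
      dispatcher.unsafeRunAndForget(IO(println(s"handled $event")))
    }) >> IO.never // keep the Dispatcher's scope open while the framework is running
  }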

Pull Requests

  • #1278 Stack safety for Resource (@bplommer)
  • #1283 ApplicativeThrow type alias (similar to MonadThrow) (@fthomas)
  • #1285, #1296 Removed constraints from most Resource functions (@bplommer)
  • #1157, #1295 Special-case IO.race and IO.both for performance (@TimWSpence)
  • #1282 Tests for Console.readLine including all sorts of wonky characters (@vasilmkd)
  • #1281 Simplified Semaphore API by removing withPermit (@bplommer)
  • #1307 Added Resource#onFinalize (@bplommer)
  • #1308 Added IO.ref and IO.deferred functions (@bplommer)
  • #1303, #1333 Added Dispatcher, a new construct for interacting with code which must run F actions as a side-effect, including common frameworks like Play or Netty (@djspiewak, @RaasAhsan)
  • #1303 Removed UnsafeRun, the last vestiges of ConcurrentEffect, since it is no longer necessary to "abstract over the end of the world" (@djspiewak)
  • #1326 Ported HotSwap from fs2 (@RaasAhsan)
  • #1294 Added a high performance asynchronous blocking queue (@vasilmkd)
  • #1346 Fixed Outcome smart constructors (@heka1024)
  • #1316 Relaxed Semaphore typeclass constraints by eagerly raising errors on invalid parameters (@bplommer)
  • #1340 Implemented fiber auto-yielding to ensure fair-by-default fiber evaluation semantics. Note that this should have almost no performance penalty due to the presence of the work-stealing scheduler (@TimWSpence)
  • So… much… scaladoc (@RaasAhsan)

You're all amazing, thank you!

v3.0.0-M1 (pre-release)

07 Oct 03:35

Cats Effect 3.0.0-M1 is the very first milestone release in the 3.0.0 line. We are expecting to follow this milestone with subsequent ones as necessary to further refine functionality and features, prior to some number of release candidates leading to a final release. The designation "M1" is meant to indicate our relative confidence in the functionality and API of the library, but it by no means indicates completeness or final compatibility. Neither binary compatibility nor source compatibility with anything are assured prior to 3.0.0, though major breakage is unlikely at this point.

Acknowledgements

Normally, these go at the end. In this case, they deserve to be up front and center. So many people jumped in to contribute to this project, particularly over the past several months. It's impossible to properly thank everyone, and inevitably some names get left off of any list, but it's important to try.

Cats Effect 3 has been a group effort from beginning to end. In particular, the following individuals have been completely indispensable:

  • Raas Ahsan
  • Jakub Kozłowski
  • Fabio Labella
  • Michael Pilquist
  • Ben Plommer
  • Tim Spence
  • Vasil Vasilev

Take a bow, all of you. You deserve it.

Overview

Cats Effect 3 is a complete redesign of the Cats Effect library, starting from first principles and building up from there. It includes:

  • A reorganized, safer, and easier-to-understand typeclass hierarchy
  • A brand new IO implementation that is safer, more convenient, and dramatically faster
  • Redesigned cancellation and evaluation semantics to improve composability
  • Completely novel fiber scheduling implementations on both the JVM and JavaScript
  • ...and much more!

Typeclasses

The hierarchy has been revamped from top to bottom. Several major design principles behind this redesign:

  • Safety
  • Composability
  • Orthogonality
  • Convenience

Those who are familiar with the Cats Effect 2 hierarchy will notice several differences immediately. Bracket is gone, and in its place is MonadCancel, which defines behavior in terms of several lower-level and more powerful operations. bracket itself is still present, but more general and more composable across different datatypes.

Spawn and Concurrent are now nearly at the top of the hierarchy, whereas in Cats Effect 2, Concurrent sits under Async. This properly reflects the fact that concurrency is a generalization of control flow. Just as Monad characterizes sequential control flow, Spawn characterizes control flow in which sequential fibers can fork and later rejoin, and Concurrent characterizes control flow patterns in which fibers can pass data between each other and even create cycles.

Temporal and Clock replace the old Cats Effect 2 Clock and Timer classes, which sat awkwardly outside of the main hierarchy. It is no longer necessary to write functions like the following:

def useful[F[_]: Concurrent: Timer: ContextShift](...)

Instead, you would just write:

def useful[F[_]: Temporal](...)

Speaking of which, ContextShift is gone, never to return. Its existence was necessitated quite indirectly by the fact that the async function scheduled its continuation on the thread which invoked its callback. This seemingly-minor implementation detail has profound implications for the entire library, and in turn gives rise to what is far-and-away the most common "Cats Effect gotcha": forgetting to shift after async. In Cats Effect 3, it is no longer possible to "forget" to shift back to the appropriate thread pool, and it is no longer necessary to manually propagate confusing implicit instances like ContextShift[F] or even ContextShift[IO].
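
As an illustration (a sketch against the final CE3 API, with a hypothetical Java-based scheduler), the callback below fires on a foreign thread, yet no shifting is required afterwards:

import java.util.concurrent.{Executors, TimeUnit}

import cats.effect.IO

object AsyncSleep {
  // hypothetical helper, not part of the library
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  // In CE2, the continuation would have run on the scheduler's thread, requiring a
  // ContextShift to return to the compute pool; in CE3, async_ resumes there automatically.
  def sleepOn(millis: Long): IO[Unit] =
    IO.async_ { cb =>
      scheduler.schedule(new Runnable { def run(): Unit = cb(Right(())) }, millis, TimeUnit.MILLISECONDS)
      ()
    }
}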

Finally, Sync and Async sit at the very bottom of the hierarchy, reflecting the fact that they are the most powerful typeclasses in existence: they allow users to capture arbitrary side-effects. You can encode any concept you want with these abstractions, which is why we no longer pretend that there are things which are somehow more powerful (such as Concurrent in Cats Effect 2).

In practice, this should result in a hierarchy which is much cleaner and safer for users. It will be easier to tell, at a glance, what a function can or cannot do, since things that are Concurrent are not actually capable of capturing side-effects.

The operations provided by these classes are dramatically simpler, defined in terms of a small set of easy-to-understand primitives. It's always much harder to understand a game with many rules, and Cats Effect 3 is designed to be a game with very few, very powerful rules. To that end, all of the typeclass laws have been completely rewritten, reflecting their simplicity and power. As an example, the law which defines how cancellation behaves is simply the following:

  def canceledAssociatesLeftOverFlatMap[A](fa: F[A]) =
    F.canceled >> fa.void <-> F.canceled

The above can be read as: if a fiber is cancelled, then all subsequent actions are short-circuited and discarded.

One part of this redesign involves a brand new operation, uncancelable, which makes it possible to composably define complex resource management scenarios in which certain parts of a critical region must be interruptible (such as atomically acquiring and releasing a semaphore, where the acquisition may block the fiber). The design of this operation (and the various tradeoffs involved) is explored in depth by Fabio Labella on this excellent issue.

In reflection of their fundamental nature, the typeclass abstractions have been entirely split from IO and now live in a separate, more fundamental module: kernel. Third-party datatype implementations (such as Monix Task) no longer need to rely on Cats Effect IO as part of their fundamental calculus, and end-users who choose to use such datatypes can now reliably have a single datatype on their classpath, rather than two.

IO

The IO monad in Cats Effect 3 has been completely rewritten from the ground up. While it superficially behaves the same as the old one in many circumstances, the implementation is much safer and more compact. Significant emphasis has been placed on user experience, discoverability, and pleasantries. A simple program:

import cats.effect._
import cats.syntax.all._

import scala.concurrent.duration._

object Hello extends IOApp.Simple {

  def greeter(word: String): IO[Unit] =
    (IO.println(word) >> IO.sleep(100.millis)).foreverM

  val run =
    for {
      english <- greeter("Hello").start
      french <- greeter("Bonjour").start
      spanish <- greeter("Hola").start

      _ <- IO.sleep(5.seconds)
      _ <- english.cancel >> french.cancel >> spanish.cancel
    } yield ()
}

This program runs identically on the JVM with Scala 2.12, 2.13, and the latest Dotty milestone, 0.27.0-RC1. It similarly runs on JavaScript with the same ScalaJS versions.

This program superficially resembles the Cats Effect 2 equivalent, but under the surface things are dramatically different. A brand new, fiber-aware, work-stealing thread pool is included as part of Cats Effect 3. This scheduler results in dramatic performance benefits relative to Cats Effect 2 when the application is under heavy, concurrent load: around 1.5x - 5x faster in conservative benchmarks. Unlike microbenchmark improvements to flatMap and map, these are real gains which will show up in service-level performance metrics in production. This scheduler also makes it possible to implement auto-yielding semantics (a form of weakly-preemptive multitasking) with literally zero performance penalty, even ignoring amortization.

Better still, this scheduler is fiber-aware, meaning that it understands the common patterns found in Cats Effect programs and optimizes accordingly. This can be a difficult performance benefit to measure reliably, so it's probably better if you just try it in your application and see the benefits for yourself. In general, the results should include things like considerably better cache locality and dramatically lower context shift overhead, particularly as load increases. The scheduler performs best when your application is fully saturated at capacity.

Of course, flatMap and map are also quite a bit faster than in Cats Effect 2, clocking in around 2x - 3x faster, depending on the test. This has very little impact on most real-world applications, but it's still nice to know.

JavaScript also received a huge amount of attention in this rewrite. During the course of its development, Cats Effect IO's test suite revealed an unexpected limitation in the ExecutionContext instance used by nearly all ScalaJS applications: it doesn't correctly yield control to the macrotask event queue. This surprising fact means that simple applications like the following are broken in Cats Effect 2, and their equivalents are broken in nearly every other ScalaJS application:

// cats effect 3 api
IO.cede.foreverM.start.flatMap(f => IO.sleep(5.seconds) >> f.cancel)

This program spawns a fiber which loops forever, always yielding control back to the event queue on each iteration. The main fiber then waits five seconds before canceling this loop and shutting down.

In Cats Effect 2, and in most ScalaJS applications, this program runs forever because the ExecutionContext simply never yields and the timer is never able to fire. Cats Effect 3 fixes this problem with a brand new JavaScript fiber scheduler. This scheduler runs with extremely high performance on NodeJS and all modern browsers, with polyfills to ensure fallback compatibility with all platforms on which ScalaJS itself is supported.

Read more

v2.2.0

07 Sep 20:04

This is the seventh major release in the Cats Effect 2.x lineage. It is fully binary compatible with all 2.x.y releases. As previously noted, this is the first Cats Effect release to be exclusively published for ScalaJS 1.x; there are no 0.6.x cross-releases.

This is a huge release packed full with a large number of new features (most visibly, high-performance tracing for the IO monad!) and a massive number of bugfixes. It's impossible to distill this immense effort down to just a few small thank-yous and change descriptions, but I still want to make special mention of @RaasAhsan, who worked tirelessly to bring rich, asynchronous tracing to IO, on top of a host of other major aspects of this release.

Notable New Features

Asynchronous Tracing

IO tracing is probably the single most-requested Cats Effect feature, and it's finally here! IO now has a basic, high-performance asynchronous tracing mechanism. Note that this mechanism has very different internals from the async tracing already available in Akka and ZIO, and thus comes with a different set of tradeoffs. Also please remember that no async tracing on the JVM is perfectly accurate, and you may see misleading trace frames depending on your program. Despite all this, the added information is very welcome and sometimes quite helpful in tracking down issues!

This is what it looks like:

IOTrace: 19 frames captured
 ├ flatMap @ org.simpleapp.examples.Main$.program (Main.scala:53)
 ├ map @ org.simpleapp.examples.Main$.foo (Main.scala:46)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:45)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:44)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:43)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:42)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:41)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:40)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:39)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:38)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:37)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:36)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:35)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:34)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:33)
 ├ flatMap @ org.simpleapp.examples.Main$.foo (Main.scala:32)
 ├ flatMap @ org.simpleapp.examples.Main$.program (Main.scala:53)
 ╰ ... (3 frames omitted)

The number of actions to retain in the trace can be configured by the system property cats.effect.traceBufferLogSize, which defaults to 4 (representing a default buffer size of 16 frames). Backtraces like the above can be accessed using the IO.traced function. In addition to this, and (in practice) even more usefully, we have added support for enriched exceptions. As an example:

java.lang.Throwable: A runtime exception has occurred
  at org.simpleapp.examples.Main$.b(Main.scala:28)
  at org.simpleapp.examples.Main$.a(Main.scala:25)
  at org.simpleapp.examples.Main$.$anonfun$foo$11(Main.scala:37)
  at map @ org.simpleapp.examples.Main$.$anonfun$foo$10(Main.scala:37)
  at flatMap @ org.simpleapp.examples.Main$.$anonfun$foo$8(Main.scala:36)
  at flatMap @ org.simpleapp.examples.Main$.$anonfun$foo$6(Main.scala:35)
  at flatMap @ org.simpleapp.examples.Main$.$anonfun$foo$4(Main.scala:34)
  at flatMap @ org.simpleapp.examples.Main$.$anonfun$foo$2(Main.scala:33)
  at flatMap @ org.simpleapp.examples.Main$.foo(Main.scala:32)
  at flatMap @ org.simpleapp.examples.Main$.program(Main.scala:42)
  at as @ org.simpleapp.examples.Main$.run(Main.scala:48)
  at main$ @ org.simpleapp.examples.Main$.main(Main.scala:22)

Any time an exception is caught within the IO mechanism (either via automatic try/catch, or via raiseError), the stack trace is inspected to determine whether or not a prefix of that stack trace is simply implementation details of the IO runloop itself. If that is the case, those frames are removed from the stack and replaced with a synthetic stack trace based on the asynchronous tracing information, as shown above.

Note that this feature may interact incorrectly with some poorly-written parsing logic, such as a log parser which uses regular expressions to extract stack trace information and whose regular expressions are intolerant of spaces. If you wish to disable enhanced exceptions specifically without disabling tracing, simply set the cats.effect.enhancedExceptions system property to false.
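
For reference, these properties might be set in an sbt build along the following lines (a sketch; forking is required for javaOptions to take effect on run):

// build.sbt
Compile / run / fork := true

javaOptions ++= Seq(
  "-Dcats.effect.traceBufferLogSize=5",     // retain 2^5 = 32 frames instead of the default 16
  "-Dcats.effect.enhancedExceptions=false"  // keep tracing but disable exception enrichment
)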

It's important to understand that this feature is very new and we have a lot of things we want to improve about it! We need user feedback to guide this process. Please file issues and/or chat us up in Gitter if you have ideas, and especially if you see problems!

Tracing has three modes:

  • Disabled
  • Cached (the default)
  • Full

These are all governed by the cats.effect.stackTracingMode system property and are global to the JVM. As a rough performance reference, at present, Cached tracing imposes a roughly 10% performance penalty in synthetic benchmarks on asynchronous IOs. This difference should be entirely impossible to observe outside of microbenchmarks (though we would love to hear otherwise if you see evidence to the contrary!). Full tracing imposes an exceptionally high performance cost, and is expected to be used only in development environments when specifically attempting to track down bugs. Disabled tracing imposes a negligible penalty, and should be used in production if Cached tracing imposes a noticeable performance hit. It is recommended that you stick with the default tracing mode in most cases.

Cached tracing produces a single linear trace of the actions an IO program takes. This tracing mode uses heuristics to determine call-site information on functions like flatMap and async, and those heuristics can be misleading, particularly when used with monad transformers or types like Resource or Stream. If you have ideas for how to improve these heuristics, please let us know!

Full tracing captures a full JVM stack-trace for every call into IO, which results in an extremely comprehensive picture of your asynchronous control flow. This imposes a significant performance cost, but it makes it possible to see through complex compositions such as monad transformers or third-party code. This is an appropriate mode for development when the heuristics which generate a Cached trace are insufficient or misleading.

We want your feedback! There are so many different things that we can do with this functionality, and we want your opinions on how it should work and how it can be improved. We have many more features coming, but please file issues, talk to us, try it out, tweet angrily, you know the drill.

Bracket Forwarders

This is a rather subtle issue, but on 2.1.x (and earlier), it is impossible to write something like the following:

def foo[F[_]: Sync] =
  Bracket[OptionT[F, *], Throwable].bracket(...)

The Bracket instance would fail to resolve, despite the fact that there's a Sync in scope, since Bracket cannot be inductively defined (see #860 for some discussion of the issues surrounding inductive Bracket instances). This change adds implicit forwarders to the Bracket companion object so that the inductive Sync instances, defined on the Sync companion object, can be resolved via the Bracket companion. Basically, this just moves the Sync instances into a scope wherein they can be addressed by implicit search starting from either Sync or Bracket.

There is a very small chance this will cause implicit ambiguities in some code bases. We are not aware of any specific cases where this can happen, but it's worth keeping in mind as you upgrade.

Performance Improvements in Deferred

The Deferred primitive is one of the most powerful and most-used utilities in the Cats Effect library outside of IO itself. It is, for example, the foundation of most of fs2's concurrency mechanisms. This release contains significant performance improvements within this abstraction, with a particular focus on lowering memory pressure. This should result in a very significant, user-visible jump in performance across most real-world applications.

User-Facing Pull Requests

Read more