
Find a nice way of configuring the compute pool in IOApp #1238

Closed
djspiewak opened this issue Sep 24, 2020 · 12 comments · Fixed by #1740

@djspiewak (Member)

Some applications will perform better with slightly different thread counts within the compute pool. Notably, @viktorklang has discussed how, in many of his tests, JVM applications behave better when the compute pool runs closer to 70% of available cores rather than the 100% we default to. Whether this holds for any given application depends on many factors, including how much data is being pushed through the caches, how good the cache utilization is, how much allocation is being done, and the exact implementation of hyperthreading available (or unavailable) on the host system. We can't really detect these factors, so in the end it's better for us to make this configurable.
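As a rough illustration of the sizing knob being discussed (the names here are hypothetical, not the cats-effect API), deriving a pool size at roughly 70% of available cores might look like:

```scala
// Hypothetical sketch only, not the actual cats-effect API:
// size the compute pool at ~70% of available cores, clamped to at least 2.
object PoolSizing {
  def scaledThreads(fraction: Double = 0.7): Int = {
    val cores = Runtime.getRuntime.availableProcessors()
    math.max(2, math.ceil(cores * fraction).toInt)
  }
}
```

On a 16-core box this would yield 12 threads; the clamp keeps small containers from dropping below 2.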

This would be similar to the CE2 IOApp.WithContext, but maybe a little nicer. Note that, in order to do this properly, we're going to need to adjust createDefaultComputeThreadPool somewhat so that users can actually construct a work-stealing pool with a non-default thread count.

@djspiewak djspiewak added the CE 3 label Sep 24, 2020
@djspiewak (Member, Author)

Related to #1227

@Daenyth (Contributor) commented Dec 31, 2020

I'd also request some way to override more of the pool than just thread counts. For example, it's very important for me, at least, that the pool exposes MXBeans for metric collection. If those aren't built in, I'd need to swap in a pool that has them.

@djspiewak (Member, Author)

@Daenyth Would you mind opening an issue for what you would like to see exposed in the mxbeans?

@djspiewak (Member, Author)

So one thought here is that IORuntime.global doesn't need to be a lazy val. We can actually make it backed by a var which is only mutable within the cats.effect package. Then, if global is dereferenced before that var is populated, we create the IORuntime that we create today. However, if IOApp starts up before global is dereferenced, then we can set it to be whatever we have configured within IOApp. This gives us the ability to do nice things, like have protected def runtimeConfig: IORuntimeConfig that users can override to control the configuration of the global pool when using IOApp.

@viktorklang (Contributor)

@djspiewak It'd still need to be @volatile to ensure visibility of modifications (i.e. to establish happens-before edges). Also, you'll likely risk check-then-act problems, necessitating critical sections with mutual exclusion, which introduces the risk of deadlocks. In general, using dereferencing as a decision mechanism has not been a great experience for me in concurrency settings.
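One lock-free way to sidestep the check-then-act race described above is to publish the instance with a compareAndSet, so exactly one candidate wins and losers adopt the winner's value. This is an illustrative sketch, not cats-effect code; `A` stands in for IORuntime:

```scala
import java.util.concurrent.atomic.AtomicReference

// Sketch: race-safe lazy installation without locks. The check ("is it set?")
// and the act ("set it") are collapsed into a single atomic compareAndSet.
final class OnceCell[A >: Null <: AnyRef] {
  private val ref = new AtomicReference[A](null)

  def getOrInstall(mk: => A): A = {
    val existing = ref.get()
    if (existing ne null) existing
    else {
      val candidate = mk
      if (ref.compareAndSet(null, candidate)) candidate
      else ref.get() // lost the race: discard our candidate, use the winner's
    }
  }
}
```

The trade-off is that two threads racing may both construct a candidate, with one being thrown away; that is usually acceptable for a once-per-JVM initialization, and it avoids both the visibility problem and the deadlock risk of a critical section.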

@djspiewak (Member, Author)

@viktorklang It would definitely be race-condition-y, but I think that's solvable, particularly since this is one of those things that only happens once per JVM. I definitely agree with you that dereference-based decisioning is usually awful but, because this is a thing that happens right at the beginning of the JVM startup, it might be okay.

The alternative is pretty bad, tbh. Like either we don't have the ability to configure the global runtime outside of system properties (i.e. the situation we have right now), or we have to split the IOApp runtime from the global runtime, which raises the very real and probable risk that many systems will end up with multiple thread pools unexpectedly.

@viktorklang (Contributor)

@djspiewak Indeed. I had to go through the same kind of process when I worked on the global ExecutionContext for Scala—which is why it is only configurable using System Properties. I guess you could pull off some ClassLoader magic if one really wanted to, but that seemed like a step too far.

@djspiewak (Member, Author)

In this case, I think we have one lever that you didn't have with global, which is that we're (usually) in control of the main method. So given that we know and control the very first entry point into the program, I think it should be possible to unambiguously switch between the two scenarios (we controlled main vs someone else controlling main and accessing global out of band).

@viktorklang (Contributor)

@djspiewak If you control main you could probably install a ThreadLocal to identify that you were initialized via the main method? (As opposed to having to do stack-walking to figure that out).
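The ThreadLocal idea above might look something like this sketch (hypothetical names, not cats-effect code): mark the thread that entered through our main, so later initialization can distinguish "our main" from out-of-band access to global.

```scala
// Sketch: flag the entry thread so initialization logic can tell whether it
// was reached via IOApp's main (flag set) or via out-of-band access (unset).
object EntryMarker {
  private val viaMain: ThreadLocal[java.lang.Boolean] =
    ThreadLocal.withInitial[java.lang.Boolean](() => java.lang.Boolean.FALSE)

  def markMainEntry(): Unit = viaMain.set(java.lang.Boolean.TRUE)

  def enteredViaMain: Boolean = viaMain.get()
}
```

This avoids the cost and fragility of stack-walking, at the price of only working when initialization happens on the same thread that entered main.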

@djspiewak (Member, Author)

@viktorklang Oh actually I was thinking of something sneakier than that!

object IORuntime {

  @volatile
  private[effect] var _global: IORuntime = null

  lazy val global: IORuntime = {
    // this, but you know, atomic and stuff
    if (_global == null)
      _global = new IORuntime(System.getProperty("config.option"))

    _global
  }
}

trait IOApp {

  protected val ConfigOption = System.getProperty("configOption")

  def main(args: Array[String]): Unit = {
    val runtime = new IORuntime(ConfigOption)
    IORuntime._global = runtime

    // ...
  }
}

With this setup, users can override val ConfigOption in their applications or use the system property and get the effect they expect. I didn't bother writing out the atomic variant of the lazy val initialization logic, but you get the idea. Basically this is exploiting the fact that no one is going to access global before we do if someone is using IOApp, which means we can get in before the normal lazy initialization and provide different semantics.

@viktorklang (Contributor)

@djspiewak I'd probably make IORuntime._global something more akin to IORuntime.installGlobal(…) so you can have it raise an error or similar if that invariant is violated (i.e. if global was initialized ahead of _global). Could save a chunk of headache in the future.
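A hedged sketch of that installGlobal shape (illustrative names only, not the actual cats-effect implementation; String stands in for IORuntime): the install fails loudly if the invariant is violated instead of silently clobbering a var.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Sketch: install-once semantics. A second install (or an install after the
// lazy global has already been forced and self-installed) throws rather than
// silently replacing the runtime.
object RuntimeHolder {
  private val installed = new AtomicBoolean(false)

  @volatile
  private var _global: String = null // stand-in for IORuntime

  def installGlobal(rt: String): Unit =
    if (installed.compareAndSet(false, true)) _global = rt
    else throw new IllegalStateException("global IORuntime already initialized")

  def global: String = _global
}
```

Raising at install time turns a subtle "two runtimes exist" bug into an immediate, diagnosable failure at startup.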

@djspiewak (Member, Author)

Agreed. That's much nicer than directly mutating a var.


3 participants