Mark Diekhans edited this page Feb 28, 2020 · 31 revisions

Maintenance:

  • Release on the 1st Wednesday of each month
    • markd: does the "train leaves the station" release model still make sense for Toil?

Toil Roadmap Near-Term

  • More efficient Kubernetes support
    • One of:
      • Host-path caching
      • Toil-integrated within-pod scheduler
  • More robust Kubernetes support
    • Handle Kubernetes communication timeouts without restarting the leader
  • Rework SingleMachineBatchSystem to eliminate thread-limit exhaustion issues
  • CWL 1.1+ Support
    • 13 failing CWL 1.1 conformance tests currently
    • Conditional support (CWL 1.2)
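
As a rough illustration of the Kubernetes timeout item above, one possible shape is to wrap each API call in a retry loop so that a transient timeout is absorbed instead of ending the leader process. This is only a sketch: `call_with_retries` and its parameters are hypothetical names, not part of Toil or of any Kubernetes client, and it assumes the failure surfaces as Python's built-in `TimeoutError`.

```python
import random
import time


def call_with_retries(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run operation(), retrying on TimeoutError with exponential backoff.

    Hypothetical helper for illustration; not Toil's API.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller can decide what to do.
                raise
            # Double the delay on each attempt and add jitter so that many
            # callers do not all retry at the same moment.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is injectable only so the backoff can be exercised in tests without real waiting.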

Toil Roadmap Long-Term

  • Move away from Mesos before or as Ubuntu 16.04 goes out of support
    • Probably in favor of auto-deployed Kubernetes somehow
  • Running in-house VG WDL workflows
  • WDL compliance test suite
  • Increase test coverage
  • Automatic idle worker termination and fixes to ignored nodes
  • Updates on caching (should we enable it by default?)
  • Incorporate a Cactus integration test to better support Cactus
  • Improved ease of debugging

Stretch Goals:

  • Google Support
  • Batch System Support
  • Update boto libraries to boto3
  • Move from SimpleDB to a better-supported service
  • More scalable Kubernetes support
    • Moving to watches
    • Handling more pods in queue than we can loop over before our continue tokens expire
  • Restart/recovery improvements
    • Changing CWL parameters
    • Managing a failed task that cannot be recovered in a large pipeline
    • Checkpointing
  • Better support for heterogeneous tasks (e.g. customizing disk size per instance type, and possibly FPGA support for DRAGEN)
  • AWS custom/multi security group support
  • AWS multi-zone support
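
To make the continue-token item under scalable Kubernetes support more concrete: the Kubernetes API returns large listings in pages linked by a continue token, and it rejects a token that has grown too old with HTTP 410 Gone. A minimal sketch of one way to cope is to restart the listing from the first page while keeping the items already seen. Everything named here is an assumption for illustration: `list_page` stands in for whatever call fetches one page, `ContinueTokenExpired` stands in for the 410 error, and items are assumed to be dicts with a unique `"name"`.

```python
class ContinueTokenExpired(Exception):
    """Hypothetical stand-in for the server rejecting a stale continue token."""


def list_all(list_page, max_restarts=3):
    """Collect every item from a paginated listing, surviving token expiry.

    list_page(token) is a hypothetical callable returning (items, next_token);
    next_token is None on the last page. It raises ContinueTokenExpired when
    the token passed in is too old to use.
    """
    seen = {}
    token = None
    restarts = 0
    while True:
        try:
            items, token = list_page(token)
        except ContinueTokenExpired:
            restarts += 1
            if restarts > max_restarts:
                raise
            # Start over from the first page, keeping what was already seen.
            token = None
            continue
        for item in items:
            # Keying by name deduplicates items re-read after a restart.
            seen[item["name"]] = item
        if token is None:
            return list(seen.values())
```

One caveat of this approach is that items deleted between passes stay in the result, so it gives an approximate snapshot. Watches, the other option listed above, avoid re-listing altogether.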