v.1.2

@AlexBulankou AlexBulankou released this 11 Jun 15:05
· 135 commits to main since this release
218cc62

Quick start solutions

Ray

  • Enabled TPU webhook on GKE Autopilot (#585)
  • Support Multi-slice TPU groups (#453)
  • Support multiple worker groups requesting TPUs (#467)
  • Added unit tests for Ray TPU webhook (#578)
  • Fix GMP on GKE Standard (#689)

RAG

  • README updates to fix broken links (#664) and better describe custom domain usage (#681)

ML Platform

  • Initial release! (#690)
  • Intended for platform admins who want a multi-tenant AI/ML platform running on GKE

TPU Provisioner

  • Add fixes relating to interacting with JobSets (#645)
  • Allow forcing use of on-demand nodes and disable auto upgrade for node pools (#656)
  • Support location hint label (#666)
  • Update usage instructions (#684)

Benchmarking

  • Support for measuring Time to First Token (#650)
  • Allow private clusters (#669)
  • Fix hardcoded gRPC request type (#670)
  • Add JetStream support (#677)
  • Support for vLLM OpenAI API Server (#694)