Add the ability to benchmark multiple models concurrently. · GoogleCloudPlatform/ai-on-gke@b0be375 · GitHub

Add the ability to benchmark multiple models concurrently.

This is useful for benchmarking multiple LoRA adapters.
- Also fix latency_throughput_curve.sh to parse non-integer request
  rates properly.
- Also add "errors" to the benchmark results.

liu-cong committed Oct 17, 2024

1 parent 4fdcca6 commit b0be375
