Add the ability to benchmark multiple models concurrently (#850) · GoogleCloudPlatform/ai-on-gke@a3401f2 · GitHub

* Add the ability to benchmark multiple models concurrently.
This is useful for benchmarking multiple LoRA adapters.
- Also fix latency_throughput_curve.sh to parse non-integer request
  rates properly.
- Also add "errors" to the benchmark results.

* Re-sample requests for each model
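
As a rough illustration of what the message describes (not the code in this commit), here is a minimal sketch of benchmarking several models concurrently, re-sampling the requests for each model and recording an "errors" count in the results. All names here (`send_request`, `benchmark_model`, `run_benchmark`, the result keys) are hypothetical, and `send_request` is a stand-in for a real call to a model server.

```python
import asyncio
import random


async def send_request(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a request to a model server."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if prompt == "bad":
        raise ValueError("simulated request failure")
    return f"{model}:{prompt}"


async def benchmark_model(model, prompts, num_requests, rng):
    # Re-sample the request set for this model so that models
    # do not all replay the exact same sequence of prompts.
    sampled = rng.choices(prompts, k=num_requests)
    results = await asyncio.gather(
        *(send_request(model, p) for p in sampled),
        return_exceptions=True,  # collect failures instead of aborting
    )
    errors = sum(isinstance(r, Exception) for r in results)
    return {"model": model, "num_requests": num_requests, "errors": errors}


async def benchmark_all(models, prompts, num_requests, seed=0):
    rng = random.Random(seed)
    # One task per model; all models are benchmarked concurrently.
    return await asyncio.gather(
        *(benchmark_model(m, prompts, num_requests, rng) for m in models)
    )


def run_benchmark(models, prompts, num_requests, seed=0):
    """Synchronous entry point; returns one result dict per model."""
    return asyncio.run(benchmark_all(models, prompts, num_requests, seed))
```

Calling `run_benchmark(["adapter-a", "adapter-b"], prompts, 100)` would return one result dictionary per model, in the same order as the input list. Using `return_exceptions=True` is one way to count failed requests without stopping the whole run.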

liu-cong authored Oct 23, 2024

1 parent 8d48829 commit a3401f2
