Add the ability to benchmark multiple models concurrently (#850)
* Add the ability to benchmark multiple models concurrently. This is useful for benchmarking multiple LoRA adapters.
  - Also fix latency_throughput_curve.sh to parse non-integer request rates properly.
  - Also add "errors" to the benchmark results.
* Re-sample requests for each model
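
The changes described above can be illustrated with a minimal sketch. It is not the code from this commit: `send_request`, `benchmark_model`, `benchmark_all`, and `run_benchmark` are hypothetical names, the request is simulated rather than sent to a server, and the shape of the `errors` field is an assumption. It only shows the general idea of running one benchmark per model concurrently, re-sampling requests for each model, and recording errors in the results.

```python
import asyncio
import random
from collections import Counter


async def send_request(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real request to a model server."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if not prompt:
        # Simulated failure so the sketch has errors to count.
        raise ValueError("empty prompt")
    return f"{model}:{prompt}"


async def benchmark_model(model, prompts, num_requests, seed):
    # Re-sample requests for each model instead of sharing one sample.
    rng = random.Random(seed)
    sampled = rng.choices(prompts, k=num_requests)
    # return_exceptions=True keeps failures in the result list
    # instead of aborting the whole run on the first error.
    results = await asyncio.gather(
        *(send_request(model, p) for p in sampled),
        return_exceptions=True,
    )
    errors = Counter(
        type(r).__name__ for r in results if isinstance(r, Exception)
    )
    return {
        "model": model,
        "completed": num_requests - sum(errors.values()),
        "errors": dict(errors),  # assumed shape: error type -> count
    }


async def benchmark_all(models, prompts, num_requests, seed=0):
    # One benchmark task per model, all running concurrently.
    return await asyncio.gather(
        *(
            benchmark_model(m, prompts, num_requests, seed + i)
            for i, m in enumerate(models)
        )
    )


def run_benchmark(models, prompts, num_requests, seed=0):
    return asyncio.run(benchmark_all(models, prompts, num_requests, seed))
```

For example, `run_benchmark(["adapter-a", "adapter-b"], ["hello", "", "world"], 10)` would return one result dictionary per model, in the same order as the input list.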