Commit
Add the ability to benchmark multiple models concurrently.
This is useful for benchmarking multiple LoRA adapters.
- Also fix latency_throughput_curve.sh to parse non-integer request rates properly.
- Also add an "errors" field to the benchmark results.
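
A minimal sketch of what benchmarking multiple models concurrently and recording errors could look like. It is not the code from this commit: `fake_send_request`, `benchmark_model`, `benchmark_models`, and the result keys are hypothetical names chosen for illustration, and the request function is a stand-in for a real inference call.

```python
import asyncio
import time
from collections import Counter


async def fake_send_request(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real inference request."""
    await asyncio.sleep(0.01)
    if prompt == "bad":
        raise ValueError("simulated failure")
    return f"{model}:{prompt}"


async def benchmark_model(model, prompts, send=fake_send_request):
    """Send all prompts to one model and collect latencies and error counts."""
    latencies = []
    errors = Counter()

    async def one(prompt):
        start = time.perf_counter()
        try:
            await send(model, prompt)
            latencies.append(time.perf_counter() - start)
        except Exception as exc:
            # Count failures by exception type instead of aborting the run.
            errors[type(exc).__name__] += 1

    await asyncio.gather(*(one(p) for p in prompts))
    return {
        "model": model,
        "num_ok": len(latencies),
        "latencies": latencies,
        "errors": dict(errors),
    }


async def benchmark_models(models, prompts):
    """Benchmark every model at the same time; results are keyed by model name."""
    results = await asyncio.gather(*(benchmark_model(m, prompts) for m in models))
    return {r["model"]: r for r in results}
```

Calling `asyncio.run(benchmark_models(["adapter-a", "adapter-b"], ["hi", "bad", "there"]))` runs both (assumed) adapter names side by side. Keeping a per-model "errors" dictionary means that one failing adapter does not hide the results of the others.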