[Bug] vLLM server fails when requests with different temperatures are sent #29

cglagovichTT · 2024-10-29T15:47:06Z

Anything you want to discuss about vllm.

To reproduce, start the server example and send requests with different temperatures. The failure looks like this:

```
    async for i, res in result_generator:
  File "/home/cglagovich/vllm/vllm/utils.py", line 506, in merge_async_iterators
    item = await d
  File "/home/cglagovich/vllm/vllm/engine/multiprocessing/client.py", line 598, in _process_request
    raise request_output
  File "/home/cglagovich/vllm/vllm/engine/multiprocessing/client.py", line 598, in _process_request
    raise request_output
AssertionError: Currently only supporting same temperature for all sequences in batch
```
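A minimal sketch of a client that might trigger this, assuming the server example exposes an OpenAI-compatible `/v1/completions` endpoint on `localhost:8000`. The URL, the model name `my-model`, and the helper names `build_payloads`, `post`, and `send_concurrently` are illustrative assumptions, not taken from this report. The idea is to send several requests at once with different `temperature` values so they can land in the same batch.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumptions (not from the issue): server address, endpoint, and model name.
BASE_URL = "http://localhost:8000/v1/completions"
MODEL = "my-model"  # hypothetical placeholder; use the model the server was started with


def build_payloads(temperatures, prompt="Hello", model=MODEL, max_tokens=32):
    """Build one completion request body per temperature value."""
    return [
        {"model": model, "prompt": prompt, "max_tokens": max_tokens, "temperature": t}
        for t in temperatures
    ]


def post(payload, url=BASE_URL, timeout=60):
    """POST one JSON payload and return (temperature, HTTP status or error text)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return payload["temperature"], resp.status
    except Exception as exc:  # HTTP errors and connection failures alike
        return payload["temperature"], repr(exc)


def send_concurrently(payloads, url=BASE_URL):
    """Send all payloads at the same time so the server may batch them together."""
    with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
        return list(pool.map(lambda p: post(p, url), payloads))
```

With a server running, `send_concurrently(build_payloads([0.0, 0.7, 1.0]))` returns one `(temperature, result)` pair per request. Whether the requests actually share a batch depends on timing and on the server's scheduling, so the failure may need a few attempts to show up.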


cglagovichTT assigned skhorasganiTT Oct 29, 2024

tstescoTT added the "bug" label (Something isn't working) Nov 13, 2024
