anyscale/batch-llm-inference-reproductions

Instructions

  1. Launch the batch inference template.

     image1

  2. Update the Head Node Type to the machine type listed for the model you want to run.

     image2

     | Model | Node Type |
     | --- | --- |
     | neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 | g6e.12xlarge |
     | neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8 | g6e.xlarge |

  3. Run the script for your model on the template:

         bash run_70b.sh
         # For the 8B model, run this instead:
         # bash run_8b.sh
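
If you switch between the two models often, the choice of script can be wrapped in a small helper. The following is only a sketch: `select_run_script` is a hypothetical function that is not part of this repository, and it assumes nothing beyond the two script names given above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): map a model size
# to the run script named in the instructions above.
select_run_script() {
  case "$1" in
    70b) echo "run_70b.sh" ;;
    8b)  echo "run_8b.sh" ;;
    *)   echo "unknown model size: $1" >&2; return 1 ;;
  esac
}

# Usage sketch (run from the repository root on the template):
#   bash "$(select_run_script 70b)"
```

An unrecognized size returns a non-zero status, so a typo cannot silently launch the wrong script.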

About

Reproducing Batch LLM Inference performance numbers