📖 A curated list of Awesome LLM Inference Papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
sora
llm
llms
vllm
llm-inference
awesome-llm
flash-attention
flash-attention-2
tensorrt-llm
paged-attention
deepseek
open-sora
flash-attention-3
Updated Nov 1, 2024