How to run LLaMa inference? #243
Status: Unanswered
SinanAkkoyun asked this question in Q&A
Replies: 1 comment
Hello!
How is it possible to run LLaMa models with the great FP8 inference speedup?
Would one need to train a new LLM from scratch, or is it possible to convert existing models while keeping the same accuracy?
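For context on the conversion question: one common post-training approach is to take an existing model's weights and quantize them to FP8 (E4M3) with a per-tensor scale, rather than retraining from scratch. The sketch below only *simulates* FP8 E4M3 rounding numerics in plain Python to show how small the conversion error per weight is; real FP8 speedups additionally require hardware and kernel support (and, typically, calibrated activation scales). The function name and structure here are illustrative, not an API from this project.

```python
# Hypothetical sketch: simulated ("fake") FP8 E4M3 quantization of an
# existing weight tensor, illustrating post-training conversion.
import math
import random

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quantize_dequantize_fp8(weights):
    """Scale weights into the E4M3 range, round to 3 mantissa bits, rescale."""
    amax = max(abs(w) for w in weights) or 1.0
    scale = E4M3_MAX / amax  # per-tensor scaling factor
    out = []
    for w in weights:
        x = w * scale
        if x == 0.0:
            out.append(0.0)
            continue
        # E4M3 keeps 3 mantissa bits: snap x to the nearest representable
        # step for its binade (ignoring subnormals for simplicity).
        exp = math.floor(math.log2(abs(x)))
        step = 2.0 ** (exp - 3)
        q = round(x / step) * step
        out.append(q / scale)
    return out

random.seed(0)
w = [random.gauss(0.0, 0.02) for _ in range(1024)]
wq = quantize_dequantize_fp8(w)

# With 3 mantissa bits, per-element relative rounding error is bounded
# by roughly 2**-4 = 0.0625.
rel_err = max(abs(a - b) / (abs(a) + 1e-12) for a, b in zip(w, wq))
```

This only answers the numerics half of the question: existing weights survive the FP8 cast with small per-element error, which is why post-training conversion (with calibration for activations) is viable without training a new LLM from scratch.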
Thank you very much and thank you for all the awesome work!