v2.0.5: Llava multi-card support
- Only apply TP in the language_model #219 @yuanwu2017
- Llava-next: Added a flash_attention_recompute option #220 @tthakkal
- Update README.md with changes related to LLava-next multi card support #221 @tthakkal
- Upgrade to Optimum Habana v1.13.2 #222 @regisss
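With the changes above, Llava-next models can be served across multiple Gaudi cards via sharding. A minimal launch sketch, assuming the standard tgi-gaudi Docker conventions; the image tag, shard count, and environment variables are illustrative and should be checked against the README for your setup:

```bash
# Hypothetical multi-card launch for Llava-v1.6-Mistral-7B (8 Gaudi cards assumed)
docker run -p 8080:80 \
    --runtime=habana \
    -e HABANA_VISIBLE_DEVICES=all \
    -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
    --cap-add=sys_nice \
    --ipc=host \
    ghcr.io/huggingface/tgi-gaudi:2.0.5 \
    --model-id llava-hf/llava-v1.6-mistral-7b-hf \
    --sharded true \
    --num-shard 8
```

With tensor parallelism applied only to the language_model (#219), the vision tower runs unsharded while the LLM weights are split across the cards.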
## Tested models and configurations
| Model | BF16 | FP8 | Single Card | Multi-Card |
| --- | --- | --- | --- | --- |
| Llama2-7B | ✔ | ✔ | ✔ | ✔ |
| Llama2-70B | ✔ | ✔ | ✔ | |
| Llama3-8B | ✔ | ✔ | ✔ | ✔ |
| Llama3-70B | ✔ | ✔ | ✔ | |
| Llama3.1-8B | ✔ | ✔ | ✔ | ✔ |
| Llama3.1-70B | ✔ | ✔ | ✔ | |
| CodeLlama-13B | ✔ | ✔ | ✔ | |
| Mixtral-8x7B | ✔ | ✔ | ✔ | ✔ |
| Mistral-7B | ✔ | ✔ | ✔ | ✔ |
| Llava-v1.6-Mistral-7B | ✔ | ✔ | ✔ | ✔ |
**Full Changelog**: v2.0.4...v2.0.5