MLX Llama-Index LLM is a llama-index LLM integration for the MLX machine learning framework. It can be used like any other llama-index LLM and works seamlessly with llama-index tooling such as RAG pipelines (see the examples below).
- Seamless Integration: Easily integrates with the MLX machine learning framework.
- Compatibility: Works with existing llama-index tools and libraries.
- High Performance: Built on MLX, which is optimized for machine learning on Apple silicon.
- Extensible: Easily extendable to add new features or modify existing ones.
```bash
git clone https://github.com/yourusername/mlx-llama-index-llm.git
cd mlx-llama-index-llm
pip install -r requirements.txt
```
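Once installed, the wrapper can be used on its own like any other llama-index LLM. A minimal sketch (the model name is an example from the mlx-community hub, and `complete` is the standard llama-index LLM method, assumed to be supported here):

```python
from mlx_lm import load
from mlx_llm import MLXLLM

# load an MLX-converted model from the Hugging Face hub
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-8bit")

# wrap the model in the llama-index LLM interface
llm = MLXLLM(model=model, tokenizer=tokenizer)

# complete() is the standard llama-index text-completion entry point
print(llm.complete("Write a haiku about Apple silicon."))
```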
A fuller example, using the MLX model as the default LLM in a llama-index RAG pipeline:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from mlx_lm import load
from mlx_llm import MLXLLM

# load the documents to index from the data/ directory
documents = SimpleDirectoryReader("data").load_data()

# use a BGE model from Hugging Face for embeddings
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

# load an MLX-converted model and wrap it in the llama-index LLM interface
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-8bit")
llm = MLXLLM(model=model, tokenizer=tokenizer)

# use the MLX model as the default LLM
Settings.llm = llm

# build the index and query it
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What is minimax?")
print(response)
```
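Since embedding the documents on every run can be slow, it may be worth persisting the index with llama-index's standard storage API. A sketch, continuing from the example above (the `storage` directory name is arbitrary):

```python
from llama_index.core import StorageContext, load_index_from_storage

# save the index built above to disk
index.storage_context.persist(persist_dir="storage")

# on a later run, reload the index instead of re-embedding "data";
# Settings.embed_model and Settings.llm must be set as above
storage_context = StorageContext.from_defaults(persist_dir="storage")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
print(query_engine.query("What is minimax?"))
```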
- MLX - The MLX library for machine learning on Apple devices
- Llama-Index - The llama-index library that this LLM integration is built on