
2.2.1 Backend: Ollama

av edited this page Sep 30, 2024 · 4 revisions

Handle: ollama
URL: http://localhost:33821

ollama


Ergonomic wrapper around llama.cpp with plenty of QoL features.

Ollama is connected directly to the Open WebUI as the main LLM backend.

Starting

Ollama is one of the default services, so you don't need to specify anything special to start it.

harbor up

See harbor defaults for details on managing default services.

Models

You can discover new models via Ollama's model library.

You can manage models directly from the Open WebUI Admin Settings. The models are stored in the global ollama cache on your local machine.

Alternatively, you can use the ollama CLI itself.

# Show the list of available models
harbor ollama list

# Pull a new model
harbor ollama pull phi3
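To pull several models in one go, the pull command above can be wrapped in a small loop. This is a minimal sketch, not part of Harbor: `pull_models` and the `HARBOR_BIN` override are hypothetical names introduced here so the loop can be dry-run without a running service.

```shell
# Pull each model named in the arguments via the Harbor-wrapped ollama CLI.
# HARBOR_BIN is a hypothetical override (not a Harbor setting); it defaults to "harbor".
pull_models() {
  local harbor_bin="${HARBOR_BIN:-harbor}"
  local model
  for model in "$@"; do
    # Stop at the first failed pull so the error is not buried.
    "$harbor_bin" ollama pull "$model" || return 1
  done
}
```

For example, `pull_models phi3` behaves like the single command above, and `HARBOR_BIN=echo pull_models phi3` prints the command instead of running it.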

More generally, you can use the full ollama CLI when the corresponding service is running.

# The Ollama service must be running to access the CLI
harbor ollama --help
# See the environment variables
# supported by the ollama service
harbor ollama serve --help

# Access Ollama CLI commands
harbor ollama version

Configuration

You can specify Ollama's environment variables (run harbor ollama serve --help for reference) in the .env and docker-compose.ollama.yml files.

# Configure ollama version, accepts a docker tag
harbor config set ollama.version 0.3.7-rc5-rocm

API

Retrieve the endpoint for the ollama service with:

harbor url ollama
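Once you have the endpoint, you can call Ollama's HTTP API directly. The sketch below builds a request URL for `/api/tags`, the Ollama endpoint that lists locally available models. The helper names `ollama_base_url` and `ollama_endpoint` are hypothetical, and falling back to the default URL from the top of this page when `harbor` is not on the PATH is an assumption.

```shell
# Print the base URL of the ollama service.
# Assumption: fall back to the default URL listed on this page
# when the harbor CLI is not available.
ollama_base_url() {
  if command -v harbor >/dev/null 2>&1; then
    harbor url ollama
  else
    echo "http://localhost:33821"
  fi
}

# Join a base URL and an API path, avoiding a doubled slash.
ollama_endpoint() {
  local base="${1%/}"
  local path="${2#/}"
  printf '%s/%s\n' "$base" "$path"
}

# Usage (requires the service to be running):
#   curl -s "$(ollama_endpoint "$(ollama_base_url)" /api/tags)"
```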

Additionally, you can find a small HTTP playbook in the http-catalog folder.

Importing models

A sample workflow for importing a model from a Hugging Face repository that contains GGUF files.

# 1. Download the model
harbor hf download flowaicom/Flow-Judge-v0.1-GGUF

# 2. Locate the gguf file
# The gguf file is located in the model directory
harbor find Flow-Judge-v0.1 | grep '\.gguf$'
# /home/user/.cache/huggingface/hub/models--flowaicom--Flow-Judge-v0.1-GGUF/snapshots/3ca...575/flow-judge-v0.1-Q4_K_M.gguf

# 3. Translate the path
# Harbor mounts HF cache to Ollama service
# /home/user/.cache/huggingface -> /root/.cache/huggingface
# The path becomes:
# /root/.cache/huggingface/hub/models--flowaicom--Flow-Judge-v0.1-GGUF/snapshots/3ca...575/flow-judge-v0.1-Q4_K_M.gguf

# 4. Create a modelfile
# You can use any convenient folder to store modelfiles
# By default, Harbor has a directory for modelfiles: ollama/modelfiles
# Below are a few options for quickly accessing the directory
harbor vscode       # Open Harbor workspace in VS Code and go from there
open $(harbor home) # Open Harbor workspace in default file manager
open $(harbor home)/ollama/modelfiles # This is the directory for modelfiles
code $(harbor home)/ollama/modelfiles # Open the directory in VS Code

# 5. Sample modelfile contents
# TIP: Use original base modelfile as a reference:
#      harbor ollama show --modelfile <model name>
# Save as "<your name>.Modelfile" in the modelfiles directory
FROM /root/.cache/huggingface/hub/models--flowaicom--Flow-Judge-v0.1-GGUF/snapshots/3ca...575/flow-judge-v0.1-Q4_K_M.gguf

# 6. Create the model
# 6.1. From Harbor's modelfiles directory
harbor ollama create -f /modelfiles/<your name>.Modelfile <your name>
# 6.2. From current directory
harbor ollama create -f ./<your name>.Modelfile <your name>
# Successful output example
# 13:27:37 [INFO] Service ollama is running. Executing command...
# transferring model data 100%
# using existing layer sha256:939...815
# creating new layer sha256:aaa...169
# writing manifest
# success

# 7. Check the model
harbor ollama run <your name>

# 8. To upload to ollama.com, follow the official tutorial
# on sharing models:
# https://github.com/ollama/ollama/blob/main/docs/import.md#sharing-your-model-on-ollamacom
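
Steps 3 to 5 of the workflow above (translating the path and writing a one-line modelfile) can be scripted. This is a minimal sketch under the assumption stated in step 3, that the host's `~/.cache/huggingface` is mounted at `/root/.cache/huggingface` inside the Ollama service. The function names `to_container_path` and `write_modelfile` are hypothetical helpers, not Harbor commands.

```shell
# Translate a host path inside the Hugging Face cache to the path
# seen by the Ollama container.
# Assumption: <host cache> is mounted at /root/.cache/huggingface.
to_container_path() {
  local host_path="$1"
  local host_cache="${2:-$HOME/.cache/huggingface}"
  local container_cache="/root/.cache/huggingface"
  case "$host_path" in
    "$host_cache"/*)
      # Swap the host cache prefix for the container one.
      printf '%s%s\n' "$container_cache" "${host_path#"$host_cache"}"
      ;;
    *)
      echo "path is outside the HF cache: $host_path" >&2
      return 1
      ;;
  esac
}

# Write a minimal modelfile that points at the translated gguf path.
write_modelfile() {
  local container_path="$1"
  local out_file="$2"
  printf 'FROM %s\n' "$container_path" > "$out_file"
}

# Usage, with the gguf path printed in step 2 stored in $gguf:
#   write_modelfile "$(to_container_path "$gguf")" \
#     "$(harbor home)/ollama/modelfiles/<your name>.Modelfile"
```

The modelfile only contains a `FROM` line, matching the sample in step 5; add further directives by hand if your model needs them.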