Merge pull request #2 from orangewise/azure_main
pull in upstream
orangewise authored Sep 24, 2024
2 parents 6b5b850 + 36b7d93 commit 62d2e20
Showing 9 changed files with 140 additions and 49 deletions.
13 changes: 12 additions & 1 deletion docs/changelog.md
@@ -1,5 +1,16 @@
# Changelog

(v0_16)=
## 0.16 (2024-09-12)

- OpenAI models now use the internal `self.get_key()` mechanism, which means they can be used from Python code in a way that will pick up keys that have been configured using `llm keys set` or the `OPENAI_API_KEY` environment variable. [#552](https://github.com/simonw/llm/issues/552). This code now works correctly:
```python
import llm
print(llm.get_model("gpt-4o-mini").prompt("hi"))
```
- New documented API methods: `llm.get_default_model()`, `llm.set_default_model(alias)`, `llm.get_default_embedding_model()`, `llm.set_default_embedding_model(alias)`; see the sketch after this list. [#553](https://github.com/simonw/llm/issues/553)
- Support for OpenAI's new [o1 family](https://openai.com/o1/) of preview models, `llm -m o1-preview "prompt"` and `llm -m o1-mini "prompt"`. These models are currently only available to [tier 5](https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-five) OpenAI API users, though this may change in the future. [#570](https://github.com/simonw/llm/issues/570)
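A minimal sketch of the default-model helpers listed above (the model ID is illustrative; the change persists to the LLM configuration folder):
```python
import llm

llm.set_default_model("gpt-4o")
print(llm.get_default_model())  # "gpt-4o"
```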

(v0_15)=
## 0.15 (2024-07-18)

@@ -177,7 +188,7 @@ To create embeddings for every JPEG in a directory stored in a `photos` collection:
llm install llm-clip
llm embed-multi photos --files photos/ '*.jpg' --binary -m clip
```
-Now you can search for photos of racoons using:
+Now you can search for photos of raccoons using:
```
llm similar photos -c 'raccoon'
```
2 changes: 1 addition & 1 deletion docs/plugins/tutorial-model-plugin.md
@@ -135,7 +135,7 @@ We can try that out by pasting it into the interactive Python interpreter and running this:

To execute the model, we start with a word. We look at the options for words that might come next and pick one of those at random. Then we repeat that process until we have produced the desired number of output words.

-Some words might not have any following words from our training sentence. For our implementation we wil fall back on picking a random word from our collection.
+Some words might not have any following words from our training sentence. For our implementation we will fall back on picking a random word from our collection.

We will implement this as a [Python generator](https://realpython.com/introduction-to-python-generators/), using the yield keyword to produce each token:
```python
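# A minimal sketch, assuming `transitions` maps each word to the list
# of words that followed it in the training sentence:
import random

def generate(transitions, length, start_word=None):
    # Start with a supplied word, or a random word from the training data
    all_words = list(transitions.keys())
    next_word = start_word or random.choice(all_words)
    for i in range(length):
        yield next_word
        # Fall back on a random word when nothing followed this one
        options = transitions.get(next_word) or all_words
        next_word = random.choice(options)
```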
63 changes: 49 additions & 14 deletions docs/python-api.md
@@ -7,22 +7,25 @@ Understanding this API is also important for writing {ref}`plugins`.

## Basic prompt execution

-To run a prompt against the `gpt-3.5-turbo` model, run this:
+To run a prompt against the `gpt-4o-mini` model, run this:

```python
import llm

model = llm.get_model("gpt-3.5-turbo")
model.key = 'YOUR_API_KEY_HERE'
model = llm.get_model("gpt-4o-mini")
# Optional, you can configure the key in other ways:
model.key = "sk-..."
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())
```
-The `llm.get_model()` function accepts model names or aliases - so `chatgpt` would work here too.
+The `llm.get_model()` function accepts model names or aliases. You can also omit it to use the currently configured default model, which is `gpt-4o-mini` if you have not changed the default.

In this example the key is set by Python code. You can also provide the key using the `OPENAI_API_KEY` environment variable, or use the `llm keys set openai` command to store it in a `keys.json` file, see {ref}`api-keys`.

The `__str__()` method of `response` also returns the text of the response, so you can do this instead:

```python
-print(response)
+print(llm.get_model().prompt("Five surprising names for a pet pelican"))
```

You can run this command to see a list of available models and their aliases:
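This is presumably the `llm models` command, referenced again below with its `--options` flag:
```bash
llm models
```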
@@ -52,27 +55,28 @@ response = model.prompt(
For models that support options (view those with `llm models --options`) you can pass options as keyword arguments to the `.prompt()` method:

```python
model = llm.get_model("gpt-3.5-turbo")
model.key = "... key here ..."
model = llm.get_model()
print(model.prompt("Names for otters", temperature=0.2))
```

### Models from plugins

-Any models you have installed as plugins will also be available through this mechanism, for example to use Google's PaLM 2 model with [llm-palm](https://github.com/simonw/llm-palm)
+Any models you have installed as plugins will also be available through this mechanism, for example to use Anthropic's Claude 3.5 Sonnet model with [llm-claude-3](https://github.com/simonw/llm-claude-3):

```bash
-pip install llm-palm
+pip install llm-claude-3
```
Then in your Python code:
```python
import llm

model = llm.get_model("palm")
model = llm.get_model("claude-3.5-sonnet")
# Use this if you have not set the key using 'llm keys set claude':
model.key = 'YOUR_API_KEY_HERE'
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())
```
-You can omit the `model.key = ` line for models that do not use an API key
+Some models do not use API keys at all.

## Streaming responses
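A minimal sketch of streaming: a response can be consumed as it arrives by iterating over it (prompt and model ID illustrative):
```python
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Five diabolical names for a pet goat")
# Iterating yields chunks of text as they arrive from the API:
for chunk in response:
    print(chunk, end="")
```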

@@ -94,8 +98,7 @@ LLM supports *conversations*, where you ask follow-up questions of a model as part of an ongoing conversation.
To start a new conversation, use the `model.conversation()` method:

```python
model = llm.get_model("gpt-3.5-turbo")
model.key = 'YOUR_API_KEY_HERE'
model = llm.get_model()
conversation = model.conversation()
```
You can then use the `conversation.prompt()` method to execute prompts against this conversation:
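A minimal sketch of that pattern (prompts illustrative); each follow-up prompt is answered in the context of the earlier exchanges:
```python
import llm

model = llm.get_model("gpt-4o-mini")
conversation = model.conversation()
print(conversation.prompt("Five fun facts about pelicans").text())
# The follow-up is answered with the first exchange as context:
print(conversation.prompt("Now do skunks").text())
```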
@@ -124,7 +127,7 @@ The `llm.set_alias()` function can be used to define a new alias:
```python
import llm

llm.set_alias("turbo", "gpt-3.5-turbo")
llm.set_alias("mini", "gpt-4o-mini")
```
The second argument can be a model identifier or another alias, in which case that alias will be resolved.
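For example (alias names illustrative), the second argument here is itself an alias, which gets resolved:
```python
import llm

llm.set_alias("mini", "gpt-4o-mini")
# "mini" is an alias, so it is resolved; "fast" ends up pointing at gpt-4o-mini:
llm.set_alias("fast", "mini")
```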

@@ -141,3 +144,35 @@ import llm

llm.remove_alias("turbo")
```

### set_default_model(alias)

This sets the default model to the given model ID or alias. Any changes to defaults will be persisted in the LLM configuration folder, and will affect all programs using LLM on the system, including the `llm` CLI tool.

```python
import llm

llm.set_default_model("claude-3.5-sonnet")
```

### get_default_model()

This returns the currently configured default model, or `gpt-4o-mini` if no default has been set.

```python
import llm

model_id = llm.get_default_model()
```

To detect if no default has been set you can use this pattern:

```python
if llm.get_default_model(default=None) is None:
print("No default has been set")
```
Here the `default=` parameter specifies the value that should be returned if there is no configured default.

### set_default_embedding_model(alias) and get_default_embedding_model()

These two methods work the same as `set_default_model()` and `get_default_model()` but for the default {ref}`embedding model <embeddings>` instead.
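A short sketch (the `3-small` alias assumes the default OpenAI embedding models; substitute any installed embedding model):
```python
import llm

llm.set_default_embedding_model("3-small")
print(llm.get_default_embedding_model())  # "3-small"
```
Note that, per the implementation in this commit, `llm.get_default_embedding_model()` returns `None` rather than a built-in fallback when no default has been set.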
20 changes: 20 additions & 0 deletions docs/usage.md
@@ -345,6 +345,26 @@ OpenAI Chat: gpt-4o-mini (aliases: 4o-mini)
logit_bias: dict, str
seed: int
json_object: boolean
OpenAI Chat: o1-preview
temperature: float
max_tokens: int
top_p: float
frequency_penalty: float
presence_penalty: float
stop: str
logit_bias: dict, str
seed: int
json_object: boolean
OpenAI Chat: o1-mini
temperature: float
max_tokens: int
top_p: float
frequency_penalty: float
presence_penalty: float
stop: str
logit_bias: dict, str
seed: int
json_object: boolean
OpenAI Completion: gpt-3.5-turbo-instruct (aliases: 3.5-instruct, chatgpt-instruct)
temperature: float
What sampling temperature to use, between 0 and 2. Higher values like
28 changes: 27 additions & 1 deletion llm/__init__.py
@@ -38,6 +38,7 @@
"ModelError",
"NeedsKeyException",
]
DEFAULT_MODEL = "gpt-4o-mini"


def get_plugins(all=False):
@@ -144,8 +145,9 @@ class UnknownModelError(KeyError):
pass


-def get_model(name):
+def get_model(name: Optional[str] = None) -> Model:
aliases = get_model_aliases()
name = name or get_default_model()
try:
return aliases[name]
except KeyError:
@@ -256,3 +258,27 @@ def cosine_similarity(a, b):
magnitude_a = sum(x * x for x in a) ** 0.5
magnitude_b = sum(x * x for x in b) ** 0.5
return dot_product / (magnitude_a * magnitude_b)


def get_default_model(filename="default_model.txt", default=DEFAULT_MODEL):
path = user_dir() / filename
if path.exists():
return path.read_text().strip()
else:
return default


def set_default_model(model, filename="default_model.txt"):
path = user_dir() / filename
if model is None and path.exists():
path.unlink()
else:
path.write_text(model)


def get_default_embedding_model():
return get_default_model("default_embedding_model.txt", None)


def set_default_embedding_model(model):
set_default_model(model, "default_embedding_model.txt")
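A quick sketch of how these helpers behave, based on the code above (running it will modify your LLM user directory):
```python
import llm

llm.set_default_model("gpt-4o")                  # writes default_model.txt
assert llm.get_default_model() == "gpt-4o"
llm.set_default_model(None)                      # passing None deletes the file
assert llm.get_default_model() == "gpt-4o-mini"  # falls back to DEFAULT_MODEL

# The embedding default has no built-in fallback (assuming none was ever set):
assert llm.get_default_embedding_model() is None
```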
30 changes: 4 additions & 26 deletions llm/cli.py
@@ -10,6 +10,8 @@
Template,
UnknownModelError,
encode,
get_default_model,
get_default_embedding_model,
get_embedding_models_with_aliases,
get_embedding_model_aliases,
get_embedding_model,
@@ -20,6 +22,8 @@
get_models_with_aliases,
user_dir,
set_alias,
set_default_model,
set_default_embedding_model,
remove_alias,
)

@@ -41,8 +45,6 @@

warnings.simplefilter("ignore", ResourceWarning)

-DEFAULT_MODEL = "gpt-4o-mini"

DEFAULT_TEMPLATE = "prompt: "


@@ -1574,30 +1576,6 @@ def _truncate_string(s, max_length=100):
return s


-def get_default_model(filename="default_model.txt", default=DEFAULT_MODEL):
-    path = user_dir() / filename
-    if path.exists():
-        return path.read_text().strip()
-    else:
-        return default
-
-
-def set_default_model(model, filename="default_model.txt"):
-    path = user_dir() / filename
-    if model is None and path.exists():
-        path.unlink()
-    else:
-        path.write_text(model)
-
-
-def get_default_embedding_model():
-    return get_default_model("default_embedding_model.txt", None)
-
-
-def set_default_embedding_model(model):
-    set_default_model(model, "default_embedding_model.txt")


def logs_db_path():
return user_dir() / "logs.db"

Expand Down
17 changes: 12 additions & 5 deletions llm/default_plugins/openai_models.py
@@ -35,6 +35,9 @@ def register_models(register):
# GPT-4o
register(Chat("gpt-4o"), aliases=("4o",))
register(Chat("gpt-4o-mini"), aliases=("4o-mini",))
# o1
register(Chat("o1-preview", can_stream=False, allows_system_prompt=False))
register(Chat("o1-mini", can_stream=False, allows_system_prompt=False))
# The -instruct completion model
register(
Completion("gpt-3.5-turbo-instruct", default_max_tokens=256),
@@ -248,7 +251,6 @@ def validate_logit_bias(cls, logit_bias):
class Chat(Model):
needs_key = "openai"
key_env_var = "OPENAI_API_KEY"
-    can_stream: bool = True

default_max_tokens = None

@@ -268,6 +270,8 @@ def __init__(
api_version=None,
api_engine=None,
headers=None,
can_stream=True,
allows_system_prompt=True,
):
self.model_id = model_id
self.key = key
@@ -277,12 +281,16 @@
self.api_version = api_version
self.api_engine = api_engine
self.headers = headers
self.can_stream = can_stream
self.allows_system_prompt = allows_system_prompt

def __str__(self):
return "OpenAI Chat: {}".format(self.model_id)

def execute(self, prompt, stream, response, conversation=None):
messages = []
if prompt.system and not self.allows_system_prompt:
raise NotImplementedError("Model does not support system prompts")
current_system = None
if conversation is not None:
for prev_response in conversation.responses:
@@ -325,7 +333,7 @@ def execute(self, prompt, stream, response, conversation=None):
stream=False,
**kwargs,
)
-            response.response_json = remove_dict_none_values(completion.dict())
+            response.response_json = remove_dict_none_values(completion.model_dump())
yield completion.choices[0].message.content

def get_client(self):
@@ -339,8 +347,7 @@ def get_client(self):
if self.api_engine:
kwargs["engine"] = self.api_engine
if self.needs_key:
-            if self.key:
-                kwargs["api_key"] = self.key
+            kwargs["api_key"] = self.get_key()
else:
# OpenAI-compatible models don't need a key, but the
# openai client library requires one
@@ -415,7 +422,7 @@ def execute(self, prompt, stream, response, conversation=None):
stream=False,
**kwargs,
)
-            response.response_json = remove_dict_none_values(completion.dict())
+            response.response_json = remove_dict_none_values(completion.model_dump())
yield completion.choices[0].text


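One consequence of the new `allows_system_prompt` flag: the o1 models reject system prompts at execution time. A hedged sketch (assumes an OpenAI key is configured and your account has access to the o1 preview models):
```python
import llm

model = llm.get_model("o1-preview")
response = model.prompt("Say hi", system="Be terse")
try:
    # Execution is lazy, so the error surfaces when the response is read:
    print(response.text())
except NotImplementedError as ex:
    print(ex)  # "Model does not support system prompts"
```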
2 changes: 1 addition & 1 deletion setup.py
@@ -1,7 +1,7 @@
from setuptools import setup, find_packages
import os

-VERSION = "0.15"
+VERSION = "0.16"


def get_long_description():
14 changes: 14 additions & 0 deletions tests/test_llm.py
@@ -5,6 +5,7 @@
from llm.migrations import migrate
import json
import os
import pathlib
import pytest
import re
import sqlite_utils
@@ -556,3 +557,16 @@ def test_llm_user_dir(tmpdir, monkeypatch):
user_dir2 = llm.user_dir()
assert user_dir == str(user_dir2)
assert os.path.exists(user_dir)


def test_model_defaults(tmpdir, monkeypatch):
user_dir = str(tmpdir / "u")
monkeypatch.setenv("LLM_USER_PATH", user_dir)
config_path = pathlib.Path(user_dir) / "default_model.txt"
assert not config_path.exists()
assert llm.get_default_model() == "gpt-4o-mini"
assert llm.get_model().model_id == "gpt-4o-mini"
llm.set_default_model("gpt-4o")
assert config_path.exists()
assert llm.get_default_model() == "gpt-4o"
assert llm.get_model().model_id == "gpt-4o"
