register_models() plugin hook #65

Merged
merged 66 commits into main from register-models on Jul 10, 2023

Conversation

simonw commented Jun 26, 2023

TODO:

simonw commented Jun 26, 2023

For continue mode, it's all about building the prompt.

Here's the current implementation:

llm/llm/cli.py

Lines 176 to 184 in 9190051

chat_id, history = get_history(_continue)
history_model = None
if history:
    for entry in history:
        if entry.get("system"):
            messages.append({"role": "system", "content": entry["system"]})
        messages.append({"role": "user", "content": entry["prompt"]})
        messages.append({"role": "assistant", "content": entry["response"]})
        history_model = entry["model"]

The key bit of get_history() is this:

llm/llm/cli.py

Lines 552 to 555 in 9190051

rows = db["log"].rows_where(
    "id = ? or chat_id = ?", [chat_id, chat_id], order_by="id"
)
return chat_id, rows

So maybe this is all about building a prompt in an alternative way, perhaps like this:

# History is a list of logged messages:
prompt = Prompt.from_history(history, prompt, system)
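Here's a rough sketch of what that classmethod could look like, assuming history rows shaped like the log entries above - the Prompt constructor and field names here are assumptions, not the final API:

# Hypothetical sketch only - the real Prompt class may differ
class Prompt:
    def __init__(self, prompt, system=None, messages=None):
        self.prompt = prompt
        self.system = system
        self.messages = messages or []

    @classmethod
    def from_history(cls, history, prompt, system=None):
        # history: logged rows with "system", "prompt" and "response" keys
        messages = []
        for entry in history:
            if entry.get("system"):
                messages.append({"role": "system", "content": entry["system"]})
            messages.append({"role": "user", "content": entry["prompt"]})
            messages.append({"role": "assistant", "content": entry["response"]})
        return cls(prompt, system=system, messages=messages)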

simonw commented Jun 26, 2023

OpenAI chat models work by building up that messages= array, but other models could potentially also support conversations through custom prompt assembly - by putting together a User: X\nAssistant: Y\nUser: Z\nAssistant: string for example.
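For example, a non-chat model could flatten the history into a single string with something like this (an illustrative sketch, not part of the library):

# Illustrative helper, not part of llm
def flatten_history(history, prompt):
    # Build a User:/Assistant: transcript ending with the new prompt
    lines = []
    for entry in history:
        lines.append("User: " + entry["prompt"])
        lines.append("Assistant: " + entry["response"])
    lines.append("User: " + prompt)
    lines.append("Assistant:")
    return "\n".join(lines)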

simonw commented Jun 26, 2023

Untangling chat mode for PaLM 2 is a bigger job than I want to take on right now - more notes here: #20 (comment)

simonw commented Jun 26, 2023

I'm not happy with how streaming works yet.

I think models should have the following methods (a rough sketch in code follows this list):

  • .prompt(...) - run a prompt and return the response
  • .stream(...) - run a prompt and return an object that can be used to iterate over a streaming response
  • .chain(...) - run a prompt with the ability for it to trigger additional prompt executions (needed for things like OpenAI functions). Returns an iterator over those prompt responses.
  • .chain_stream(...) - same again, but each returned object can also be iterated over to obtain tokens from a stream.
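Here's what that interface could look like - the method names come from the list above, but the signatures and types are assumptions:

# Hypothetical interface sketch; signatures are assumptions
from abc import ABC, abstractmethod
from typing import Iterator


class Model(ABC):
    @abstractmethod
    def prompt(self, prompt: str, **options) -> "Response":
        "Run a prompt and return the completed response."

    @abstractmethod
    def stream(self, prompt: str, **options) -> Iterator[str]:
        "Run a prompt, yielding tokens as they arrive."

    def chain(self, prompt: str, **options) -> Iterator["Response"]:
        "Run a prompt that may trigger follow-up prompts (e.g. OpenAI functions)."
        raise NotImplementedError

    def chain_stream(self, prompt: str, **options) -> Iterator[Iterator[str]]:
        "Like chain(), but each yielded response can itself be iterated for tokens."
        raise NotImplementedError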

simonw commented Jul 1, 2023

I was contemplating whether a Model even needs to be a class - could I instead define it purely in terms of functions?

Then I remembered that there are some LLMs that run locally that have significant startup costs (loading data into memory/the GPU) - which is a good fit for a class, because that way they can be loaded once and then reused for subsequent calls to .prompt() and suchlike.

simonw commented Jul 1, 2023

Copying in some miscellaneous notes I made on the API design:

The streaming vs not streaming thing still feels a bit ugly.

The key problem I'm trying to solve is providing a generic interface to an LLM that will enable the following:

  • CLI access through the same tool
  • Comparison prompts across multiple LLMs
  • Unified storage of prompts in SQLite
  • Web interface for all of them
  • Unified prompt template interface
  • The continue conversation feature
  • Basic secret key management

I want the simplest possible API to implement in order to solve all of these - adding new models should be as easy as possible.

So maybe two methods: prompt(...) and stream(...)

Perhaps have a Model and StreamingModel(Model) class - models can subclass either, if you subclass StreamingModel then you have to implement both methods.

If there is ever a model that only does streaming, the subclass can still implement prompt() and have that call stream().
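A minimal sketch of that split, assuming the two-method design described above (not the eventual class hierarchy):

# Sketch of the Model / StreamingModel split - not the actual hierarchy
class Model:
    def prompt(self, text, **options):
        raise NotImplementedError


class StreamingModel(Model):
    def stream(self, text, **options):
        raise NotImplementedError

    def prompt(self, text, **options):
        # A streaming-only model can satisfy prompt() by draining its own stream
        return "".join(self.stream(text, **options))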

Maybe Prompt and Response could be inner classes on the Model subclass:

class VertexModel(Model):
    class Options(Model.Options):
        temperature: float
    class Response:
        ...

Prompt and Response could be responsible for turning themselves into loggable records.

It would be good to have a Python library utility mechanism for this logging, since then users could easily log the prompts their app is using.

Maybe it's the Model that manages logging since that can see both prompt and response?

Logging should be off by default but easy to opt into, maybe by providing a Log instance or callback?

A Record dataclass might be good for representing that, along with an llm.log.Database class that can write those records to SQLite.
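A hedged sketch of that Record / Database idea - the field names and the sqlite-utils usage here are assumptions, not the implementation:

# Hypothetical Record/Database sketch; field names are assumptions
import datetime
from dataclasses import asdict, dataclass, field

import sqlite_utils


@dataclass
class Record:
    model: str
    prompt: str
    system: str
    response: str
    timestamp: str = field(
        default_factory=lambda: datetime.datetime.utcnow().isoformat()
    )


class Database:
    def __init__(self, path):
        self.db = sqlite_utils.Database(path)

    def log(self, record: Record):
        # sqlite-utils creates the "log" table on first insert
        self.db["log"].insert(asdict(record))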

So the internals documentation ends up covering these concepts:

  • Model
  • Template
  • Prompt
  • Response
  • Result
  • Logger?
  • Key
  • alias

For OpenAI functions I could introduce a .sequence() method which can run more than one prompt (or .multi() or .chain()).

So the API for a model is:

response = model.prompt(...)

Or:

for chunk in model.stream(...):
    print(chunk, end="")

for response in model.chain(...):
    print(response.text())

Maybe this:

for response in model.chain_stream(...):
    for token in response:
        print(token, end="")

I'm going with response.text() and not response.text because you cannot await a property.

What abstraction could I build so other chains can be easily constructed? For example, the user could define a function that takes a response and decides whether there should be another prompt (the functions support could then be built on top of that):

def next(response, prompt):
    return None # or str or Prompt

model.chain(next)

model.chain(lambda response: "fact check this: $input", once=True)

So once=True means only run the chain function once. Or should that be the default, with repeat=True to keep it going?
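To make the control flow concrete, here's a rough sketch of a chain driver built on that user-supplied next() callback - the names and the once= behaviour are illustrative assumptions:

# Illustrative chain driver; the once= semantics are an assumption
def run_chain(model, first_prompt, next, once=False):
    # Run the first prompt, then keep asking next() for follow-ups
    response = model.prompt(first_prompt)
    yield response
    while True:
        follow_up = next(response, first_prompt)
        if follow_up is None:
            break
        response = model.prompt(follow_up)
        yield response
        if once:
            break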

simonw commented Jul 1, 2023

I'm going to try a bit of documentation-driven development here.

simonw added a commit that referenced this pull request Jul 1, 2023
simonw commented Jul 1, 2023

Here's that first draft of the internals documentation - next step, actually implement what's described there: https://github.com/simonw/llm/blob/c2ec8a9c60ac38d152ed48ba8c7c067c8d2c9859/docs/python-api.md

simonw mentioned this pull request Jul 1, 2023
simonw commented Jul 1, 2023

The default implementation of continue mode can be really basic: it just collects the text of the previous prompts and responses into a list and joins them with newlines.
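A minimal sketch of that default behaviour, assuming responses expose their original prompt text and a .text() method as discussed earlier in this thread:

# Sketch; assumes response.prompt.prompt and response.text() exist
def build_continue_prompt(previous_responses, new_prompt):
    # Interleave earlier prompts and responses, then append the new prompt
    parts = []
    for response in previous_responses:
        parts.append(response.prompt.prompt)
        parts.append(response.text())
    parts.append(new_prompt)
    return "\n".join(parts)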

simonw commented Jul 1, 2023

I'm still not clear on the best way to truncate messages in continue mode. For now I'm going to leave that and allow the model to return an error, but it would be good to have a strategy for automatic truncation later on.

simonw commented Jul 1, 2023

I don't think I should solve continuation mode until I've re-implemented logging to the database.

Although... I do want continue mode to be possible even without having database logging, at least at the Python API layer.

simonw commented Jul 1, 2023

Got all the tests passing, partly by disabling the DB logging test.

simonw commented Jul 1, 2023

E       AssertionError: assert 'id          ...03T20:26:40\n' == 'id          ...03T13:26:40\n'
E         Skipping 89 identical leading characters in diff, use -v to show
E         - 020-05-03T13:26:40
E         ?           ^^
E         + 020-05-03T20:26:40
E         ?           ^^
E         - babbage:2020-05-03    openai      2020-05-03T13:26:40
E         ?                                              ^^
E         + babbage:2020-05-03    openai      2020-05-03T20:26:40
E         ?                                              ^^

I think that test fails because of timezone differences between GitHub Actions and my laptop.

simonw commented Jul 1, 2023

Docs can also be previewed here: https://llm--65.org.readthedocs.build/en/65/python-api.html

simonw commented Jul 1, 2023

I think a Response should have a .prompt property that leads back to the prompt used for that response.

Maybe there should be a way in which these chain together, for the purposes of modeling conversations in situations where the SQLite log isn't being used?

Then perhaps you could do this:

response = model.prompt("Ten names for a pet pelican")
print(response.text())
response2 = response.reply("Make them more grandiose")
print(response2.text())

The problem with .reply() is: is it the same thing as .prompt() or .stream(), or something else?

Does it make sense for that reply() method to exist on Response as opposed to on Model or even on Prompt?

simonw commented Jul 1, 2023

Another option: the model itself could keep an in-memory cache of its previous prompts, such that you can then reply via the model.

I'm not keen on this though, because the conversation state shouldn't be shared by every user of the model instance in a situation like llm web where the model may be serving multiple conversations at once.

simonw commented Jul 1, 2023

The current signature for Response:

llm/llm/models.py

Lines 28 to 33 in 69ce584

class Response(ABC):
    def __init__(self, prompt: Prompt):
        self.prompt = prompt
        self._chunks = []
        self._debug = {}
        self._done = False

If it also grew a .model property for tracking the model instance that created it, it would have enough to execute replies to itself.
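A sketch of how that could look - the model= constructor argument and the previous= parameter passed back to the model are assumptions, building on the signature above:

# Hypothetical sketch; previous= is an assumed parameter
class Response:
    def __init__(self, prompt, model):
        self.prompt = prompt
        self.model = model  # the model instance that created this response
        self._chunks = []
        self._done = False

    def text(self):
        return "".join(self._chunks)

    def reply(self, new_prompt):
        # Delegate back to the owning model, carrying this response as history
        return self.model.prompt(new_prompt, previous=[self])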

simonw commented Jul 1, 2023

What would this look like if everything was represented in terms of chains of Prompts and Responses? Those chains could then be serialized and deserialized to SQLite, or to JSON or other formats too.

Especially when functions start coming into play, there's something very interesting about storing a high-fidelity representation of the full sequence of prompts and responses that led to the most recent state.
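For example, a chain of responses could be serialized to JSON with something like this sketch - the field names are assumptions, mirroring the Response signature above:

# Sketch only; field names are assumptions
import json


def serialize_chain(responses):
    return json.dumps(
        [
            {
                "prompt": response.prompt.prompt,
                "system": getattr(response.prompt, "system", None),
                "response": response.text(),
            }
            for response in responses
        ],
        indent=2,
    )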

simonw commented Jul 1, 2023

Whether or not something should stream is currently a property of the Response. That works: the .reply() method could pass on the streaming state to the next prompt that is executed.

simonw commented Jul 1, 2023

_______________ ERROR collecting tests/test_cli_openai_models.py _______________
ImportError while importing test module '/home/runner/work/llm/llm/tests/test_cli_openai_models.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.8.17/x64/lib/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_cli_openai_models.py:2: in <module>
    from llm.cli import cli
/opt/hostedtoolcache/Python/3.8.17/x64/lib/python3.8/site-packages/llm/cli.py:7: in <module>
    from .plugins import pm, get_plugins, get_model_aliases, get_models_with_aliases
/opt/hostedtoolcache/Python/3.8.17/x64/lib/python3.8/site-packages/llm/plugins.py:21: in <module>
    mod = importlib.import_module(plugin)
/opt/hostedtoolcache/Python/3.8.17/x64/lib/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'llm.default_plugins'

simonw commented Jul 1, 2023

Something very broken there. I tried python setup.py sdist and then installing llm into a fresh virtualenv:

pip install ~/Dropbox/Development/llm/dist/llm-0.4.1.tar.gz

And I get this error when I run it:

% llm --help
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/foop-q39PfZK-/bin/llm", line 5, in <module>
    from llm.cli import cli
  File "/Users/simon/.local/share/virtualenvs/foop-q39PfZK-/lib/python3.11/site-packages/llm/cli.py", line 7, in <module>
    from .plugins import pm, get_plugins, get_model_aliases, get_models_with_aliases
  File "/Users/simon/.local/share/virtualenvs/foop-q39PfZK-/lib/python3.11/site-packages/llm/plugins.py", line 21, in <module>
    mod = importlib.import_module(plugin)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'llm.default_plugins'

simonw commented Jul 1, 2023

Fixed the setup.py bundling issue. Tests all pass now!

simonw commented Jul 1, 2023

Now that some models live in llm.default_plugins.openai_models, they are pretty inconvenient to import.

I'm tempted to add this helper:

from llm import get_model
gpt4 = get_model("gpt-4")

This will provide Python API level access to both the model plugins mechanism and the aliases mechanism.
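A hypothetical sketch of how that helper could resolve names and aliases through the plugin registry - get_models_with_aliases() appears in the import tracebacks above, but the attribute names used here are assumptions:

# Sketch; attribute names (.model, .model_id, .aliases) are assumptions
class UnknownModelError(KeyError):
    pass


def get_model(name):
    from llm.plugins import get_models_with_aliases

    for entry in get_models_with_aliases():
        if name == entry.model.model_id or name in entry.aliases:
            return entry.model
    raise UnknownModelError("Unknown model: " + name)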

simonw mentioned this pull request Jul 1, 2023
simonw commented Jul 10, 2023

I wanted to rebase this branch, but GitHub said there were conflicts.

Following https://stackoverflow.com/a/50012219/6083 I ran these commands locally:

git checkout register-models
git rebase main
# There was a conflict in setup.py which I fixed
git add setup.py
git rebase --continue
git push --force-with-lease

simonw merged commit 845d0f0 into main Jul 10, 2023
10 checks passed
simonw added a commit that referenced this pull request Jul 10, 2023
simonw deleted the register-models branch July 10, 2023 15:39
simonw commented Jul 10, 2023

I don't like how that git push --force-with-lease seems to have reset the commit date on all of my commits though.

simonw commented Jul 10, 2023

Looks like I can fix that with:

git filter-repo --commit-callback '
    if commit.committer_date == b"Mon Jul 10 08:39:00 2023 -0700":
        commit.committer_date = commit.author_date
' --force
Parsed 172 commits
New history written in 0.03 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 7ac007a Show error for --continue mode, remove deleted code
Enumerating objects: 907, done.
Counting objects: 100% (907/907), done.
Delta compression using up to 12 threads
Compressing objects: 100% (271/271), done.
Writing objects: 100% (907/907), done.
Total 907 (delta 621), reused 907 (delta 621), pack-reused 0
Completely finished after 0.16 seconds.

simonw added a commit that referenced this pull request Jul 10, 2023
simonw commented Jul 10, 2023

That didn't quite work - it didn't actually update the commit dates.

This helped debug it:

git filter-repo --commit-callback '
    print(repr(commit.committer_date))
' --force

Turns out those dates look like this:

b'1689001152 -0700'
b'1689002848 -0700'
b'1689003110 -0700'
b'1689003276 -0700'

So I ran this command:

git filter-repo --commit-callback '
    if commit.committer_date == b"1689003540 -0700":
        commit.committer_date = commit.author_date
' --force

And then force pushed it all to main:

git push --force-with-lease origin main

simonw commented Jul 10, 2023

Turned that into a TIL: https://til.simonwillison.net/git/git-filter-repo
