Add support for Google's PaLM 2 #20
I have access now. I managed to get an API key I can use with the text-bison-001 model via Google Vertex AI: https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/text-bison

The API call looks like this:

url = "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:generateText?key={}".format(
    api_key
)
response = requests.post(
    url,
    json={"prompt": {"text": prompt}},
    headers={"Content-Type": "application/json"},
)
output = response.json()["candidates"][0]["output"]
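The request body can also carry sampling options. Here is a sketch with those added - the field names (temperature, candidateCount, maxOutputTokens, topP, topK) are my reading of the REST reference plus the defaults the models list below reports, so treat them as unverified:

import requests

api_key = "..."  # Generative Language API key
prompt = "Three good names for a pet pelican"

url = (
    "https://generativelanguage.googleapis.com/v1beta2/models/"
    "text-bison-001:generateText?key={}".format(api_key)
)
response = requests.post(
    url,
    json={
        "prompt": {"text": prompt},
        # Field names below are assumptions, not yet checked against the docs:
        "temperature": 0.7,
        "candidateCount": 1,
        "maxOutputTokens": 256,
        "topP": 0.95,
        "topK": 40,
    },
    headers={"Content-Type": "application/json"},
)
# Each candidate is a dict with an "output" key
for candidate in response.json()["candidates"]:
    print(candidate["output"])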
The thing I'm having trouble with here is what to call this. I want a command similar to Naming options:
There are a ton of other models available in the Vertex "model garden": https://console.cloud.google.com/vertex-ai/model-garden - T5-FLAN, Stable Diffusion, BLIP, all sorts of things. It's so very confusing in there! Many of them don't seem to have HTTP API endpoints - some appear to be available only via a notebook interface.
I'm tempted to go with Maybe the vendors themselves are a distraction - the thing that matters is the model. I've kind of broken this already though by having GPT-4 as a
For PaLM 2 itself I think the models available to me are
I'm going to land a
This looks useful: https://developers.generativeai.google/api/rest/generativelanguage/models/list
I currently get this:

{
"models": [
{
"name": "models/chat-bison-001",
"version": "001",
"displayName": "Chat Bison",
"description": "Chat-optimized generative language model.",
"inputTokenLimit": 4096,
"outputTokenLimit": 1024,
"supportedGenerationMethods": [
"generateMessage"
],
"temperature": 0.25,
"topP": 0.95,
"topK": 40
},
{
"name": "models/text-bison-001",
"version": "001",
"displayName": "Text Bison",
"description": "Model targeted for text generation.",
"inputTokenLimit": 8196,
"outputTokenLimit": 1024,
"supportedGenerationMethods": [
"generateText"
],
"temperature": 0.7,
"topP": 0.95,
"topK": 40
},
{
"name": "models/embedding-gecko-001",
"version": "001",
"displayName": "Embedding Gecko",
"description": "Obtain a distributed representation of a text.",
"inputTokenLimit": 1024,
"outputTokenLimit": 1,
"supportedGenerationMethods": [
"embedText"
]
}
]
}

So no
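For reference, a minimal sketch of hitting that ListModels endpoint with requests - I'm assuming the same ?key= style authentication works here as for generateText:

import requests

api_key = "..."  # Generative Language API key

response = requests.get(
    "https://generativelanguage.googleapis.com/v1beta2/models",
    params={"key": api_key},
)
# Print each model's name and the generation methods it supports
for model in response.json()["models"]:
    print(model["name"], model["supportedGenerationMethods"])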
https://developers.generativeai.google/api/rest/generativelanguage/models/countMessageTokens can count tokens:
{
"tokenCount": 23
}
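A sketch of what that call might look like, assuming the request body takes the same prompt/messages shape as generateMessage (I haven't confirmed the exact field names):

import requests

api_key = "..."  # Generative Language API key

url = (
    "https://generativelanguage.googleapis.com/v1beta2/models/"
    "chat-bison-001:countMessageTokens?key={}".format(api_key)
)
response = requests.post(
    url,
    json={"prompt": {"messages": [{"content": "How many tokens is this?"}]}},
    headers={"Content-Type": "application/json"},
)
print(response.json()["tokenCount"])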
I'm just going to ship
Where should it get the API token from? The way I do it for OpenAI tokens right now is bad and needs fixing: Lines 162 to 173 in 293f306.
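One option, sketched purely as an illustration (the get_vertex_api_key helper below is hypothetical, not existing code in this repo): check an environment variable first, then fall back to a keys file.

import json
import os
import pathlib

import click


def get_vertex_api_key():
    # Hypothetical helper: an environment variable takes precedence
    key = os.environ.get("PALM_API_KEY")
    if key:
        return key
    # Fall back to a JSON keys file (assumed location, for illustration only)
    keys_path = pathlib.Path.home() / ".llm" / "keys.json"
    if keys_path.exists():
        keys = json.loads(keys_path.read_text())
        if "palm" in keys:
            return keys["palm"]
    raise click.ClickException(
        "No PaLM API key found - set PALM_API_KEY or add 'palm' to keys.json"
    )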
My code so far:

@cli.command()
@click.argument("prompt", required=False)
@click.option("-m", "--model", help="Model to use", default="text-bison-001")
@click.option("-n", "--no-log", is_flag=True, help="Don't log to database")
def palm2(prompt, model, no_log):
    "Execute a prompt against a PaLM 2 model"
    if prompt is None:
        # Read from stdin instead
        prompt = sys.stdin.read()
    api_key = get_vertex_api_key()
    # Use the selected model rather than hard-coding text-bison-001
    url = "https://generativelanguage.googleapis.com/v1beta2/models/{}:generateText?key={}".format(
        model, api_key
    )
    response = requests.post(
        url,
        json={"prompt": {"text": prompt}},
        headers={"Content-Type": "application/json"},
    )
    output = response.json()["candidates"][0]["output"]
    log(no_log, "vertex", None, prompt, output, model)
    click.echo(output)
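There's no error handling yet. A minimal sketch of what could replace the response.json() line above - the error envelope is an assumption based on the standard Google API error format, and the safety-filter behavior is a guess:

data = response.json()

# Standard Google API error envelope (assumed): {"error": {"code": ..., "message": ..., "status": ...}}
if "error" in data:
    raise click.ClickException(data["error"]["message"])

# Guess: a prompt blocked by safety filters comes back with no candidates
candidates = data.get("candidates") or []
if not candidates:
    raise click.ClickException("No output returned - the prompt may have been filtered")

output = candidates[0]["output"]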
if prompt is None:
    # Read from stdin instead
    prompt = sys.stdin.read()

This poses a problem with the UX. This is why I suggested a fix for it in PR #19.
That's a deliberate design decision at the moment - it means you can run the command with no prompt argument and then type, paste or pipe in the prompt. There are other common unix commands that work like this. Since it's possible to detect this situation, perhaps a message to stderr reminding the user to type or paste in content and hit Ctrl-D would help.
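A minimal sketch of how that detection could work, using sys.stdin.isatty() - not what the CLI does today, just an illustration:

import sys


def read_prompt():
    # When stdin is an interactive terminal, remind the user what to do;
    # when it is a pipe or redirected file, read it silently.
    if sys.stdin.isatty():
        print("Type or paste your prompt, then press Ctrl-D:", file=sys.stderr)
    return sys.stdin.read()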
I'm not sure I understand. When I run it... oh, my bad! Okay, I hit Ctrl-D and it worked. As of now, I would find it better to print the helper message by default, as I suggested in my pull request. However, an instruction shown to the user via stderr would also work.
Now that I've renamed
I'll support
In that new prototype branch:
Figuring out chat mode for Vertex/PaLM 2 is proving hard. https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/1?project=cloud-vision-ocr-382418 talks about "PaLM 2 for Chat". https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/generative_ai/chat.py seems to be the most relevant example code:

from vertexai.preview.language_models import ChatModel, InputOutputTextPair


def science_tutoring(temperature: float = 0.2) -> None:
    chat_model = ChatModel.from_pretrained("chat-bison@001")
    parameters = {
        "temperature": temperature,  # Temperature controls the degree of randomness in token selection.
        "max_output_tokens": 256,  # Token limit determines the maximum amount of text output.
        "top_p": 0.95,  # Tokens are selected from most probable to least until the sum of their probabilities equals the top_p value.
        "top_k": 40,  # A top_k of 1 means the selected token is the most probable among all tokens.
    }
    chat = chat_model.start_chat(
        context="My name is Miles. You are an astronomer, knowledgeable about the solar system.",
        examples=[
            InputOutputTextPair(
                input_text="How many moons does Mars have?",
                output_text="The planet Mars has two moons, Phobos and Deimos.",
            ),
        ],
    )
    response = chat.send_message(
        "How many planets are there in the solar system?", **parameters
    )
    print(f"Response from Model: {response.text}")
    return response

I think this is where
Buried deep in a class hierarchy, this looks like the code that actually constructs the JSON to call the API: https://github.com/googleapis/python-aiplatform/blob/c60773a7db8ce7a59d2cb5787dc90937776c0b8f/vertexai/language_models/_language_models.py#L697-L824

The API call then goes through this code:

prediction_response = self._model._endpoint.predict(
    instances=[prediction_instance],
    parameters=prediction_parameters,
)

I have not yet tracked down that
Some useful hints in https://github.com/google/generative-ai-docs/blob/main/site/en/tutorials/chat_quickstart.ipynb - including that PaLM 2 has a "context" concept which appears to be the same thing as an OpenAI system prompt:

reply = palm.chat(context="Speak like Shakespeare.", messages="Hello")
print(reply.last)

reply = palm.chat(
    context="Answer everything with a haiku, following the 5/7/5 rhyme pattern.",
    messages="How's it going?",
)
print(reply.last)
Based on that example notebook, I'm going to ditch the terminology "Vertex" and "PaLM 2" and just call it "PaLM". (They never released an API for PaLM 1). I'm also going to move my code out of the experimental plugin and into a |
Got this out of the debugger, after this:

import google.generativeai as palm

kwargs = {"messages": self.prompt.prompt}
if self.prompt.system:
    kwargs["context"] = self.prompt.system
response = palm.chat(**kwargs)
last = response.last
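As I read the quickstart notebook, the response object can also continue a conversation, which is probably how multi-turn chat will map onto this. A sketch, assuming palm.configure() for auth and a reply() method on the chat response - both worth re-checking against the library:

import google.generativeai as palm

palm.configure(api_key="...")  # Generative Language / PaLM API key

response = palm.chat(
    context="You are a helpful assistant.",
    messages="Three good names for a pet pelican",
)
print(response.last)

# Assumption based on the quickstart: reply() sends a follow-up turn in the same conversation
response = response.reply("Now do walruses")
print(response.last)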
That library doesn't have streaming support yet, issues here: |
The Here's GPT-4:
PaLM messes that one up:
This example from the PaLM example notebook does work though:
This was surprising:
I got back a
Extracted this out to a separate plugin: https://github.com/simonw/llm-palm

Closing this issue - future work will happen there instead.
It lives here now: https://github.com/simonw/llm-palm Refs #20