
Add support for Google's PaLM 2 #20

Closed
simonw opened this issue Jun 15, 2023 · 24 comments

Labels: enhancement (New feature or request)
Milestone: 0.4

Comments
simonw commented Jun 15, 2023

No description provided.

simonw added the enhancement label Jun 15, 2023
simonw commented Jun 15, 2023

I have access now. I managed to get an API key I can use with the text-bison-001 model via Google Vertex AI.

https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/text-bison

The API call looks like this:

url = "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:generateText?key={}".format(
    api_key
)
response = requests.post(
    url,
    json={"prompt": {"text": prompt}},
    headers={"Content-Type": "application/json"},
)
output = response.json()["candidates"][0]["output"]

simonw commented Jun 15, 2023

The thing I'm having trouble with here is what to call this.

I want a command similar to llm openai ... but for this family of models.

Naming options:

  • llm google
  • llm vertex (does anyone know what Vertex is?)
  • llm palm (but should this be palm2 and will there be non-palm models in here?)
  • llm bison (again, not sure people understand the terminology)

There are a ton of other models available in the Vertex "model garden": https://console.cloud.google.com/vertex-ai/model-garden - T5-FLAN, Stable Diffusion, BLIP, all sorts of things.

It's so very confusing in there! Many of them don't seem to have HTTP API endpoints - some appear to be available only via a notebook interface.

simonw commented Jun 15, 2023

I'm tempted to go with llm google purely for consistency with llm openai - then maybe llm anthropic can follow?

Maybe the vendors themselves are a distraction - the thing that matters is the model. I've kind of broken this already though by having GPT-4 as a -4 flag on what I initially called the chatgpt command.

simonw commented Jun 15, 2023

For PaLM 2 itself I think the models available to me are text-bison, code-bison, and code-gecko.

simonw commented Jun 15, 2023

I'm going to land a llm palm2 command for the moment, then come back to this issue and reconsider.

simonw commented Jun 15, 2023

This looks useful: https://developers.generativeai.google/api/rest/generativelanguage/models/list

curl "https://generativelanguage.googleapis.com/v1beta2/models?key=$PALM_API_KEY"

I currently get this:

{
  "models": [
    {
      "name": "models/chat-bison-001",
      "version": "001",
      "displayName": "Chat Bison",
      "description": "Chat-optimized generative language model.",
      "inputTokenLimit": 4096,
      "outputTokenLimit": 1024,
      "supportedGenerationMethods": [
        "generateMessage"
      ],
      "temperature": 0.25,
      "topP": 0.95,
      "topK": 40
    },
    {
      "name": "models/text-bison-001",
      "version": "001",
      "displayName": "Text Bison",
      "description": "Model targeted for text generation.",
      "inputTokenLimit": 8196,
      "outputTokenLimit": 1024,
      "supportedGenerationMethods": [
        "generateText"
      ],
      "temperature": 0.7,
      "topP": 0.95,
      "topK": 40
    },
    {
      "name": "models/embedding-gecko-001",
      "version": "001",
      "displayName": "Embedding Gecko",
      "description": "Obtain a distributed representation of a text.",
      "inputTokenLimit": 1024,
      "outputTokenLimit": 1,
      "supportedGenerationMethods": [
        "embedText"
      ]
    }
  ]
}

So no code-bison or code-gecko listed there.
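
A plugin could fetch this list and filter on supportedGenerationMethods to decide which models to register. A minimal sketch using requests (assumes PALM_API_KEY is set in the environment):

import os

import requests

response = requests.get(
    "https://generativelanguage.googleapis.com/v1beta2/models",
    params={"key": os.environ["PALM_API_KEY"]},
)
response.raise_for_status()
# Show each model that supports the generateText method
for model in response.json()["models"]:
    if "generateText" in model.get("supportedGenerationMethods", []):
        print(model["name"])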

simonw commented Jun 15, 2023

https://developers.generativeai.google/api/rest/generativelanguage/models/countMessageTokens can count tokens:

curl "https://generativelanguage.googleapis.com/v1beta2/models/chat-bison-001:countMessageTokens?key=$PALM_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
        "prompt": {
            "messages": [
                {"content":"How many tokens?"},
                {"content": "For this whole conversation?" }
            ]
        }
    }'
{
  "tokenCount": 23
}
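
The same call from Python, for reference (a sketch; again assumes PALM_API_KEY is set):

import os

import requests

response = requests.post(
    "https://generativelanguage.googleapis.com/v1beta2/models/chat-bison-001:countMessageTokens",
    params={"key": os.environ["PALM_API_KEY"]},
    json={
        "prompt": {
            "messages": [
                {"content": "How many tokens?"},
                {"content": "For this whole conversation?"},
            ]
        }
    },
)
print(response.json()["tokenCount"])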

simonw commented Jun 15, 2023

I'm just going to ship text-bison-001 for the moment. I'm going to call the command palm2.

simonw commented Jun 15, 2023

Where should it get the API token from? The way I do it for OpenAI tokens right now is bad and needs fixing:

llm/llm/cli.py, lines 162 to 173 at commit 293f306:

def get_openai_api_key():
    # Expand this to home directory / ~.openai-api-key.txt
    if "OPENAI_API_KEY" in os.environ:
        return os.environ["OPENAI_API_KEY"]
    path = os.path.expanduser("~/.openai-api-key.txt")
    # If the file exists, read it
    if os.path.exists(path):
        with open(path) as fp:
            return fp.read().strip()
    raise click.ClickException(
        "No OpenAI API key found. Set OPENAI_API_KEY environment variable or create ~/.openai-api-key.txt"
    )
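
One option would be to generalize that pattern into a shared helper, parameterized by vendor. A sketch (the PALM_API_KEY variable and ~/.palm-api-key.txt file name here are my assumptions, not settled decisions):

import os

import click


def get_api_key(env_var, filename, name):
    # Check the environment variable first, then fall back to a dotfile
    if env_var in os.environ:
        return os.environ[env_var]
    path = os.path.expanduser(filename)
    if os.path.exists(path):
        with open(path) as fp:
            return fp.read().strip()
    raise click.ClickException(
        "No {} API key found. Set {} environment variable or create {}".format(
            name, env_var, filename
        )
    )


def get_vertex_api_key():
    return get_api_key("PALM_API_KEY", "~/.palm-api-key.txt", "PaLM")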

simonw commented Jun 15, 2023

My code so far:

@cli.command()
@click.argument("prompt", required=False)
@click.option("-m", "--model", help="Model to use", default="text-bison-001")
@click.option("-n", "--no-log", is_flag=True, help="Don't log to database")
def palm2(prompt, model, no_log):
    "Execute a prompt against a PaLM 2 model"
    if prompt is None:
        # Read from stdin instead
        prompt = sys.stdin.read()
    api_key = get_vertex_api_key()
    url = "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:generateText?key={}".format(
        api_key
    )
    response = requests.post(
        url,
        json={"prompt": {"text": prompt}},
        headers={"Content-Type": "application/json"},
    )
    output = response.json()["candidates"][0]["output"]
    log(no_log, "vertex", None, prompt, output, model)
    click.echo(output)
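
Usage would then look something like this (hypothetical prompts, assuming the command ships as above):

llm palm2 'Three names for a pet walrus'
cat prompt.txt | llm palm2 --no-log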

sderev commented Jun 15, 2023

    if prompt is None:
        # Read from stdin instead
        prompt = sys.stdin.read()

This poses a UX problem: it's the reason llm appears to hang indefinitely when no argument is passed.

I suggested a fix for this in PR #19.

simonw commented Jun 15, 2023

    if prompt is None:
        # Read from stdin instead
        prompt = sys.stdin.read()

This poses a UX problem: it's the reason llm appears to hang indefinitely when no argument is passed.

That's a deliberate design decision at the moment - it means you can run llm and then copy-and-paste text into your terminal.

There are other common unix commands that work like this - cat and wc for example - so I'm not convinced it's a usability problem. Happy to hear further discussion around that though.

Since it's possible to detect this situation, perhaps a message to stderr reminding the user to type or paste in content and hit Ctrl+D when they are done would be appropriate? I've not seen any other commands that do that though.
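
Detecting it should just be a matter of checking sys.stdin.isatty(). A sketch of what that could look like in the command (the wording of the message is only a suggestion):

import sys

if prompt is None:
    if sys.stdin.isatty():
        # Interactive terminal: remind the user how to terminate their input
        click.echo(
            "Reading prompt from standard input - hit Ctrl+D when done",
            err=True,
        )
    prompt = sys.stdin.read()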

sderev commented Jun 15, 2023

That's a deliberate design decision at the moment - it means you can run llm and then copy-and-paste text into your terminal.

I'm not sure I understand. When I run llm, no matter what I paste in my terminal, it just keeps waiting. Even if I type "say hello" and press Enter.

Oh... my bad! Okay, I hit Ctrl + D and it responded. It's not very intuitive, though I may be alone in that category.

As of now, I would find it better to print the helper message by default, as I suggested in my pull request. However, if an instruction can be shown to the user via stderr, it might solve what I believe is a UX issue.

simonw added this to the 0.4 milestone Jun 15, 2023
simonw commented Jun 15, 2023

Now that I've renamed llm chatgpt to llm prompt I'm going to try adding this model to that command instead, so you would use it like so:

llm -m palm2 "Five surprising names for a wise owl"

I'll support -m text-bison as well.
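
Under the hood that probably just needs an alias table mapping shortcuts to full model names, something like this sketch (the exact names and structure are my assumptions):

MODEL_ALIASES = {
    "palm2": "text-bison-001",
    "text-bison": "text-bison-001",
}


def resolve_model(name):
    # Fall back to the name itself if it is not a known alias
    return MODEL_ALIASES.get(name, name)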

simonw commented Jun 26, 2023

In that new prototype branch:

% llm 'Two names for a beaver' -m palm2
Daggett and Dasher

simonw commented Jun 26, 2023

Figuring out chat mode for Vertex/PaLM2 is proving hard.

https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/1?project=cloud-vision-ocr-382418 talks about "PaLM 2 for Chat".

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/generative_ai/chat.py seems to be the most relevant example code:

from vertexai.preview.language_models import ChatModel, InputOutputTextPair

def science_tutoring(temperature: float = 0.2) -> None:
    chat_model = ChatModel.from_pretrained("chat-bison@001")
    parameters = {
        "temperature": temperature,  # Temperature controls the degree of randomness in token selection.
        "max_output_tokens": 256,  # Token limit determines the maximum amount of text output.
        "top_p": 0.95,  # Tokens are selected from most probable to least until the sum of their probabilities equals the top_p value.
        "top_k": 40,  # A top_k of 1 means the selected token is the most probable among all tokens.
    }
    chat = chat_model.start_chat(
        context="My name is Miles. You are an astronomer, knowledgeable about the solar system.",
        examples=[
            InputOutputTextPair(
                input_text="How many moons does Mars have?",
                output_text="The planet Mars has two moons, Phobos and Deimos.",
            ),
        ],
    )
    response = chat.send_message(
        "How many planets are there in the solar system?", **parameters
    )
    print(f"Response from Model: {response.text}")
    return response

I think this is where vertexai comes from: https://github.com/googleapis/python-aiplatform - pip install google-cloud-aiplatform

simonw commented Jun 26, 2023

Buried deep in a class hierarchy, this looks like the code that actually constructs the JSON to call the API: https://github.com/googleapis/python-aiplatform/blob/c60773a7db8ce7a59d2cb5787dc90937776c0b8f/vertexai/language_models/_language_models.py#L697-L824

The API call then goes through this code:

        prediction_response = self._model._endpoint.predict(
            instances=[prediction_instance],
            parameters=prediction_parameters,
        )

I have not yet tracked down that self._model._endpoint.predict() method.

simonw commented Jul 1, 2023

Some useful hints in https://github.com/google/generative-ai-docs/blob/main/site/en/tutorials/chat_quickstart.ipynb - including that PaLM 2 has a "context" concept which appears to be the same thing as an OpenAI system prompt:

reply = palm.chat(context="Speak like Shakespeare.", messages='Hello')
print(reply.last)
# Hello there, my good fellow! How fares thee this day?

reply = palm.chat(
    context="Answer everything with a haiku, following the 5/7/5 rhyme pattern.",
    messages="How's it going?"
)
print(reply.last)
# I am doing well
# I am learning and growing
# Every day is new

simonw commented Jul 1, 2023

Based on that example notebook, I'm going to ditch the terminology "Vertex" and "PaLM 2" and just call it "PaLM". (They never released an API for PaLM 1).

I'm also going to move my code out of the experimental plugin and into a llm-palm package which depends on google-generativeai.

simonw commented Jul 1, 2023

Got this out of the debugger, after this:

import google.generativeai as palm
kwargs = {"messages": self.prompt.prompt}
if self.prompt.system:
    kwargs["context"] = self.prompt.system

response = palm.chat(**kwargs)
last = response.last
(Pdb) pprint(response.to_dict())
{'candidate_count': None,
 'candidates': [{'author': '1',
                 'content': 'Here are three names for a dog:\n'
                            '\n'
                            '1. **Bailey** is a popular name for both male and '
                            'female dogs. It is of English origin and means '
                            '"bailiff" or "steward." Bailey is a friendly and '
                            'loyal dog that is always up for a good time.\n'
                            '2. **Luna** is a Latin name that means "moon." It '
                            'is a popular name for female dogs, but it can '
                            'also be used for male dogs. Luna is a beautiful '
                            'and intelligent dog that is always curious about '
                            'the world around her.\n'
                            '3. **Max** is a German name that means '
                            '"greatest." It is a popular name for male dogs, '
                            'but it can also be used for female dogs. Max is a '
                            'strong and courageous dog that is always willing '
                            'to protect his family.\n'
                            '\n'
                            'These are just a few of the many great names that '
                            'you could choose for your new dog. When choosing '
                            "a name, it is important to consider your dog's "
                            'personality and appearance. You should also '
                            'choose a name that you will be happy saying for '
                            'many years to come.'}],
 'context': '',
 'examples': [],
 'messages': [{'author': '0', 'content': 'three names for a dog'},
              {'author': '1',
               'content': 'Here are three names for a dog:\n'
                          '\n'
                          '1. **Bailey** is a popular name for both male and '
                          'female dogs. It is of English origin and means '
                          '"bailiff" or "steward." Bailey is a friendly and '
                          'loyal dog that is always up for a good time.\n'
                          '2. **Luna** is a Latin name that means "moon." It '
                          'is a popular name for female dogs, but it can also '
                          'be used for male dogs. Luna is a beautiful and '
                          'intelligent dog that is always curious about the '
                          'world around her.\n'
                          '3. **Max** is a German name that means "greatest." '
                          'It is a popular name for male dogs, but it can also '
                          'be used for female dogs. Max is a strong and '
                          'courageous dog that is always willing to protect '
                          'his family.\n'
                          '\n'
                          'These are just a few of the many great names that '
                          'you could choose for your new dog. When choosing a '
                          "name, it is important to consider your dog's "
                          'personality and appearance. You should also choose '
                          'a name that you will be happy saying for many years '
                          'to come.'}],
 'model': 'models/chat-bison-001',
 'temperature': None,
 'top_k': None,
 'top_p': None}

simonw commented Jul 1, 2023

The context (PaLM's version of the system prompt) isn't very strongly held to:

Here's GPT-4:

% llm -m 4 'three names for a pet pelican, succinctly' --system 'translate to french'
trois noms pour un pélican domestique, succinctement

PaLM messes that one up:

% llm -m palm 'three names for a pet pelican, succinctly' --system 'translate to french'
{'model': 'models/chat-bison-001', 'context': 'translate to french', 'examples': [], 'messages': [{'author': '0', 'content': 'three names for a pet pelican, succinctly'}, {'author': '1', 'content': 'Here are three names for a pet pelican:\n\n* Pete\n* Piper\n* Peanut'}], 'temperature': None, 'candidate_count': None, 'top_p': None, 'top_k': None, 'candidates': [{'author': '1', 'content': 'Here are three names for a pet pelican:\n\n* Pete\n* Piper\n* Peanut'}]}
Here are three names for a pet pelican:

* Pete
* Piper
* Peanut
% llm -m palm 'three names for a pet pelican, succinctly' --system 'you are a bot that translates everything to french'
{'model': 'models/chat-bison-001', 'context': 'you are a bot that translates everything to french', 'examples': [], 'messages': [{'author': '0', 'content': 'three names for a pet pelican, succinctly'}, {'author': '1', 'content': 'Here are three names for a pet pelican:\n\n* Pete\n* Piper\n* Peanut'}], 'temperature': None, 'candidate_count': None, 'top_p': None, 'top_k': None, 'candidates': [{'author': '1', 'content': 'Here are three names for a pet pelican:\n\n* Pete\n* Piper\n* Peanut'}]}
Here are three names for a pet pelican:

* Pete
* Piper
* Peanut

This example from the PaLM example notebook does work though:

% llm -m palm 'Hello' --system 'Speak like Shakespeare'                                                                
{'model': 'models/chat-bison-001', 'context': 'Speak like Shakespeare', 'examples': [], 'messages': [{'author': '0', 'content': 'Hello'}, {'author': '1', 'content': 'Hello there, my good fellow! How fares thee on this fine day?'}], 'temperature': None, 'candidate_count': None, 'top_p': None, 'top_k': None, 'candidates': [{'author': '1', 'content': 'Hello there, my good fellow! How fares thee on this fine day?'}]}
Hello there, my good fellow! How fares thee on this fine day?

simonw commented Jul 1, 2023

This was surprising:

llm -m palm --system "Translate to french" "I like pelicans a lot"

{'model': 'models/chat-bison-001', 'context': 'Translate to french', 'examples': [], 'messages': [{'author': '0', 'content': 'I like pelicans a lot'}, None], 'temperature': None, 'candidate_count': None, 'top_p': None, 'top_k': None, 'candidates': []}

I got back a None where I expected a message, and response.last returned None too.
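
Presumably the API returned zero candidates here, perhaps because the response was filtered, so the plugin needs to handle an empty candidates list rather than assuming response.last is set. A sketch (the error message is mine):

response = palm.chat(**kwargs)
if not response.candidates:
    # The API can return zero candidates, in which case response.last is None
    raise click.ClickException("No response was returned by the PaLM API")
output = response.last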

simonw added a commit to simonw/llm-palm that referenced this issue Jul 1, 2023
simonw commented Jul 1, 2023

Extracted this out to a separate plugin: https://github.com/simonw/llm-palm

Closing this issue - future work will happen there instead.
