Implement prompt/instruction templates #2439

Closed
tomaarsen opened this issue Jan 23, 2024 · 4 comments · Fixed by #2477
Comments

@tomaarsen
Collaborator

tomaarsen commented Jan 23, 2024

Hello!

Context

This issue describes a feature that I am planning to include in a release before v3, or alternatively in v3 of Sentence Transformers.

Details

Many recent works, e.g. Wang et al., 2024, Li & Li, 2023, Xiao et al., 2023, and many more, use instructions/prompts to improve their model performance and instruct the models on the specific task at hand.

Ideally, Sentence Transformers should support this more easily by allowing prompt/instruction templates to be stored in the model configuration. For example, we could include the following two options in the configuration (e.g. config_sentence_transformers.json):

{
    ...
    "prompts": {
        "classification": "Classify the following text:",
        "retrieval": "Retrieve semantically similar text:",
        "clustering": "Identify the topic or theme based on the text:",
    },
    "default_prompt_name": "classification",
}
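
As a rough illustration (not the actual implementation; resolve_prompt is a hypothetical helper), the lookup against such a configuration could work roughly like this:

# Hypothetical sketch of how a prompt could be resolved from such a config;
# `resolve_prompt` is illustrative, not part of Sentence Transformers.
config = {
    "prompts": {
        "classification": "Classify the following text:",
        "retrieval": "Retrieve semantically similar text:",
        "clustering": "Identify the topic or theme based on the text:",
    },
    "default_prompt_name": "classification",
}

def resolve_prompt(config, prompt=None, prompt_name=None):
    if prompt is not None:        # an explicit prompt takes precedence
        return prompt
    if prompt_name is not None:   # otherwise look up a named prompt
        return config["prompts"][prompt_name]
    default = config.get("default_prompt_name")
    return config["prompts"][default] if default else ""

print(resolve_prompt(config, prompt_name="clustering"))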

And then the SentenceTransformer.encode method would also support prompt and prompt_name arguments:

# Using a custom prompt
embeddings = model.encode(texts, prompt="Identify the topics:")
# Using a prompt from the config
embeddings = model.encode(texts, prompt_name="clustering")
# Using the default prompt from the config, if one is defined
embeddings = model.encode(texts)

I am still quite unsure about the names of all of these arguments; I don't think they're great. Additionally, I'm considering whether the prompt should be able to include {}, which would be filled via prompt.format(text). That would allow prompts like "Classify this text: {}. That was all.", but then the end of the formatted input (including the prompt suffix) would be cut off in the case of truncation, which is not great.
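
To make the difference concrete, here is a minimal sketch (apply_prompt is a hypothetical helper, not part of the library) of a prefix-style prompt versus a {}-style template:

# Minimal sketch of the two prompt styles discussed above;
# `apply_prompt` is hypothetical, not part of Sentence Transformers.
def apply_prompt(text: str, prompt: str) -> str:
    if "{}" in prompt:
        # Template style: the text is inserted into the prompt. Any prompt
        # text after the placeholder can be lost if the input is truncated.
        return prompt.format(text)
    # Prefix style: only the tail of the input text is at risk of truncation.
    return prompt + text

print(apply_prompt("The weather is lovely.", "Classify the following text: "))
print(apply_prompt("The weather is lovely.", "Classify this text: {}. That was all."))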

I'm definitely open to suggestions or ideas here!

cc @bwanglzu @ir2718 @johneckberg @aamir-s18 as I know you're interested in my TODO list.
cc @intfloat

  • Tom Aarsen
@arbi-dev

This would be an important feature: currently, users of these models either miss out on the instruction feature or have to concatenate the instruction and query with their own template to take advantage of it.

For best results it is probably best to stick as closely as possible to the format used by the relevant model during training (including punctuation etc.). E.g. BGE and INSTRUCTOR seem to prepend the instruction.
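
For reference, a sketch of the manual workaround described above (the model name and instruction string are illustrative; the exact wording should be taken from the relevant model card):

from sentence_transformers import SentenceTransformer

# Without built-in prompt support, users prepend the instruction themselves,
# matching the model's training format as closely as possible.
model = SentenceTransformer("BAAI/bge-base-en-v1.5")
instruction = "Represent this sentence for searching relevant passages: "
queries = ["how do instruction prompts improve retrieval?"]
query_embeddings = model.encode([instruction + q for q in queries])
passage_embeddings = model.encode(["Prompts prepend a task instruction to the input text."])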

@tomaarsen
Collaborator Author

One option is to allow users to specify prompts with {} in them, e.g. "Please embed the sentence {} into a short text.". Model authors can store their own prompts in their model configurations. That way, they can ensure that the prompts always correspond to whatever was used during training.

  • Tom Aarsen

@tomaarsen
Collaborator Author

We may also want to include a configuration option for whether the instruction should be included in the pooling output. For example, for INSTRUCTOR, the instruction tokens are excluded via attention masking when pooling.
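
For illustration, a rough, self-contained sketch in plain PyTorch (not Sentence Transformers internals) of excluding leading instruction tokens from mean pooling via the attention mask:

import torch

def mean_pool_excluding_instruction(token_embeddings, attention_mask, num_instruction_tokens):
    # Zero out the mask for the leading instruction tokens so they do not
    # contribute to the mean-pooled sentence embedding.
    mask = attention_mask.clone()
    mask[:, :num_instruction_tokens] = 0
    mask = mask.unsqueeze(-1).float()
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Toy example: batch of 1, 6 tokens, 4-dim embeddings, 2 instruction tokens.
embeddings = torch.randn(1, 6, 4)
attention_mask = torch.ones(1, 6)
pooled = mean_pool_excluding_instruction(embeddings, attention_mask, num_instruction_tokens=2)
print(pooled.shape)  # torch.Size([1, 4])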

  • Tom Aarsen

@ShengYun-Peng

Hi @tomaarsen, thanks for adding this new prompt feature to the library! I'm curious whether there's a way to use prompts together with the evaluators. Currently, I only find examples that pass prompt_name or prompt to model.encode, but the evaluators don't seem to take any prompt arguments, which prevents evaluating a full test set with the required prompts.
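
One possible workaround (not part of the library; the model name is a placeholder and this assumes a version where encode accepts a prompt argument) is to bind the prompt onto encode so that evaluators pick it up implicitly:

import functools
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model name
# Force a prompt into every encode() call; evaluators that internally call
# model.encode(...) would then embed all texts with this prompt.
model.encode = functools.partial(model.encode, prompt="Identify the topic or theme based on the text: ")
# evaluator = ...  # construct an evaluator as usual and call evaluator(model)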
