[prompt] update prompt api & add prefix template #3724

Merged
merged 20 commits into from
Nov 16, 2022

@LemonNoel LemonNoel commented Nov 10, 2022

PR types

New features

PR changes

APIs

Description

  1. Update the template string syntax with 2 new keywords and 7 new attributes.
  2. Add PrefixTemplate.
  3. Optimize the implementation of Verbalizer.
  4. Remove MultiMaskVerbalizer.
  5. Remove InputExample and InputFeatures and use a plain dict instead for flexibility; the key labels holds the label id of a data sample.
  6. Update the APIs in the text classification applications.
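
As a rough illustration of item 5, a data sample can be represented as a plain dict whose labels key stores the label id. This is only a sketch: the field name text_a, the label names, and the helper to_example are assumptions made for the example and are not taken from this PR.

```python
# Hypothetical label set; the real label names depend on the dataset.
label_names = ["negative", "positive"]
label_to_id = {name: idx for idx, name in enumerate(label_names)}


def to_example(text, label_name):
    """Build a dict-based sample; `labels` holds the label id, as the PR describes."""
    # `text_a` is an assumed field name, used here only for illustration.
    return {"text_a": text, "labels": label_to_id[label_name]}


example = to_example("The service was great.", "positive")
print(example)
```

Because a sample is an ordinary dict, extra fields can be added per task without changing a fixed class definition.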

Template new features

Keywords

  • options: Use a list of labels in the template, specified as either a label file path or the name of the option list in the example dict.
  • prefix: Add soft prefix tokens to the inputs of every layer.

Attributes

  • position: Common. Define the start position id of the following blocks.
  • token_type: Common. Define the token type id of the following blocks.
  • add_omask: For options. Add [O-MASK] to each option.
  • add_prompt: For options. Add prompt text to each option.
  • encoder: For soft and prefix. Define the encoder type.
  • hidden_size: For soft and prefix. Define the hidden size of the encoder.
  • length: For mask, soft and prefix. Define the number of tokens.
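
The keyword and attribute rules above can be sketched as a small checker. This is not PaddleNLP's parser: the template string format (a sequence of Python-style dict literals), the text keyword, the field name text_a, and the function parse_template are assumptions made for illustration. Only the keyword-to-attribute mapping comes from the lists above.

```python
import ast
import re

# Attributes that apply to every keyword ("Common" in the list above).
COMMON = {"position", "token_type"}

# Keyword-specific attributes, following the list above.
ALLOWED = {
    "options": {"add_omask", "add_prompt"},
    "soft": {"encoder", "hidden_size", "length"},
    "prefix": {"encoder", "hidden_size", "length"},
    "mask": {"length"},
    "text": set(),  # assumption: `text` is not described in this PR text
}


def parse_template(template):
    """Split a template string into dict blocks and check attribute usage."""
    blocks = []
    # Assumes flat blocks without nested braces.
    for raw in re.findall(r"\{[^{}]*\}", template):
        block = ast.literal_eval(raw)
        if not isinstance(block, dict):
            raise ValueError(f"block is not a dict: {raw}")
        keyword = next((key for key in block if key in ALLOWED), None)
        if keyword is None:
            raise ValueError(f"no known keyword in block: {raw}")
        extra = set(block) - {keyword} - COMMON - ALLOWED[keyword]
        if extra:
            raise ValueError(f"{sorted(extra)} not valid for '{keyword}'")
        blocks.append(block)
    return blocks


template = (
    "{'prefix': None, 'length': 4, 'encoder': 'lstm', 'hidden_size': 64}"
    "{'text': 'text_a', 'token_type': 0}"
    "{'mask': None, 'length': 2}"
)
for block in parse_template(template):
    print(block)
```

A block such as {'mask': None, 'encoder': 'lstm'} would be rejected by this sketch, because encoder is listed only for soft and prefix.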

@LemonNoel LemonNoel marked this pull request as ready for review November 15, 2022 06:53
@LemonNoel LemonNoel requested a review from ZHUI November 15, 2022 14:21
@LemonNoel LemonNoel changed the title from "[prompt] update template & add prefix template" to "[prompt] update prompt api & add prefix template" Nov 15, 2022
@LemonNoel LemonNoel requested a review from wawltor November 16, 2022 03:43

-4. PaddleNLP version: 2.3.5 (develop)
+4. PaddleNLP version: 2.4.2 (develop)
Collaborator

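How about going straight to version 2.4.3 here? Later references can use 2.4.3 directly as well.

It would be best to put the dependencies in requirements.txt.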

Contributor Author

Changed to version 2.4.3. The related dependencies are listed in the requirements_cpu.txt and requirements_gpu.txt files.

@@ -352,9 +353,9 @@ python infer.py --model_path_prefix checkpoints/export/model --data_dir ./data -
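 Configurable parameters:
 
 - `model_path_prefix`: Path and file prefix of the exported static graph model.
-- `model_name_or_path`: Name of a built-in pretrained model, or path to a directory containing the model parameters and configuration; used to load the tokenizer. Defaults to `ernie-3.0-base-zh`.
+- `model_name`: Name of a built-in pretrained model; used to load the tokenizer. Defaults to `ernie-3.0-base-zh`.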
Collaborator

Why change model_name_or_path to model_name here? Can't a path be passed in directly?

Contributor Author

@LemonNoel LemonNoel Nov 16, 2022

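Here model_name is only used to load the parameters associated with the pretrained model for data preprocessing. The model_path_prefix above is the actual location of the Prompt model's parameters. Previously both names contained "path", which made them easy to confuse.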
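
The split between the two arguments can be sketched with argparse. This is an illustrative parser, not the actual infer.py: the function name build_parser, the required flag, the help strings, and the default for --data_dir are assumptions. The argument names and the ernie-3.0-base-zh default come from the diff above.

```python
import argparse


def build_parser():
    """Illustrative argument parser; not the actual infer.py from this PR."""
    parser = argparse.ArgumentParser(description="Static graph inference (sketch)")
    # Location of the exported prompt model: path plus file prefix.
    parser.add_argument("--model_path_prefix", type=str, required=True,
                        help="Path and file prefix of the exported static graph model.")
    # Only names a built-in pretrained model, used to load the tokenizer.
    parser.add_argument("--model_name", type=str, default="ernie-3.0-base-zh",
                        help="Built-in pretrained model name used to load the tokenizer.")
    # The default value here is an assumption for the sketch.
    parser.add_argument("--data_dir", type=str, default="./data",
                        help="Directory containing the input data.")
    return parser


args = build_parser().parse_args(["--model_path_prefix", "checkpoints/export/model"])
print(args.model_path_prefix, args.model_name, args.data_dir)
```

With this layout only one argument carries a path to model weights, which is the ambiguity the rename is meant to remove.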

@LemonNoel LemonNoel requested a review from wawltor November 16, 2022 05:22
Collaborator

@wawltor wawltor left a comment

LGTM

@LemonNoel LemonNoel merged commit fa04ecb into PaddlePaddle:develop Nov 16, 2022
@LemonNoel LemonNoel deleted the prompt branch November 24, 2022 04:08