[IPU] add bert-base model for ipu #1793
LGTM. Thanks for the contribution from Graphcore. @guoshengCS FYI
@guoshengCS We can add these examples as a PaddleNLP 2.3 feature.
Are there any suggestions for this PR, or a plan to merge it? @guoshengCS
LGTM
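Great work, thanks for the contribution, and sorry for the delay~ @gglin001
Thanks~~ Next we will test on the new SDK, and some further updates may be added. Please help review them when they arrive 😉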
* add bert-base model for ipu
* use hdf5 dataset
* add enable_engine_caching param
* use HF dataset for squad task
* update readme
PR types
Others
PR changes
Models
Description
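Adds support for running the BERT-base model on IPU, covering two tasks: pretraining and SQuAD.
The model is built as a static graph and trained in half precision. The final performance and accuracy results are as follows: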
All of the code is placed under the examples/language_model/bert/static_ipu/ directory. See README.md for details on each file.
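Main changes:
To get the best training performance on IPU, the model-building part uses a custom modeling.py.
Some IPU custom operators were added for building the model (mainly for performance reasons).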
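The model inputs need a remask operation, and loading the dataset in a single thread makes data loading slow enough to hurt end-to-end throughput. The dataset part therefore uses a custom dataloader that runs the remask operation in multiple processes; see dataset_ipu.py.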
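load_tf_ckpt.py is used to map the BERT pretrained weights released by Google: https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
This PR is mainly meant to evaluate whether the code is suitable for the paddlenlp repository. Reviewers' feedback is very welcome 😊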
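For reviewers who want a picture of the multi-process remask idea mentioned under the main changes, here is a minimal sketch. It is not the code in dataset_ipu.py. It assumes that "remask" means re-drawing the masked-LM positions of each sample, and the names `remask`, `remask_parallel`, `IGNORE_INDEX`, and all default values are hypothetical. It also skips details a real pipeline would need, such as never masking [CLS]/[SEP] and the 80/10/10 replacement rule.

```python
import random
from concurrent.futures import ProcessPoolExecutor

MASK_ID = 103        # id of [MASK] in the uncased BERT vocabulary
IGNORE_INDEX = -1    # hypothetical label value for positions that are not predicted


def remask(job):
    """Re-draw masked-LM positions for one sample (illustrative stand-in)."""
    token_ids, seed, mask_prob = job
    rng = random.Random(seed)  # per-sample seed keeps results reproducible across workers
    n_mask = max(1, round(len(token_ids) * mask_prob))
    positions = rng.sample(range(len(token_ids)), n_mask)
    masked = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    for pos in positions:
        labels[pos] = token_ids[pos]  # the original token becomes the prediction target
        masked[pos] = MASK_ID
    return masked, labels


def remask_parallel(samples, num_workers=4, mask_prob=0.15, base_seed=0):
    """Spread remask over several worker processes so it does not stall the data feed."""
    jobs = [(tokens, base_seed + i, mask_prob) for i, tokens in enumerate(samples)]
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(remask, jobs, chunksize=64))
```

When trying this out, call `remask_parallel(...)` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on platforms that spawn rather than fork.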