Commit
Merge pull request #238 from Anhforth/add_bminf
Add bminf
Showing 42 changed files with 208 additions and 59 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
|
||
# BMInf | ||
|
||
## Overview | ||
|
||
BMInf (Big Model Inference) is a low-resource inference package for large-scale pretrained language models (PLMs). | ||
|
||
At its minimum requirements, BMInf can run models with more than 10 billion parameters on a single NVIDIA GTX 1060 GPU, and better GPUs give better performance. Even when the GPU has enough memory for large-model inference (for example a V100 or A100), BMInf still delivers a significant performance improvement over the existing PyTorch implementation. | ||
|
||
BMInf GitHub repository: https://github.com/OpenBMB/BMInf | ||
|
||
## Application | ||
|
||
After the model has loaded its parameters, convert it with BMInf using the following code: | ||
|
||
```Python | ||
with torch.cuda.device(0): | ||
    model = bminf.wrapper(model, quantization=False, memory_limit=20 << 30) | ||
``` | ||
The `quantization` parameter controls whether model quantization is used; for generative models it must be set to `False`. | ||
|
||
The `memory_limit` parameter sets the maximum amount of memory available to BMInf. The value is given in bytes rather than Mb: `20 << 30` in the example above is 20 GiB. | ||
|
||
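As a quick check on the units, the shift expression can be expanded by hand. This is a minimal sketch that assumes `memory_limit` is a byte count, which is what `20 << 30` implies; the helper name `gib` is hypothetical and not part of BMInf.

```python
def gib(n: int) -> int:
    """Return n gibibytes expressed in bytes (1 GiB = 2**30 bytes)."""
    return n << 30


limit = gib(20)
print(limit)             # 21474836480
print(limit / 1024**3)   # 20.0
```

Read as Mb, the same number would be far larger than the memory of any single GPU, which is why the byte interpretation is assumed here.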
If `bminf.wrapper` does not fit your model well, you can adapt the model manually as follows. | ||
|
||
* Replace `torch.nn.ModuleList` with `bminf.TransformerBlockList`. | ||
```python | ||
module_list = bminf.TransformerBlockList([ | ||
], [CUDA_DEVICE_INDEX]) | ||
``` | ||
|
||
* Replace `torch.nn.Linear` with `bminf.QuantizedLinear`. | ||
```python | ||
linear = bminf.QuantizedLinear(torch.nn.Linear(...)) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
import torch | ||
from flagai.auto_model.auto_loader import AutoLoader | ||
from flagai.model.predictor.predictor import Predictor | ||
import bminf | ||
import time | ||
|
||
|
||
if __name__ == '__main__': | ||
|
||
    # Chinese prompt: "Write out classical poems from memory:", followed by one | ||
    # complete couplet and an unfinished line for the model to continue. | ||
    text = '''默写古诗: | ||
白日依山尽,黄河入海流。 | ||
床前明月光,''' | ||
|
||
    loader = AutoLoader(task_name="lm", | ||
                        model_name="CPM-large-ch", | ||
                        model_dir="./checkpoints", | ||
                        device="cpu") | ||
|
||
    model = loader.get_model() | ||
    time_start = time.time() | ||
    with torch.cuda.device(0): | ||
        model = bminf.wrapper(model, quantization=False, memory_limit=20 << 30) | ||
    tokenizer = loader.get_tokenizer() | ||
|
||
    predictor = Predictor(model=model, | ||
                          tokenizer=tokenizer, | ||
                          ) | ||
|
||
    out = predictor.predict_generate_randomsample(text, | ||
                                                  top_p=0.9, | ||
                                                  out_max_length=50) | ||
    time_end = time.time() | ||
    print('time cost', time_end - time_start, 's') | ||
|
||
    print(out) |
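The script above imports `bminf` unconditionally, so it cannot run where the package is not installed. One possible way to keep the same call while falling back to the plain model is sketched below. `wrap_for_inference` and its `wrapper` argument are hypothetical names introduced for illustration; only the `bminf.wrapper(model, quantization=..., memory_limit=...)` call is taken from this commit.

```python
def wrap_for_inference(model, wrapper=None, quantization=False,
                       memory_limit=20 << 30):
    """Wrap `model` with BMInf when available, else return it unchanged.

    `wrapper` lets callers inject the wrapping function (hypothetical hook);
    by default it is looked up as `bminf.wrapper`, the call used in this commit.
    """
    if wrapper is None:
        try:
            import bminf  # optional dependency
        except ImportError:
            return model  # fall back to the unwrapped model
        wrapper = bminf.wrapper
    return wrapper(model, quantization=quantization, memory_limit=memory_limit)


def fake_wrapper(model, quantization, memory_limit):
    """Stand-in for bminf.wrapper that only records its arguments."""
    return ("wrapped", model, quantization, memory_limit)


print(wrap_for_inference("m", wrapper=fake_wrapper))
# ('wrapped', 'm', False, 21474836480)
```

With the real package, the call would still be made inside `with torch.cuda.device(0):`, as in the script.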
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
|
||
from flagai.model.predictor.predictor import Predictor | ||
from flagai.auto_model.auto_loader import AutoLoader | ||
import torch | ||
import bminf | ||
import time | ||
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") | ||
|
||
|
||
loader = AutoLoader(task_name="lm", | ||
model_name="galactica-6.7b-en", | ||
model_dir="./checkpoints/") | ||
|
||
model = loader.get_model() | ||
with torch.cuda.device(0): | ||
    model = bminf.wrapper(model, quantization=False, memory_limit=20 << 30) | ||
model.to(device) | ||
model.eval() | ||
tokenizer = loader.get_tokenizer() | ||
predictor = Predictor(model, tokenizer) | ||
print("model loaded") | ||
time_start=time.time() | ||
|
||
text = "Please write an abstract about computer vision. \n" | ||
out = predictor.predict_generate_randomsample(text, | ||
                                              out_max_length=700, | ||
                                              top_k=50, | ||
                                              repetition_penalty=1.2, | ||
                                              temperature=0.7 | ||
                                              ) | ||
|
||
time_end=time.time() | ||
print('time cost',time_end-time_start,'s') | ||
print(out) | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# Copyright © 2022 BAAI. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License") | ||
import torch | ||
from flagai.auto_model.auto_loader import AutoLoader | ||
from flagai.model.predictor.predictor import Predictor | ||
import bminf | ||
import time | ||
|
||
if __name__ == '__main__': | ||
|
||
    loader = AutoLoader("seq2seq", | ||
                        "GPT2-base-ch", | ||
                        model_dir="./checkpoints/") | ||
    model = loader.get_model() | ||
    model = model.to('cpu') | ||
    tokenizer = loader.get_tokenizer() | ||
    time_start = time.time() | ||
    with torch.cuda.device(0): | ||
        model = bminf.wrapper(model, quantization=False, memory_limit=20 << 30) | ||
    predictor = Predictor(model, tokenizer) | ||
|
||
    # Chinese prompt: "The weather is nice today" | ||
    text = "今天天气不错" | ||
|
||
    out_2 = predictor.predict_generate_randomsample(text, | ||
                                                    input_max_length=512, | ||
                                                    out_max_length=100, | ||
                                                    repetition_penalty=1.5, | ||
                                                    top_k=20, | ||
                                                    top_p=0.8) | ||
|
||
    time_end = time.time() | ||
    print('time cost', time_end - time_start, 's') | ||
    # print(f"out_1 is {out_1}") | ||
    print(f"out_2 is {out_2}") |
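All three example scripts in this commit repeat the same `time.time()` start/stop pattern, and they do not time the same span: two start the clock before `bminf.wrapper`, one starts it after the model is loaded. A small context manager is one way to make the timed region explicit. This is a sketch only; `timed` is a hypothetical helper, and `time.perf_counter` is swapped in for `time.time` because it is the standard-library clock intended for measuring intervals.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label="time cost"):
    """Measure the wall-clock time of the enclosed block and print it."""
    result = {}
    start = time.perf_counter()
    try:
        yield result
    finally:
        result["seconds"] = time.perf_counter() - start
        print(label, result["seconds"], "s")


with timed() as t:
    # Stand-in workload; in the scripts this would be the
    # predictor.predict_generate_randomsample(...) call.
    total = sum(range(100_000))
```

Wrapping only the generation call, or the wrap plus generation, then becomes a visible choice in each script.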
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,6 +22,4 @@ | |
repetition_penalty=1.2, | ||
temperature=0.7 | ||
) | ||
print(out) | ||
|
||
|
||
print(out) |