Add ERNIE-ViLG models into Pipelines #3512
Merged
7 commits
2086b45
Add ERNIE-ViLG models into Pipelines
w5688414 5b8d7dd
Add more comments
w5688414 ff47b75
Add missing text_to_image_generation.yaml
w5688414 bf94d11
Rename the files
w5688414 9d3027c
Add output_dir parameters& Update README.md
w5688414 3685147
Update README.md
w5688414 5cf3828
Merge branch 'develop' into pip31
w5688414
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
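@@ -0,0 +1,114 @@
# ERNIE-ViLG Text-to-Image System

## 1. Overview

ERNIE-ViLG is a knowledge-enhanced, cross-modal large model for text-image generation. It combines the text-to-image and image-to-text tasks in a single model that is trained end to end, which aligns the semantics of text and images across modalities. It supports content creation and gives every user access to a low-barrier creative platform. For more details, see the official [ernieVilg](https://wenxin.baidu.com/moduleApi/ernieVilg) page.

## 2. Features

This project provides a low-cost way to build an end-to-end text-to-image system. After a few simple parameter settings, users can enter prompts to generate paintings in a variety of styles. In addition, Pipelines provides a web-based service, so users can set up the text-to-image system locally.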

## 3. Quick Start: Build a Text-to-Image System

### 3.1 Environment and Installation

The experiments here were run in the environment described below; you can also use your own environment:

a. Software:
- python >= 3.7.0
- paddlenlp >= 2.4.0
- paddlepaddle-gpu >= 2.3
- CUDA Version: 10.2
- NVIDIA Driver Version: 440.64.00
- Ubuntu 16.04.6 LTS (Docker)

b. Hardware:

- NVIDIA Tesla V100 16GB x 4 GPUs
- Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

c. Installing dependencies:
First install PaddlePaddle by following the [official installation guide](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html), then install the dependencies below:
```bash
# One-step install with pip
pip install --upgrade paddle-pipelines -i https://pypi.tuna.tsinghua.edu.cn/simple
# Or install the latest version from source
cd ${HOME}/PaddleNLP/pipelines/
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
python setup.py install
```
[Note] All of the following steps are run from the `pipelines` root directory, so there is no need to change directories. The text-to-image system also requires network access, so run it in an environment with an Internet connection.
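
The minimum versions listed in 3.1 can be checked before installation. The snippet below is only a sketch of such a check: the helper names `version_tuple` and `meets_minimum` are hypothetical and not part of Pipelines, and pre-release suffixes such as `rc1` are ignored, so `2.4.0rc1` is treated like `2.4.0`.

```python
import re
import sys
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading integer components of a version string.

    '2.4.0' -> (2, 4, 0); parsing stops at the first non-numeric piece,
    so '2.3.2.post1' -> (2, 3, 2).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions component by component, padding with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # The README asks for python >= 3.7.0.
    current = ".".join(str(n) for n in sys.version_info[:3])
    print("python", current, "ok:", meets_minimum(current, "3.7.0"))
```

The same helper could be applied to the installed `paddlenlp` and `paddlepaddle-gpu` version strings, however you choose to obtain them.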

### 3.2 Try the Text-to-Image System with One Command

Before running the commands below, apply for an `API Key` and a `Secret Key` on the [ERNIE-ViLG website](https://wenxin.baidu.com/moduleApi/ernieVilg). You need to log in and then click "View AK/SK" in the upper-right corner, as shown in the figure below.

<div align="center">
<img src="https://user-images.githubusercontent.com/12107462/196942735-06953270-ce1e-45a5-9e0d-5841068a8464.png" width="500">
</div>

#### 3.2.1 Quick Start

You can try out the text-to-image system with the following command:
```bash
python examples/text_to_image/text_to_image_example.py --prompt_text 宁静的小镇 \
    --style 古风 \
    --topk 5 \
    --api_key YOUR_API_KEY \
    --secret_key YOUR_SECRET_KEY \
    --output_dir ernievilg_output
```
After about a minute the results are ready; the generated images are in your output directory `output_dir`.
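
To avoid typing the keys on the command line, the same invocation can be assembled from environment variables. This is a sketch only: the variable names `ERNIE_VILG_API_KEY` and `ERNIE_VILG_SECRET_KEY` and the helper `build_command` are hypothetical, while the script path and flags are the ones shown above.

```python
import os
import shlex
from typing import List, Mapping, Optional

# Script path taken from the quick-start command in this README.
EXAMPLE_SCRIPT = "examples/text_to_image/text_to_image_example.py"


def build_command(prompt_text: str,
                  style: str = "古风",
                  topk: int = 5,
                  output_dir: str = "ernievilg_output",
                  env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Build the argv list for the example script.

    The credentials are read from (hypothetical) environment variables
    so they do not have to appear in shell history.
    """
    env = os.environ if env is None else env
    try:
        api_key = env["ERNIE_VILG_API_KEY"]
        secret_key = env["ERNIE_VILG_SECRET_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None
    return [
        "python", EXAMPLE_SCRIPT,
        "--prompt_text", prompt_text,
        "--style", style,
        "--topk", str(topk),
        "--api_key", api_key,
        "--secret_key", secret_key,
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    demo_env = {"ERNIE_VILG_API_KEY": "YOUR_API_KEY",
                "ERNIE_VILG_SECRET_KEY": "YOUR_SECRET_KEY"}
    command = build_command("宁静的小镇", env=demo_env)
    # Print the shell-quoted command; pass `command` to subprocess.run to execute it.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Returning a list rather than a single string keeps the prompt text intact when it contains spaces, since no shell parsing is involved if the list is handed to `subprocess.run`.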

### 3.3 Build a Web-Based Text-to-Image System

The web-based text-to-image system has two main components: (1) a model service built on a RESTful API, and (2) a WebUI built with Gradio. The following sections set up these two services in turn to form the complete visual text-to-image system.

#### 3.3.1 Start the REST API Model Service

Before starting, add the `API Key` and `Secret Key` you obtained to the `ak` and `sk` fields in `text_to_image.yaml`, then run:

```bash
export PIPELINE_YAML_PATH=rest_api/pipeline/text_to_image.yaml
# Start the model service on port 8891
python rest_api/application.py 8891
```
Linux users are advised to start the service with the shell script:

```bash
sh examples/text_to_image/run_text_to_image.sh
```
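
Before starting the WebUI it can help to confirm that the model service is accepting connections on port 8891. The following is a minimal sketch using only the standard library; the helper names `is_port_open` and `wait_for_port` are hypothetical, and the check only verifies that a TCP connection can be opened, not that the pipeline itself loaded correctly.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, retries: int = 30, delay: float = 1.0) -> bool:
    """Poll the port until it accepts connections or the retries run out."""
    for _ in range(retries):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    # 8891 is the port used for the model service in this README.
    print("model service reachable:", is_port_open("127.0.0.1", 8891))
```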

#### 3.3.2 Start the WebUI

The WebUI uses the [Gradio front end](https://gradio.app/). First install gradio with the following command:
```bash
pip install gradio
```
Then start the WebUI with the following commands:
```bash
# Configure the model service address
export API_ENDPOINT=http://127.0.0.1:8891
# Start the WebUI on port 8502
python ui/webapp_text_to_image.py --serving_port 8502
```
Linux users are advised to start the service with the shell script:

```bash
sh examples/text_to_image/run_text_to_image_web.sh
```

You can now open http://127.0.0.1:8502 in a browser to try the text-to-image service.

If you run into installation problems, see the [FAQ](../../FAQ.md).

## Acknowledgements

We learned from the excellent framework design of Deepset.ai [Haystack](https://github.com/deepset-ai/haystack), and we would like to thank the authors of [Haystack](https://github.com/deepset-ai/haystack) and their open source community.
@@ -0,0 +1,19 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Specify the YAML configuration file for text-to-image generation
unset http_proxy && unset https_proxy
export PIPELINE_YAML_PATH=rest_api/pipeline/text_to_image.yaml
# Start the model service on port 8891
python rest_api/application.py 8891
@@ -0,0 +1,18 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Configure the model service address
export API_ENDPOINT=http://127.0.0.1:8891
# Start the WebUI on port 8502
python ui/webapp_text_to_image.py --serving_port 8502
@@ -0,0 +1,53 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse

from pipelines.nodes import ErnieTextToImageGenerator
from pipelines import TextToImagePipeline

# yapf: disable
parser = argparse.ArgumentParser()
parser.add_argument("--api_key", default=None, type=str, help="The API Key.")
parser.add_argument("--secret_key", default=None, type=str, help="The secret key.")
parser.add_argument("--prompt_text", default='宁静的小镇', type=str, help="The prompt_text.")
parser.add_argument("--output_dir", default='ernievilg_output', type=str, help="The output path.")
parser.add_argument("--style", default='探索无限', type=str, help="The style text.")
parser.add_argument("--size", default='1024*1024',
                    choices=['1024*1024', '1024*1536', '1536*1024'], help="Size of the generation images")
parser.add_argument("--topk", default=5, type=int, help="The top k images.")
args = parser.parse_args()
# yapf: enable


def text_to_image():
    # Build the generator node with the ERNIE-ViLG credentials
    ernie_image_generator = ErnieTextToImageGenerator(ak=args.api_key,
                                                      sk=args.secret_key)
    pipe = TextToImagePipeline(ernie_image_generator)
    prediction = pipe.run(query=args.prompt_text,
                          params={
                              "TextToImageGenerator": {
                                  "topk": args.topk,
                                  "style": args.style,
                                  "resolution": args.size,
                                  "output_dir": args.output_dir
                              }
                          })
    # Save the pipeline configuration to a YAML file
    pipe.save_to_yaml('text_to_image.yaml')


if __name__ == "__main__":
    text_to_image()
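
The `--size` option above accepts one of three fixed strings. If the dimensions are needed as integers elsewhere, they could be split out as in the sketch below. The helper `parse_size` is hypothetical and not part of this PR, and reading the first number as the width is an assumption, since the diff does not say which dimension comes first.

```python
from typing import Tuple

# The three choices accepted by --size in the example script.
ALLOWED_SIZES = ("1024*1024", "1024*1536", "1536*1024")


def parse_size(size: str) -> Tuple[int, int]:
    """Split a size string such as '1024*1536' into two integers.

    Assumption: the first number is the width and the second the height.
    """
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}, got {size!r}")
    first, second = (int(part) for part in size.split("*"))
    return first, second


if __name__ == "__main__":
    print(parse_size("1024*1536"))
```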
15 changes: 15 additions & 0 deletions
pipelines/pipelines/nodes/text_to_image_generator/__init__.py
@@ -0,0 +1,15 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pipelines.nodes.text_to_image_generator.text_to_image_generator import ErnieTextToImageGenerator
Later, let's look at whether gradio should be added to the requirements.
If all future development is based on Gradio, I think that would be fine.