Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ERNIE-ViLG models into Pipelines #3512

Merged
merged 7 commits into from
Oct 21, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 114 additions & 0 deletions pipelines/examples/text_to_image/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
# ERNIE-ViLG 文生图系统

## 1. 场景概述

ERNIE-ViLG是一个知识增强跨模态图文生成大模型,将文生成图和图生成文任务融合到同一个模型进行端到端的学习,从而实现文本和图像的跨模态语义对齐。可以支持用户进行内容创作,让每个用户都能够体验到一个低门槛的创作平台。更多详细信息请参考官网的介绍[ernieVilg](https://wenxin.baidu.com/moduleApi/ernieVilg)


## 2. 产品功能介绍

本项目提供了低成本搭建端到端文生图的能力。用户需要进行简单的参数配置,然后输入prompts就可以生成各种风格的画作,另外,Pipelines提供了 Web 化产品服务,让用户在本地端就能搭建起来文生图系统。


## 3. 快速开始: 快速搭建文生图系统


### 3.1 运行环境和安装说明

本实验采用了以下的运行环境进行,详细说明如下,用户也可以在自己的环境进行:

a. 软件环境:
- python >= 3.7.0
- paddlenlp >= 2.4.0
- paddlepaddle-gpu >=2.3
- CUDA Version: 10.2
- NVIDIA Driver Version: 440.64.00
- Ubuntu 16.04.6 LTS (Docker)

b. 硬件环境:

- NVIDIA Tesla V100 16GB x4卡
- Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

c. 依赖安装:
首先需要安装PaddlePaddle,PaddlePaddle的安装请参考文档[官方安装文档](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html),然后安装下面的依赖:
```bash
# pip 一键安装
pip install --upgrade paddle-pipelines -i https://pypi.tuna.tsinghua.edu.cn/simple
# 或者源码进行安装最新版本
cd ${HOME}/PaddleNLP/pipelines/
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
python setup.py install
```
【注意】以下的所有的流程都只需要在`pipelines`根目录下进行,不需要跳转目录;另外,文生图系统需要联网,用户需要在有网的环境下进行。


### 3.2 一键体验文生图系统

在运行下面的命令之前,需要在[ERNIE-ViLG官网](https://wenxin.baidu.com/moduleApi/ernieVilg)申请`API Key`和 `Secret key`两个密钥(需要登录,登录后点击右上角的查看AK/SK,具体如下图),然后执行下面的命令。

<div align="center">
<img src="https://user-images.githubusercontent.com/12107462/196942735-06953270-ce1e-45a5-9e0d-5841068a8464.png" width="500">
</div>


#### 3.2.1 快速一键启动

您可以通过如下命令快速体验文生图系统的效果
```bash
python examples/text_to_image/text_to_image_example.py --prompt_text 宁静的小镇 \
--style 古风 \
--topk 5 \
--api_key 你申请的apikey \
--secret_key 你申请的secretkey \
--output_dir ernievilg_output
```
大概运行一分钟后就可以得到结果了,生成的图片请查看您的输出目录`output_dir`。

### 3.3 构建 Web 可视化文生图系统

整个 Web 可视化文生图系统主要包含 2 大组件: 1. 基于 RestfulAPI 构建模型服务 2. 基于 Gradio 构建 WebUI,接下来我们依次搭建这 2 个服务并最终形成可视化的文生图系统。

#### 3.3.1 启动 RestAPI 模型服务

启动之前,需要把您申请的`API Key`和 `Secret key`两个密钥添加到`text_to_image.yaml`的ak和sk的位置,然后运行:

```bash
export PIPELINE_YAML_PATH=rest_api/pipeline/text_to_image.yaml
# 使用端口号 8891 启动模型服务
python rest_api/application.py 8891
```
Linux 用户推荐采用 Shell 脚本来启动服务::

```bash
sh examples/text_to_image/run_text_to_image.sh
```

#### 3.3.2 启动 WebUI

WebUI使用了[gradio前端](https://gradio.app/),首先需要安装gradio,运行命令如下:
```
pip install gradio
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

后续看看gradio要不要放到requirments里面

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果以后都是基于Gradio开发,我觉得可以

```
然后使用如下的命令启动:
```bash
# 配置模型服务地址
export API_ENDPOINT=http://127.0.0.1:8891
# 在指定端口 8502 启动 WebUI
python ui/webapp_text_to_image.py --serving_port 8502
```
Linux 用户推荐采用 Shell 脚本来启动服务::

```bash
sh examples/text_to_image/run_text_to_image_web.sh
```

到这里您就可以打开浏览器访问 http://127.0.0.1:8502 地址体验文生图系统服务了。

如果安装遇见问题可以查看[FAQ文档](../../FAQ.md)

## Acknowledge

我们借鉴了 Deepset.ai [Haystack](https://github.com/deepset-ai/haystack) 优秀的框架设计,在此对[Haystack](https://github.com/deepset-ai/haystack)作者及其开源社区表示感谢。

We learn form the excellent framework design of Deepset.ai [Haystack](https://github.com/deepset-ai/haystack), and we would like to express our thanks to the authors of Haystack and their open source community.
19 changes: 19 additions & 0 deletions pipelines/examples/text_to_image/run_text_to_image.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# 指定文生图的Yaml配置文件
unset http_proxy && unset https_proxy
export PIPELINE_YAML_PATH=rest_api/pipeline/text_to_image.yaml
# 使用端口号 8891 启动模型服务
python rest_api/application.py 8891
18 changes: 18 additions & 0 deletions pipelines/examples/text_to_image/run_text_to_image_web.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# 配置模型服务地址
export API_ENDPOINT=http://127.0.0.1:8891
# 在指定端口 8502 启动 WebUI
python ui/webapp_text_to_image.py --serving_port 8502
53 changes: 53 additions & 0 deletions pipelines/examples/text_to_image/text_to_image_example.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import argparse

import paddle
from pipelines.nodes import ErnieTextToImageGenerator
from pipelines import TextToImagePipeline

# yapf: disable
parser = argparse.ArgumentParser()
parser.add_argument("--api_key", default=None, type=str, help="The API Key.")
parser.add_argument("--secret_key", default=None, type=str, help="The secret key.")
parser.add_argument("--prompt_text", default='宁静的小镇', type=str, help="The prompt_text.")
parser.add_argument("--output_dir", default='ernievilg_output', type=str, help="The output path.")
parser.add_argument("--style", default='探索无限', type=str, help="The style text.")
parser.add_argument("--size", default='1024*1024',
choices=['1024*1024', '1024*1536', '1536*1024'], help="Size of the generation images")
parser.add_argument("--topk", default=5, type=int, help="The top k images.")
args = parser.parse_args()
# yapf: enable


def text_to_image():
erine_image_generator = ErnieTextToImageGenerator(ak=args.api_key,
sk=args.secret_key)
pipe = TextToImagePipeline(erine_image_generator)
prediction = pipe.run(query=args.prompt_text,
params={
"TextToImageGenerator": {
"topk": args.topk,
"style": args.style,
"resolution": args.size,
"output_dir": args.output_dir
}
})
pipe.save_to_yaml('text_to_image.yaml')


if __name__ == "__main__":
text_to_image()
3 changes: 2 additions & 1 deletion pipelines/pipelines/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,8 @@
from pipelines.pipelines import Pipeline
from pipelines.pipelines.standard_pipelines import (BaseStandardPipeline,
ExtractiveQAPipeline,
SemanticSearchPipeline)
SemanticSearchPipeline,
TextToImagePipeline)

import pandas as pd

Expand Down
1 change: 1 addition & 0 deletions pipelines/pipelines/nodes/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,3 +29,4 @@
from pipelines.nodes.ranker import BaseRanker, ErnieRanker
from pipelines.nodes.reader import BaseReader, ErnieReader
from pipelines.nodes.retriever import BaseRetriever, DensePassageRetriever
from pipelines.nodes.text_to_image_generator import ErnieTextToImageGenerator
15 changes: 15 additions & 0 deletions pipelines/pipelines/nodes/text_to_image_generator/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pipelines.nodes.text_to_image_generator.text_to_image_generator import ErnieTextToImageGenerator
Loading