[PPDiffusers] ppdiffuser LDM weight to original LDM weight script #3809

Merged: 8 commits, Nov 19, 2022
13 changes: 9 additions & 4 deletions ppdiffusers/examples/text_to_image_laion400m/README.md
@@ -2,18 +2,23 @@

This tutorial walks through training a 32-layer **Latent Diffusion Model** (with support for switching between the `Chinese` and `English` tokenizers).

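___Note___:
___The official 32-layer `CompVis/ldm-text2im-large-256` Latent Diffusion Model uses a vae, not a vqvae! When the Hugging Face team designed the directory layout, they mistakenly named the folder vqvae. To stay consistent with the Hugging Face team, we use the vqvae folder name as well!___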

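## 1 Running locally
### 1.1 Install dependencies

Before running this training code, we need to install the training dependencies below.

___Note___:
___This part of the code currently requires the develop branch of paddlenlp and the develop branch of ppdiffusers to run correctly!___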

```bash
# Install the develop build of paddle for cuda11.2 and python 3.7, at commit b96a21df4e7a42b2445104426e2be407534705e6.
wget https://paddlenlp.bj.bcebos.com/models/community/CompVis/paddlepaddle_gpu-0.0.0.post112-cp37-cp37m-linux_x86_64.whl
pip install paddlepaddle_gpu-0.0.0.post112-cp37-cp37m-linux_x86_64.whl
# Install the pinned versions of paddlenlp and ppdiffusers.
pip install paddlenlp==2.4.2 ppdiffusers==0.6.2
pip install -U visualdl fastcore Pillow
# Note: this training currently requires the develop branch of paddlenlp and the develop branch of ppdiffusers.
pip install -U paddlenlp ppdiffusers visualdl fastcore Pillow
```

### 1.2 Prepare data
@@ -239,7 +244,7 @@ python generate_pipelines.py \
```shell
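├── ldm_pipelines # the output path we specified
    ├── model_index.json # model index file
    ├── vqvae # vae weights folder
    ├── vqvae # vae weights folder! It actually holds a vae model; the folder name is kept consistent with HF!
        ├── model_state.pdparams
        ├── config.json
    ├── bert # ldmbert weights folder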
```
2 changes: 1 addition & 1 deletion ppdiffusers/examples/text_to_image_laion400m/ldm/model.py
@@ -49,7 +49,7 @@ def __init__(self, model_args):

         # init vae
         vae_name_or_path = model_args.vae_name_or_path if model_args.pretrained_model_name_or_path is None else os.path.join(
-            model_args.pretrained_model_name_or_path, "vae")
+            model_args.pretrained_model_name_or_path, "vqvae")
         self.vae = AutoencoderKL.from_pretrained(vae_name_or_path)
         freeze_params(self.vae.parameters())
         logger.info("Freeze vae parameters!")
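
The path selection in this hunk can be illustrated in isolation. The following is a minimal sketch, not the repository's code: `resolve_vae_path` is a hypothetical helper name, it takes the two values directly instead of a `model_args` object, and `my_vae_dir` is a made-up directory. It shows that when `pretrained_model_name_or_path` is set, the VAE weights are read from its `vqvae` subfolder (the folder name kept for consistency with Hugging Face, even though the model is a VAE); otherwise `vae_name_or_path` is used unchanged.

```python
import os
from typing import Optional


def resolve_vae_path(pretrained_model_name_or_path: Optional[str],
                     vae_name_or_path: Optional[str]) -> Optional[str]:
    """Hypothetical helper mirroring the selection logic in the hunk above."""
    if pretrained_model_name_or_path is None:
        # No pipeline directory was given: fall back to the standalone VAE path.
        return vae_name_or_path
    # The subfolder is named "vqvae" although it holds AutoencoderKL (VAE) weights.
    return os.path.join(pretrained_model_name_or_path, "vqvae")


# A pipeline directory takes precedence over a standalone VAE path.
print(resolve_vae_path("ldm_pipelines", None))
print(resolve_vae_path(None, "my_vae_dir"))
```

With the old `"vae"` subfolder name, the first call would point at a directory that the pipeline layout in this PR does not contain, which appears to be the mismatch this one-line change addresses.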
45 changes: 42 additions & 3 deletions ppdiffusers/examples/text_to_image_laion400m/scripts/README.md
@@ -1,6 +1,10 @@
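# Converting original LDM PyTorch weights to PPDiffusers weights
# LDM weight conversion scripts
This directory contains two scripts:
- **convert_orig_ldm_ckpt_to_ppdiffusers.py**: converts the original LDM PyTorch weights to PPDiffusers LDM weights.
- **convert_ppdiffusers_to_orig_ldm_ckpt.py**: converts PPDiffusers LDM weights to the original LDM weights.

## 1. Convert the weights
## 1. Converting original LDM PyTorch weights to PPDiffusers LDM weights
### 1.1 Convert the weights
Assume you already have the original weights `"ldm_1p4b_init0.ckpt"`.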
```bash
python convert_orig_ldm_ckpt_to_ppdiffusers.py \
@@ -9,7 +13,7 @@ python convert_orig_ldm_ckpt_to_ppdiffusers.py \
--original_config_file text2img_L32H1280_unet800M.yaml
```

## 2. Inference
### 1.2 Inference
```python
import paddle
from ppdiffusers import LDMTextToImagePipeline
@@ -19,3 +23,38 @@ prompt = "a blue tshirt"
image = pipe(prompt, guidance_scale=7.5)[0][0]
image.save("demo.jpg")
```

## Converting PPDiffusers LDM weights to original LDM weights
### 1.1 Convert the weights
Assume we have already generated the `ldm_pipelines` directory with `../generate_pipelines.py`.
```shell
├── ldm_pipelines # the output path we specified
    ├── model_index.json # model index file
    ├── vqvae # vae weights folder! It actually holds a vae model; the folder name is kept consistent with HF!
        ├── model_state.pdparams
        ├── config.json
    ├── bert # ldmbert weights folder
        ├── model_config.json
        ├── model_state.pdparams
    ├── unet # unet weights folder
        ├── model_state.pdparams
        ├── config.json
    ├── scheduler # ddim scheduler folder
        ├── scheduler_config.json
    ├── tokenizer # bert tokenizer folder
        ├── tokenizer_config.json
        ├── special_tokens_map.json
        ├── vocab.txt
```
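
Before running the conversion, it may help to confirm that the directory really has the layout listed above. The following is a minimal sketch under stated assumptions: `missing_pipeline_files` and `EXPECTED_FILES` are hypothetical names introduced here for illustration, not part of PPDiffusers, and the file list is copied from the tree above rather than from what the conversion script actually reads.

```python
from pathlib import Path
from typing import List

# Relative paths copied from the directory tree listed above (an assumption
# about what a complete pipeline directory contains).
EXPECTED_FILES = [
    "model_index.json",
    "vqvae/model_state.pdparams",
    "vqvae/config.json",
    "bert/model_config.json",
    "bert/model_state.pdparams",
    "unet/model_state.pdparams",
    "unet/config.json",
    "scheduler/scheduler_config.json",
    "tokenizer/tokenizer_config.json",
    "tokenizer/special_tokens_map.json",
    "tokenizer/vocab.txt",
]


def missing_pipeline_files(pipeline_dir: str) -> List[str]:
    """Return the expected files that are absent from pipeline_dir."""
    root = Path(pipeline_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_pipeline_files("./ldm_pipelines")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

An empty result only means the files exist; it says nothing about whether the weights inside them are valid.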

```bash
python convert_ppdiffusers_to_orig_ldm_ckpt.py \
--model_name_or_path ./ldm_pipelines \
--dump_path ldm_19w.ckpt
```

### 1.2 推理预测
Generate images with the original `CompVis` [txt2img.py](https://github.com/CompVis/latent-diffusion/blob/main/scripts/txt2img.py) script.
```shell
python ./txt2img.py --prompt "a blue t shirt" --ddim_eta 0.0 --n_samples 1 --n_iter 1 --scale 7.5 --ddim_steps 50
```
@@ -1,5 +1,6 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
# Copyright 2022 The HuggingFace Inc. team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at