PaddlePaddle · JunnYu · Nov 19, 2022 · Nov 18, 2022
diff --git a/ppdiffusers/examples/text_to_image_laion400m/README.md b/ppdiffusers/examples/text_to_image_laion400m/README.md
@@ -2,18 +2,23 @@
 
 This tutorial shows how to start training the 32-layer **Latent Diffusion Model** (with support for switching between the `Chinese` and `English` tokenizers).
 
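+___Note___:
+___The official 32-layer `CompVis/ldm-text2im-large-256` Latent Diffusion Model uses a VAE, not a VQ-VAE! When designing the directory layout, the Hugging Face team mistakenly named the folder vqvae. To stay consistent with Hugging Face, we use the vqvae folder name as well!___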
+
 ## 1 Running locally
 ### 1.1 Install dependencies
 
 Before running the training code, install the following training dependencies.
 
+___Note___:
+___This code currently requires the develop branch of both paddlenlp and ppdiffusers to run correctly!___
+
 ```bash
 # Install the develop build of paddle for CUDA 11.2 and Python 3.7, at commit b96a21df4e7a42b2445104426e2be407534705e6.
 wget https://paddlenlp.bj.bcebos.com/models/community/CompVis/paddlepaddle_gpu-0.0.0.post112-cp37-cp37m-linux_x86_64.whl
 pip install paddlepaddle_gpu-0.0.0.post112-cp37-cp37m-linux_x86_64.whl
-# Install the pinned versions of paddlenlp and ppdiffusers.
-pip install paddlenlp==2.4.2 ppdiffusers==0.6.2
-pip install -U visualdl fastcore Pillow
+# Note: this training code currently requires the develop branches of paddlenlp and ppdiffusers.
+pip install -U paddlenlp ppdiffusers visualdl fastcore Pillow
 ```
 
 ### 1.2 Prepare the data
@@ -239,7 +244,7 @@ python generate_pipelines.py \
 ```shell
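 ├── ldm_pipelines  # the output path we specified
     ├── model_index.json # model index file
-    ├── vqvae # VAE weights folder
+    ├── vqvae # VAE weights folder! This is actually a VAE model; the folder name is kept consistent with HF.
         ├── model_state.pdparams
         ├── config.json
     ├── bert # ldmbert weights folder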

diff --git a/ppdiffusers/examples/text_to_image_laion400m/ldm/model.py b/ppdiffusers/examples/text_to_image_laion400m/ldm/model.py
@@ -49,7 +49,7 @@ def __init__(self, model_args):
 
         # init vae
         vae_name_or_path = model_args.vae_name_or_path if model_args.pretrained_model_name_or_path is None else os.path.join(
-            model_args.pretrained_model_name_or_path, "vae")
+            model_args.pretrained_model_name_or_path, "vqvae")
         self.vae = AutoencoderKL.from_pretrained(vae_name_or_path)
         freeze_params(self.vae.parameters())
         logger.info("Freeze vae parameters!")
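
The hunk above changes the subfolder used to locate the VAE weights from `vae` to `vqvae`. A minimal sketch of that path resolution in isolation, assuming a hypothetical `resolve_vae_path` helper and a `SimpleNamespace` stand-in for `model_args` (neither is part of ppdiffusers; the paths are placeholders):

```python
import os
from types import SimpleNamespace


def resolve_vae_path(model_args, subfolder="vqvae"):
    """Hypothetical helper mirroring the conditional expression in the hunk.

    If a full pipeline directory is given, the VAE weights are looked up in
    its `vqvae` subfolder; otherwise the standalone VAE path is used.
    """
    if model_args.pretrained_model_name_or_path is None:
        return model_args.vae_name_or_path
    return os.path.join(model_args.pretrained_model_name_or_path, subfolder)


# Stand-ins for the real training arguments (assumed values).
args_with_pipeline = SimpleNamespace(
    vae_name_or_path="some/vae",
    pretrained_model_name_or_path="./ldm_pipelines",
)
args_without_pipeline = SimpleNamespace(
    vae_name_or_path="some/vae",
    pretrained_model_name_or_path=None,
)

print(resolve_vae_path(args_with_pipeline))
print(resolve_vae_path(args_without_pipeline))
```

The default `subfolder="vqvae"` reflects the folder name used in the generated pipeline directory, even though the weights loaded from it are an `AutoencoderKL` model.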

diff --git a/ppdiffusers/examples/text_to_image_laion400m/scripts/README.md b/ppdiffusers/examples/text_to_image_laion400m/scripts/README.md
@@ -1,6 +1,10 @@
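-# Converting original LDM PyTorch weights to PPDiffusers weights
+# LDM weight conversion scripts
+This directory contains two scripts:
+- **convert_orig_ldm_ckpt_to_ppdiffusers.py**: converts the original LDM PyTorch weights to PPDiffusers LDM weights.
+- **convert_ppdiffusers_to_orig_ldm_ckpt.py**: converts PPDiffusers LDM weights to the original LDM weights.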
 
-## 1. Convert the weights
+## 1. Converting original LDM PyTorch weights to PPDiffusers LDM weights
+### 1.1 Convert the weights
 Assume you already have the original weights `"ldm_1p4b_init0.ckpt"`.
 ```bash
 python convert_orig_ldm_ckpt_to_ppdiffusers.py \
@@ -9,7 +13,7 @@ python convert_orig_ldm_ckpt_to_ppdiffusers.py \
     --original_config_file text2img_L32H1280_unet800M.yaml
 ```
 
-## 2. Inference
+### 1.2 Inference
 ```python
 import paddle
 from ppdiffusers import LDMTextToImagePipeline
@@ -19,3 +23,38 @@ prompt = "a blue tshirt"
 image = pipe(prompt, guidance_scale=7.5)[0][0]
 image.save("demo.jpg")
 ```
+
+## Converting PPDiffusers LDM weights to original LDM weights
+### 1.1 Convert the weights
+Assume we have already generated the `ldm_pipelines` directory with `../generate_pipelines.py`.
+```shell
+├── ldm_pipelines  # the output path we specified
+    ├── model_index.json # model index file
+    ├── vqvae # VAE weights folder! This is actually a VAE model; the folder name is kept consistent with HF.
+        ├── model_state.pdparams
+        ├── config.json
+    ├── bert # ldmbert weights folder
+        ├── model_config.json
+        ├── model_state.pdparams
+    ├── unet # unet weights folder
+        ├── model_state.pdparams
+        ├── config.json
+    ├── scheduler # ddim scheduler folder
+        ├── scheduler_config.json
+    ├── tokenizer # bert tokenizer folder
+        ├── tokenizer_config.json
+        ├── special_tokens_map.json
+        ├── vocab.txt
+```
+
+```bash
+python convert_ppdiffusers_to_orig_ldm_ckpt.py \
+    --model_name_or_path ./ldm_pipelines \
+    --dump_path ldm_19w.ckpt
+```
+
+### 1.2 Inference
+Generate images with the `CompVis` [original txt2img.py](https://github.com/CompVis/latent-diffusion/blob/main/scripts/txt2img.py) script.
+```shell
+python ./txt2img.py --prompt "a blue t shirt" --ddim_eta 0.0 --n_samples 1 --n_iter 1 --scale 7.5  --ddim_steps 50
+```
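
Before running `convert_ppdiffusers_to_orig_ldm_ckpt.py`, it may help to confirm that the pipeline directory matches the tree listed in the README hunk above. A minimal sketch, assuming exactly the file names shown in that tree; `EXPECTED_FILES` and `missing_pipeline_files` are hypothetical names and are not part of the conversion script:

```python
import os
import tempfile

# Relative paths taken from the directory tree in the README hunk.
EXPECTED_FILES = [
    "model_index.json",
    "vqvae/model_state.pdparams",
    "vqvae/config.json",
    "bert/model_config.json",
    "bert/model_state.pdparams",
    "unet/model_state.pdparams",
    "unet/config.json",
    "scheduler/scheduler_config.json",
    "tokenizer/tokenizer_config.json",
    "tokenizer/special_tokens_map.json",
    "tokenizer/vocab.txt",
]


def missing_pipeline_files(root):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    return [
        rel for rel in EXPECTED_FILES
        if not os.path.isfile(os.path.join(root, *rel.split("/")))
    ]


# Demo on a throwaway directory that contains only model_index.json.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "vqvae"))
    open(os.path.join(root, "model_index.json"), "w").close()
    demo_missing = missing_pipeline_files(root)

print(demo_missing)
```

In practice the function would be called with the real directory, for example `missing_pipeline_files("./ldm_pipelines")`, and an empty list would suggest the layout is complete.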
diff --git a/ppdiffusers/examples/text_to_image_laion400m/scripts/convert_orig_ldm_ckpt_to_ppdiffusers.py b/ppdiffusers/examples/text_to_image_laion400m/scripts/convert_orig_ldm_ckpt_to_ppdiffusers.py
@@ -1,5 +1,6 @@
 # Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
 # Copyright 2022 The HuggingFace Inc. team.
+#
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at