[ppdiffusers] Kandinsky2_2 training support #378
Thanks for your contribution!
Tsaiyue seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
ppdiffusers/examples/kandinsky2_2/text_to_image/train_text_to_image_prior_lora.py
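I don't think the usage example in the README needs AutoPipelineForText2Image; loading with the corresponding pipeline directly is enough.
Overall I see no remaining problems. Can you test whether everything runs with paddle 2.6.0 plus this version of ppdiffusers?
Hi @JunnYu, all four training scripts run successfully under paddle 2.6.0. I also resolved the issue mentioned above where `pipeline.prior_prior.load_attn_procs()` could not be used: PriorTransformer needs to inherit UNet2DConditionLoadersMixin, and I found that this feature was rolled back in this PR (Revert "【Hackathon 5th No.83】PaddleMIX ppdiffusers models模块功能升级同步HF"). Also, what is going on with my CLA? I have already signed it, but it is still Pending.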
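To illustrate why the missing base class makes `load_attn_procs` unavailable, here is a minimal sketch in plain Python. It does not import ppdiffusers; `AttnProcsLoaderMixin`, `PriorWithoutMixin`, `PriorWithMixin`, and `can_load_lora` are hypothetical stand-ins, not the real classes, and the method body only records the path instead of reading weights.

```python
class AttnProcsLoaderMixin:
    """Hypothetical stand-in for a loader mixin such as UNet2DConditionLoadersMixin."""

    def load_attn_procs(self, path):
        # A real loader would read LoRA weights from `path`;
        # this sketch only records the path it was asked to load.
        self.loaded_from = path
        return self


class PriorWithoutMixin:
    """Stand-in for a model class whose bases lack the loader mixin."""


class PriorWithMixin(AttnProcsLoaderMixin):
    """Stand-in for the same model class once the mixin is among its bases."""


def can_load_lora(model):
    # True only when the model exposes a callable load_attn_procs method.
    return callable(getattr(model, "load_attn_procs", None))
```

Under these assumptions, `can_load_lora(PriorWithoutMixin())` is false and `can_load_lora(PriorWithMixin())` is true, which mirrors the difference between the released package and the develop branch described in this thread.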
LGTM
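The CLA can be looked at later. Please confirm that the account that submitted this PR is the same GitHub account you are actually logged in with, for example that the email addresses match.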
hi, @Tsaiyue
## [Kandinsky2.2 training support · Issue PaddlePaddle#268 · PaddlePaddle/PaddleMIX](https://github.com/PaddlePaddle/PaddleMIX/issues/268)

### 1 Loss alignment results for the first 200 steps

- decoder w/o LoRA: ![decoder](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/ac52377b-5522-4ffb-8ea8-3ad73668cbc5)
- prior w/o LoRA: ![prior](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/af24f7c2-2618-4db0-bdaf-764f72f47c9a)
- decoder with LoRA: ![decoder_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/231573c1-9d7c-46da-8b16-592a22d248af)
- prior with LoRA: ![prior_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/79c166d9-0a08-48b6-84e0-3c802e857ff9)
- decoder fine-tune, results after 3k steps (prompt: A robot pokemon, 4k photo): ![robot-pokemon](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/a7e8ef2d-08b1-4ef2-80d8-826704340de2)

### 2 Other changes

[ppdiffusers/models/attention_processor.py/LoRAAttnAddedKVProcessor.call](https://github.com/PaddlePaddle/PaddleMIX/blob/ff0d2f25c79cc6e34e7d9c071328a7ed8bea4bc3/ppdiffusers/ppdiffusers/models/attention_processor.py#L789C57-L789C79): axis = 1 -> axis = 2

Reason: running `python train_text_to_image_decoder_lora.py` with LoRAAttnAddedKVProcessor raised a concat dimension error.

### 3 Alignment notes

- Disabled shuffle in the dataloaders of both diffusers and ppdiffusers so that the data order is identical;
- Set the same random seed, and changed the noise and timesteps, which introduce randomness in the training loop, to be generated by numpy so that both sides get identical random values (this logic has been removed from the submitted code).

### 4 Known issues

- Using AutoPipelineForText2Image(args.pretrained_decoder_model_name_or_path) in ppdiffusers fails with missing components:

  ```bash
  ValueError: Pipeline <class 'ppdiffusers.pipelines.kandinsky2_2.pipeline_kandinsky2_2_combined.KandinskyV22CombinedPipeline'> expected {'unet', 'prior_image_processor', 'prior_text_encoder', 'prior_image_encoder', 'movq', 'prior_prior', 'prior_scheduler', 'prior_tokenizer', 'scheduler'}, but only {'unet', 'movq', 'scheduler'} were passed.
  ```

  Only some of the components are recognized; unlike diffusers, not all components are detected automatically. As a fallback, the submitted code defines each component one by one and passes them to AutoPipelineForText2Image, which is not very concise. The cause has not been determined yet; [a diffusers issue](https://github.com/huggingface/diffusers/issues/5044) looks similar to this problem.
- With `pip install ppdiffusers==0.19.4`, loading the prior's LoRA weights fails because PriorTransformer has no load_attn_procs, so `pipeline.prior_prior.load_attn_procs(args.output_dir)` cannot be used. Building the ppdiffusers package from the latest develop branch does not have this problem.

Looking forward to your replies and suggestions about merging. Thx :)

---------

Co-authored-by: Tsaiyue <tsaiyue01@gamil.com>
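On the `axis = 1 -> axis = 2` change in section 2: the description only states the axis change and the concat error, so the following is a rough sketch of why the axis could matter, not the actual processor code. It assumes the key/value tensors are laid out as `[batch, heads, seq_len, head_dim]` and that the two inputs differ only in sequence length; the helper `concat_shape` and the shape values are made up for illustration.

```python
def concat_shape(shapes, axis):
    """Return the shape obtained by concatenating tensors of `shapes` along `axis`.

    Raises ValueError when any other dimension differs, which is the rule
    tensor libraries enforce for concatenation.
    """
    result = list(shapes[0])
    total = 0
    for shape in shapes:
        if len(shape) != len(result):
            raise ValueError("rank mismatch")
        for dim, (expected, actual) in enumerate(zip(result, shape)):
            if dim != axis and expected != actual:
                raise ValueError(f"dimension {dim} differs: {expected} vs {actual}")
        total += shape[axis]
    result[axis] = total
    return tuple(result)


# Illustrative shapes under the assumed [batch, heads, seq_len, head_dim] layout.
added_key = (2, 8, 77, 64)     # projection of the added (encoder) states
hidden_key = (2, 8, 1024, 64)  # projection of the hidden states
```

With this layout, concatenating along axis 2 joins the two sequences, while axis 1 is the heads dimension and the differing sequence lengths make the concat invalid.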