[PPDiffusers] Add CycleDiffusion based on FastDeploy #4945
Conversation
Thanks for your contribution!
Codecov Report
@@            Coverage Diff             @@
##           develop    #4945      +/-   ##
===========================================
+ Coverage    46.35%   48.96%     +2.60%
===========================================
  Files          448      455         +7
  Lines        64646    66517      +1871
===========================================
+ Hits         29965    32567      +2602
+ Misses       34681    33950       -731
untruncated_ids = self.tokenizer(prompt, padding="longest", return_tensors="np").input_ids

if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not paddle.equal_all(
    text_input_ids, untruncated_ids
text_input_ids and untruncated_ids are NumPy arrays here; should paddle.equal_all be replaced with its NumPy counterpart?
Thanks for the review, this has been updated~
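For reference, a minimal sketch of the NumPy-based check being suggested; np.array_equal is the NumPy counterpart of paddle.equal_all, and the variable names follow the snippet above:

import numpy as np

# Both arrays come from the tokenizer with return_tensors="np", so the
# equality check can stay in NumPy rather than going through paddle:
if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not np.array_equal(
    text_input_ids, untruncated_ids
):
    ...  # warn about the truncated prompt, as in the original pipeline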
).prev_sample
if i == len(timesteps) - 1:
    # sync for an accurate it/s measurement
    paddle.device.cuda.synchronize()
Is this required here? Looking at the other pipelines, including the FastDeploy pipelines, none of them have this.
This is mainly to guarantee the final result is correct when kernels are launched asynchronously on multiple streams, so synchronization is needed.
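To illustrate the point, a minimal sketch of the pattern, assuming a host-side timing loop like this PR's benchmark (the loop variables and the elided UNet call are placeholders for the pipeline's own):

import time
import paddle

start = time.time()
for i, t in enumerate(timesteps):
    # ... UNet forward pass producing noise_pred elided ...
    latents = scheduler.step(noise_pred, t, latents).prev_sample
    if i == len(timesteps) - 1:
        # Kernels launch asynchronously; without a device-wide synchronize the
        # host timer would stop before the last kernels actually finish.
        paddle.device.cuda.synchronize()
elapsed = time.time() - start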
if use_fp16:
    option.trt_option.enable_fp16 = True
cache_file = os.path.join(model_dir, model_prefix, "inference.trt")
option.set_trt_cache_file(cache_file)
I see the code above has already started using the 1.0.4 API (option.paddle_infer_option), so let's switch everything over to the new API for consistency. Change this to option.trt_option.serialize_file = cache_file.
Done
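Putting the suggestion together with the snippet above, the migrated lines would read roughly as follows (a sketch using the FastDeploy 1.0.4 property-style API named in this thread):

if use_fp16:
    option.trt_option.enable_fp16 = True
# New-style API: assign the serialized-engine path directly instead of
# calling the older option.set_trt_cache_file(...).
cache_file = os.path.join(model_dir, model_prefix, "inference.trt")
option.trt_option.serialize_file = cache_file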
option.set_trt_cache_file(cache_file)
# Need to enable collect shape for ernie
if dynamic_shape is not None:
    option.enable_paddle_trt_collect_shape()
option.paddle_infer_option.collect_trt_shape = True
Done
option.enable_paddle_trt_collect_shape()
for key, shape_dict in dynamic_shape.items():
    option.set_trt_input_shape(
        key,
option.trt_option.set_shape(name, min, opt, max)
done
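For reference, the loop above rewritten against the set_shape signature quoted by the reviewer; the shape_dict keys here are hypothetical and should match however dynamic_shape is actually structured in this file:

for name, shape_dict in dynamic_shape.items():
    option.trt_option.set_shape(
        name,
        shape_dict["min_shape"],  # hypothetical key names
        shape_dict["opt_shape"],
        shape_dict["max_shape"],
    )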
option.use_lite_backend()
if device == "huawei_ascend_npu":
    option.use_ascend()
    option.set_lite_device_names(["huawei_ascend_npu"])
This line shouldn't need to be called explicitly; it can simply be deleted.
Done
option.set_lite_device_names(["huawei_ascend_npu"])
option.set_lite_model_cache_dir(os.path.join(model_dir, model_prefix))
option.set_lite_context_properties(
    "HUAWEI_ASCEND_NPU_SELECTED_DEVICE_IDS={};HUAWEI_ASCEND_NPU_PRECISION_MODE=allow_mix_precision".format(
option.paddle_lite_option.nnadapter_model_cache_dir = os.path.join(model_dir, model_prefix)
option.paddle_lite_option.nnadapter_context_properties = "HUAWEI_ASCEND_NPU_SELECTED_DEVICE_IDS={};HUAWEI_ASCEND_NPU_PRECISION_MODE=allow_mix_precision".format(device_id)
Done
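Assembled, the Ascend NPU branch with these property-style options would look roughly like this (a sketch; device_id comes from the surrounding function):

option.use_lite_backend()
if device == "huawei_ascend_npu":
    option.use_ascend()
    # FastDeploy 1.0.4 property-style Paddle Lite options:
    option.paddle_lite_option.nnadapter_model_cache_dir = os.path.join(model_dir, model_prefix)
    option.paddle_lite_option.nnadapter_context_properties = (
        "HUAWEI_ASCEND_NPU_SELECTED_DEVICE_IDS={};"
        "HUAWEI_ASCEND_NPU_PRECISION_MODE=allow_mix_precision".format(device_id)
    )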
option = fd.RuntimeOption()
option.use_trt_backend()
option.use_gpu(device_id)
option.enable_trt_fp16()
option.trt_option.enable_fp16 = True
Done
option.use_trt_backend()
option.use_gpu(device_id)
option.enable_trt_fp16()
option.set_trt_max_workspace_size(workspace)
option.trt_option.max_workspace_size = workspace
Done
option.set_trt_max_workspace_size(workspace)
if dynamic_shape is not None:
    for key, shape_dict in dynamic_shape.items():
        option.set_trt_input_shape(
option.set_shape
Done
onnx_file = os.path.join(model_dir, model_prefix, "inference.onnx")
option.set_model_path(onnx_file, model_format=ModelFormat.ONNX)
cache_file = os.path.join(model_dir, model_prefix, "inference.trt")
option.set_trt_cache_file(cache_file)
option.trt_option.serialize_file = cache_file
Done
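Taken together, the ONNX + TensorRT branch after all of the migrations requested in this thread would look roughly like this (a sketch consolidating the reviewer's suggestions; variable names follow the snippets above, and the shape_dict keys are hypothetical):

option = fd.RuntimeOption()
option.use_trt_backend()
option.use_gpu(device_id)
option.trt_option.enable_fp16 = True
option.trt_option.max_workspace_size = workspace
if dynamic_shape is not None:
    for name, shape_dict in dynamic_shape.items():
        option.trt_option.set_shape(
            name, shape_dict["min_shape"], shape_dict["opt_shape"], shape_dict["max_shape"]
        )
onnx_file = os.path.join(model_dir, model_prefix, "inference.onnx")
option.set_model_path(onnx_file, model_format=ModelFormat.ONNX)
# Cache the built TensorRT engine so later runs skip the (slow) build step.
option.trt_option.serialize_file = os.path.join(model_dir, model_prefix, "inference.trt")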
PR types
New features
PR changes
Models
Description
Add CycleDiffusion based on FastDeploy.
Benchmark
Run CycleDiffusionPipeline 10 times and take the average latency to compare the performance of the FastDeploy version against the PyTorch version.
Average latency
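A minimal sketch of how such an average-latency measurement could be taken (assuming the standard CycleDiffusion call signature with prompt, source_prompt and an init image; this is illustrative, not the PR's exact benchmark script):

import time
import paddle

latencies = []
for _ in range(10):
    start = time.time()
    image = pipe(prompt=prompt, source_prompt=source_prompt, image=init_image).images[0]
    # Wait for all queued GPU work to finish before stopping the host timer.
    paddle.device.cuda.synchronize()
    latencies.append(time.time() - start)
print("Average latency: {:.3f} s".format(sum(latencies) / len(latencies)))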