
[PPDiffusers] Add CycleDiffusion based on FastDeploy #4945

Merged: 17 commits into PaddlePaddle:develop on Mar 1, 2023

Conversation

@joey12300 (Contributor) commented on Feb 22, 2023

PR types

New features

PR changes

Models

Description

Add CycleDiffusion based on FastDeploy.

Benchmark

Run CycleDiffusionPipeline 10 times and take the average latency to compare the performance of the FastDeploy version against the PyTorch version.

Args                    Value
batch size              1
num_inference_steps     100
strength                0.8
eta                     0.1
guidance_scale          2
source_guidance_scale   1
source_prompt           "An astronaut riding a horse"
prompt                  "An astronaut riding an elephant"

Average latency

FastDeploy + PaddleTRT FP16    Diffusers + Torch FP16
6.66s                          12.86s
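
For reference, a minimal sketch of how such an average-latency measurement could be taken. This harness is hypothetical: the function name, warmup count, and keyword arguments are assumptions, not code from this PR.

```python
import time

def average_latency(pipe, runs=10, warmup=2, **pipe_kwargs):
    # Warm-up runs so one-time costs (e.g. building the TensorRT engine,
    # filling shape caches) do not skew the measurement.
    for _ in range(warmup):
        pipe(**pipe_kwargs)
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        pipe(**pipe_kwargs)
        total += time.perf_counter() - start
    return total / runs
```

The pipeline would be invoked with the arguments from the table above (prompt, source_prompt, num_inference_steps=100, strength=0.8, eta=0.1, guidance_scale=2, source_guidance_scale=1).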

@paddle-bot (bot) commented on Feb 22, 2023

Thanks for your contribution!

@codecov (bot) commented on Feb 22, 2023

Codecov Report

Merging #4945 (c33ed9a) into develop (82bd9b1) will increase coverage by 2.60%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #4945      +/-   ##
===========================================
+ Coverage    46.35%   48.96%   +2.60%     
===========================================
  Files          448      455       +7     
  Lines        64646    66517    +1871     
===========================================
+ Hits         29965    32567    +2602     
+ Misses       34681    33950     -731     
Impacted Files Coverage Δ
paddlenlp/taskflow/pos_tagging.py 13.88% <0.00%> (-14.38%) ⬇️
paddlenlp/taskflow/poetry_generation.py 83.33% <0.00%> (-10.00%) ⬇️
paddlenlp/taskflow/question_answering.py 83.33% <0.00%> (-9.53%) ⬇️
paddlenlp/taskflow/text_generation.py 18.07% <0.00%> (-8.74%) ⬇️
paddlenlp/taskflow/word_segmentation.py 25.67% <0.00%> (-6.47%) ⬇️
paddlenlp/transformers/reformer/modeling.py 84.84% <0.00%> (-5.84%) ⬇️
paddlenlp/taskflow/text_correction.py 15.49% <0.00%> (-5.56%) ⬇️
paddlenlp/taskflow/lexical_analysis.py 14.92% <0.00%> (-5.08%) ⬇️
...addlenlp/taskflow/models/lexical_analysis_model.py 25.00% <0.00%> (-4.42%) ⬇️
paddlenlp/transformers/image_processing_utils.py 68.23% <0.00%> (-3.86%) ⬇️
... and 46 more


@joey12300 changed the title from "[PPDiffusers] Add pipeline_fastdeploy_cycle_diffusion.py" to "[PPDiffusers] Add CycleDiffusion based on FastDeploy" on Feb 22, 2023
@joey12300 joey12300 marked this pull request as ready for review February 26, 2023 16:18
guoshengCS previously approved these changes on Feb 28, 2023
untruncated_ids = self.tokenizer(prompt, padding="longest", return_tensors="np").input_ids

if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not paddle.equal_all(
    text_input_ids, untruncated_ids
Contributor:

text_input_ids and untruncated_ids are NumPy arrays; should this paddle.equal_all call use a NumPy equivalent instead?

joey12300 (Author):

Thanks for the review; this has been updated.
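
A sketch of the NumPy-based check the reviewer is asking for, assuming np.array_equal as the drop-in replacement (the merged fix may differ in detail):

```python
import numpy as np

# Both id arrays come from the tokenizer with return_tensors="np", so the
# comparison can stay in NumPy rather than routing through paddle.equal_all.
if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not np.array_equal(
    text_input_ids, untruncated_ids
):
    # The prompt was truncated; this is where the pipeline would warn the user.
    ...
```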

).prev_sample
if i == len(timesteps) - 1:
    # sync for accuracy it/s measure
    paddle.device.cuda.synchronize()
Contributor:

Is this necessary here? Looking at the other pipelines, including the FastDeploy pipelines, none of them have this.

joey12300 (Author):

It mainly ensures that the final result of kernels launched asynchronously across multiple streams is correct, so a synchronization is needed.
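
A minimal illustration of the pattern under discussion; the surrounding names (timesteps, scheduler, noise_pred, latents) follow the quoted diff, and this is a sketch rather than the exact merged code:

```python
import paddle

for i, t in enumerate(timesteps):
    # ... UNet forward pass producing noise_pred elided ...
    latents = scheduler.step(noise_pred, t, latents).prev_sample
    if i == len(timesteps) - 1:
        # Kernels are launched asynchronously (possibly on several CUDA
        # streams); block here so the final latents are complete before
        # post-processing, and so the reported it/s reflects finished work.
        paddle.device.cuda.synchronize()
```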

if use_fp16:
    option.trt_option.enable_fp16 = True
cache_file = os.path.join(model_dir, model_prefix, "inference.trt")
option.set_trt_cache_file(cache_file)
Contributor:

The code above has already started using the 1.0.4 API (option.paddle_infer_option), so let's consistently switch everything over to the new-version API.

Change this to option.trt_option.serialize_file = cache_file

joey12300 (Author):

Done

option.set_trt_cache_file(cache_file)
# Need to enable collect shape for ernie
if dynamic_shape is not None:
    option.enable_paddle_trt_collect_shape()
Contributor:

option.paddle_infer_option.collect_trt_shape = True

joey12300 (Author):

Done

option.enable_paddle_trt_collect_shape()
for key, shape_dict in dynamic_shape.items():
    option.set_trt_input_shape(
        key,
Contributor:

option.trt_option.set_shape(name, min, opt, max)

joey12300 (Author):

done
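
Taken together, these suggestions move the Paddle Inference + TensorRT configuration onto the FastDeploy 1.0.4 option objects. A sketch of the updated block; the variable names follow the quoted diff, and the shape_dict key names (min_shape/opt_shape/max_shape) are assumptions:

```python
import os

if use_fp16:
    option.trt_option.enable_fp16 = True
# Replaces option.set_trt_cache_file(cache_file)
option.trt_option.serialize_file = os.path.join(model_dir, model_prefix, "inference.trt")
# Need to enable collect shape for ernie
if dynamic_shape is not None:
    # Replaces option.enable_paddle_trt_collect_shape()
    option.paddle_infer_option.collect_trt_shape = True
    for key, shape_dict in dynamic_shape.items():
        # Replaces option.set_trt_input_shape(key, ...)
        option.trt_option.set_shape(
            key,
            shape_dict["min_shape"],      # assumed key names
            shape_dict.get("opt_shape"),
            shape_dict.get("max_shape"),
        )
```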

option.use_lite_backend()
if device == "huawei_ascend_npu":
    option.use_ascend()
    option.set_lite_device_names(["huawei_ascend_npu"])
Contributor:

This line should not need to be called explicitly; just delete it.

joey12300 (Author):

Done

option.set_lite_device_names(["huawei_ascend_npu"])
option.set_lite_model_cache_dir(os.path.join(model_dir, model_prefix))
option.set_lite_context_properties(
    "HUAWEI_ASCEND_NPU_SELECTED_DEVICE_IDS={};HUAWEI_ASCEND_NPU_PRECISION_MODE=allow_mix_precision".format(
Contributor:

option.paddle_lite_option.nnadapter_model_cache_dir = os.path.join(model_dir, model_prefix)
option.paddle_lite_option.nnadapter_context_properties = "HUAWEI_ASCEND_NPU_SELECTED_DEVICE_IDS={};HUAWEI_ASCEND_NPU_PRECISION_MODE=allow_mix_precision".format(device_id)

joey12300 (Author):

Done
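
Similarly, a sketch of the Ascend NPU branch after applying these two suggestions; device, model_dir, model_prefix, and device_id follow the quoted diff:

```python
import os

option.use_lite_backend()
if device == "huawei_ascend_npu":
    option.use_ascend()
    # set_lite_device_names(...) is dropped: per the review, it need not be
    # called explicitly. The Lite model cache dir and context properties move
    # to the new-style paddle_lite_option attributes.
    option.paddle_lite_option.nnadapter_model_cache_dir = os.path.join(model_dir, model_prefix)
    option.paddle_lite_option.nnadapter_context_properties = (
        "HUAWEI_ASCEND_NPU_SELECTED_DEVICE_IDS={};"
        "HUAWEI_ASCEND_NPU_PRECISION_MODE=allow_mix_precision".format(device_id)
    )
```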

option = fd.RuntimeOption()
option.use_trt_backend()
option.use_gpu(device_id)
option.enable_trt_fp16()
Contributor:

option.trt_option.enable_fp16 = True

joey12300 (Author):

Done

option.use_trt_backend()
option.use_gpu(device_id)
option.enable_trt_fp16()
option.set_trt_max_workspace_size(workspace)
Contributor:

option.trt_option.max_workspace_size = workspace

joey12300 (Author):

Done

option.set_trt_max_workspace_size(workspace)
if dynamic_shape is not None:
    for key, shape_dict in dynamic_shape.items():
        option.set_trt_input_shape(
Contributor:

option.set_shape

joey12300 (Author):

Done

onnx_file = os.path.join(model_dir, model_prefix, "inference.onnx")
option.set_model_path(onnx_file, model_format=ModelFormat.ONNX)
cache_file = os.path.join(model_dir, model_prefix, "inference.trt")
option.set_trt_cache_file(cache_file)
Contributor:

option.trt_option.serialize_file = cache_file

joey12300 (Author):

Done
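
Combining the last few threads, the pure TensorRT (ONNX) backend path would end up roughly like this; the same caveats apply as in the sketch above, and the shape_dict key names remain assumptions:

```python
import os
import fastdeploy as fd
from fastdeploy import ModelFormat

option = fd.RuntimeOption()
option.use_trt_backend()
option.use_gpu(device_id)
# Replaces option.enable_trt_fp16() and option.set_trt_max_workspace_size(workspace)
option.trt_option.enable_fp16 = True
option.trt_option.max_workspace_size = workspace
if dynamic_shape is not None:
    for key, shape_dict in dynamic_shape.items():
        # Replaces option.set_trt_input_shape(key, ...)
        option.trt_option.set_shape(
            key,
            shape_dict["min_shape"],
            shape_dict.get("opt_shape"),
            shape_dict.get("max_shape"),
        )
onnx_file = os.path.join(model_dir, model_prefix, "inference.onnx")
option.set_model_path(onnx_file, model_format=ModelFormat.ONNX)
# Replaces option.set_trt_cache_file(cache_file)
option.trt_option.serialize_file = os.path.join(model_dir, model_prefix, "inference.trt")
```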

@joey12300 joey12300 closed this Mar 1, 2023
@joey12300 joey12300 reopened this Mar 1, 2023
@joey12300 joey12300 merged commit cd9d29f into PaddlePaddle:develop Mar 1, 2023