[DRR] C++ DRR (Declarative Rewrite Rule) of Paddle #55859
Your PR was submitted successfully. Thank you for contributing to this open-source project!
[DRR] Add Basic Class
[DRR] Add Test Case
[DRR] Fix compile bug
[DRR] Add MatchContext Class
Change smart pointer to raw pointer: replace 'weak_ptr' with pointer; modify the weak_ptr use_count == 0 check; replace weak_ptr declarations and calls with pointers
[DRR] Replace 'weak_ptr' with pointer
Tensor& Op::operator()(const Tensor& arg1, const Tensor& arg2) const {
  std::vector<const Tensor*> inputs{&arg1, &arg2};
  auto& out = pattern_graph_->AddTmpTensor(std::shared_ptr<Tensor>(new Tensor(
      prefix + op_type_name_ + "_" + std::to_string(count++), pattern_graph_)));
count is a static variable. Could it cause access conflicts in multithreaded scenarios?
Changed to thread_local.
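The fix discussed in this thread can be illustrated with a minimal sketch. It is not the code from the PR: `NextTmpTensorName` and the `tmp_` prefix are hypothetical names chosen for illustration. It only shows how a `thread_local` counter avoids the data race that a plain `static` counter would have when several threads build patterns at once.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Paddle's API): builds a name for a temporary
// tensor from the op type and a per-thread counter.
std::string NextTmpTensorName(const std::string& op_type_name) {
  assert(!op_type_name.empty());
  // thread_local gives every thread its own copy of `count`, so the
  // increment below is never shared between threads and cannot race.
  // With a plain `static int count`, concurrent calls would modify the
  // same variable without synchronization.
  thread_local int count = 0;
  return "tmp_" + op_type_name + "_" + std::to_string(count++);
}
```

Note the trade-off: each thread starts counting from 0, so names are unique only within one thread. That is sufficient if, as assumed here, each thread builds its own pattern graph. If names had to be unique across threads, a `std::atomic<int>` counter would be the usual alternative.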
Refine code format
    ${PADDLE_SOURCE_DIR}/paddle/fluid/pir/dialect/op_generator/op_creator_drr_gen.py
)
set(op_compat_yaml_file ${PADDLE_SOURCE_DIR}/paddle/phi/api/yaml/op_compat.yaml)
set(op_forward_yaml_file1
Another yaml file currently exists at the path below. It defines the operators that pir defines on its own. Should it be included in this auto-generation system? paddle/fluid/pir/dialect/operator/ir/ops.yaml
Probably yes. We will add it in a follow-up.
cc_test_old(pattern_rewrite_test SRCS pattern_rewrite_test.cc DEPS
            ${PATTERN_REWRITE_TEST_DEPS})
if(NOT APPLE)
  cc_test_old(pattern_rewrite_test SRCS pattern_rewrite_test.cc DEPS
Suggest using paddle_test.
With paddle_test, linking the unit test fails on Windows.
Make count variable as thread_local in Op class
LGTM
LGTM
LGTM
* fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward * add drr_rewrite_pattern.h and remove_redundent_reshape demo * add drr_context and pattern_graph class * add test case * fix cmake file * fix compile bug * fix runtime bug and refine code * add MatchContext * update code * add impl of tensor_interface * fix compile bug * change smart ptr to pointor * change smart to pointor * change smart to pointor * Replace 'weak_ptr' with pointer * modify weak_ptr use count==0 judgment logic * change smart to pointor change smart to pointor Replace 'weak_ptr' with pointer modify weak_ptr use count==0 judgment logic Replace the declaration and call of weakptr with pointer * add match * add match * remove OperationInterface * update * Add Rewrite impl of DrrRewritePattern * refine code * rename ir_value to get in IrValue * fix header include * add CreateOperation template demo * Add GraphTopo class in pattern_graph * Reimplementing the GraphTopo class using queue * Reimplementing the GraphTopo class using queue * Optimize the access method of visited tensor * Considering that the inputs of opcall may be empty * Overloading the operator() method of Op, supporting dual tensor inputs * support attr * 1. Add Op class support for multi input and multi output function. 2. Add DRR duplicate TransposeOp merge testing code * 1. 
Add transferOP in createOption func * fix bug * fix NotifyOperationRemoved * refine code * Fix axis bug in perm * mupdate share_ptr * update * refine drr_test ut * Modify according to review * modify reshape_op * format code * support vector<int> for attr * fix drr test * refine code * Resolve compilation loop dependencies * add RequireNativeCall * support native_call in drr api * temp tensor prefix fix * refine code * suport Tensor Assgin API in ResultPattern * refine test code * refactor ther drr_pattern class * refine test case * rename DrrPatternBuilder to DrrPatternBase * fix compile bug * adjust include * Add log info in DrrRewritePattern * use ir::get_type_name * use ir::get_type_name * support compute attrbute in drr pattern * refine code * Add fusion testing code for fullOp and expandOp * Standardize code format * Replace IR_THROW() with PADDLE_THROW() * refine code * add attention fuse demo * update * fix compile error * add multihead_matmul fuse pattern * fix multihead_matmul * Update drr_attention_fuse_test.cc add buildprogram * fix drr_attention_fuse_test compile * add fused_gemm_epilogue in drr * attr support std::vector<int64_t> * add debug log * update * fix some bug * fix confilct * support subgraph replace in source pattern graph for drr * Improve the implementation of Drr and multihead_matmul_fuse_pass * add ReorderBlockOpsPass * fix drr_attention_fuse_pass * update * update reorder_block_ops_pass * revert fusedgemm * update * Add Bottom2UpMatch() func * merge code * fix bug * add log & fix bug * refine cpp type trait * using oprand() & num_oprand() replace oprands() * fix conflict * fix compile * fix pd.xxx to pd_op.xxx * fix bug of delete op in drr * add PatternGraphMatchV2 & FindOutputOp func * refactor ir operation creator * fix include pir * fix ir * merging * Split out dfsvisitor func from FindOutputOp func * fix bug * fix output op in source pattern bug * Debugging drr_test drr_attention_fuse_test passed! 
* Debugging drr_fuse_linear_test passed! * Optimize the PatternGraphMatchV2 function interface and overload the operator= method in MatchContextImpl * Modify comments and function names * auto code-gen for creating ir operation in drr * delete debug log * optimize the interface of MatchFromOutputToInput() * Optimize SourcePatternGraph::OutputNodes judgment logic * polish code * using default operator=() in MatchContextImpl * fix merge conflict * create test case: drr_same_name_test * fix duplicate binding of ir op bug * Rename drr_same_name_test to drr_same_type_binding_test & Add graphical notes * refactor logic of insert point for creating new operation in drr * update * fix compile error * fix some bug * fix codestyle * fix bug * Update anchor node judgment logic * fix bug of link pir * fix codestyle * self review v1 * refine code format * set thread_local for count in op class * fix compile on mac * remove unused .h in value.cc * fix compile --------- Co-authored-by: zyfncg <zhangyunfei07@baidu.com> Co-authored-by: gongshaotian <gstian5555@outlook.com> Co-authored-by: gongshaotian <> Co-authored-by: gongshaotian <141618702+gongshaotian@users.noreply.github.com>
PR types
New features
PR changes
Others
Description
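1. Background and Goals
Background
Paddle is currently upgrading its IR. The core goal is to design an IR infrastructure that can be shared by multiple parties and that improves on the current one. The new IR will unify the IR representation across the whole Paddle ecosystem.
Passes are the key components that optimize the IR (constant folding, dead code elimination, operator fusion, and so on), so they also need to be redesigned on top of the new IR, resolving the various problems of the existing Pass system. A classification of the Passes currently in Paddle shows that DAG->DAG PatternRewrite Passes account for more than half of them. To improve the experience of developing Passes on the new IR and to lower the cost of later migrating all Passes, the PatternRewrite type of Pass needs further design work, so that users can implement a pattern match-and-replace Pass on the new IR through simple interface calls.
Goals
For the DAG->DAG PatternRewrite Pass scenario, provide a concise, easy-to-use API with low development cost for implementing subgraph match-and-replace Passes on the new IR.
The DRR (Declarative Rewrite Rule) Pass API is not an IR. It is a unified wrapper over the IR whose purpose is to let users concentrate on the optimization logic without having to handle the underlying IR. It is written in C++; the example from the original description is not reproduced here.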
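To give a feel for the declarative style described above, here is a toy sketch. It is not the DRR API and does not use Paddle's IR. The "program" is simplified to a straight-line chain of op type names, and `Program`, `RewriteRule`, `MatchesAt` and `ApplyRule` are hypothetical names invented for this illustration. The rule author only states a source pattern and a result pattern, and a generic driver does the matching and replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy IR: a straight-line chain of unary ops, identified only by type name.
using Program = std::vector<std::string>;

// A declarative rule: wherever `source` appears, put `result` instead.
struct RewriteRule {
  std::vector<std::string> source;
  std::vector<std::string> result;
};

// Returns true if `rule.source` occurs in `prog` starting at index `pos`.
bool MatchesAt(const Program& prog, std::size_t pos, const RewriteRule& rule) {
  if (pos + rule.source.size() > prog.size()) return false;
  for (std::size_t i = 0; i < rule.source.size(); ++i) {
    if (prog[pos + i] != rule.source[i]) return false;
  }
  return true;
}

// Generic driver: applies `rule` until no match remains and returns the
// number of rewrites performed.
int ApplyRule(Program* prog, const RewriteRule& rule) {
  // Requiring a strictly shorter result guarantees termination, because
  // every rewrite shrinks the program.
  assert(!rule.source.empty() && rule.result.size() < rule.source.size());
  int rewrites = 0;
  std::size_t pos = 0;
  while (pos < prog->size()) {
    if (MatchesAt(*prog, pos, rule)) {
      auto first = prog->begin() + static_cast<std::ptrdiff_t>(pos);
      auto last = first + static_cast<std::ptrdiff_t>(rule.source.size());
      auto at = prog->erase(first, last);
      prog->insert(at, rule.result.begin(), rule.result.end());
      ++rewrites;
      pos = 0;  // Rescan from the start: the rewrite may enable new matches.
    } else {
      ++pos;
    }
  }
  return rewrites;
}
```

For example, with the rule `{{"reshape", "reshape"}, {"reshape"}}`, the chain `matmul, reshape, reshape, reshape, relu` is reduced to `matmul, reshape, relu` in two rewrites. A real DAG matcher also has to bind tensors and attributes and check constraints, which this sketch leaves out.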
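2. Design
Overview
DRR Pass provides a concise, easy-to-use API with low development cost on the new IR for the DAG->DAG PatternRewrite Pass scenario. Its main target areas are Pass optimization on the new IR for distributed training, the compiler front end, and inference.
Main Design
The design is layered, dividing the module into three layers from top to bottom: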
Others
Pcard-71500