add cinn_launch_op for using CINN to optimize graph #36600
Conversation
Thanks for your contribution!
register_operators(EXCLUDES py_layer_op py_func_op warpctc_op dgc_op load_combine_op lstm_op run_program_op eye_op
recurrent_op save_combine_op sparse_attention_op sync_batch_norm_op spectral_op ${OP_MKL_DEPS} DEPS ${OP_HEADER_DEPS})
register_operators(EXCLUDES py_layer_op py_func_op warpctc_op dgc_op load_combine_op lstm_op run_program_op eye_op
recurrent_op save_combine_op sparse_attention_op sync_batch_norm_op spectral_op cinn_launch_op ${OP_MKL_DEPS} DEPS ${OP_HEADER_DEPS})
Should cinn_launch_op also be guarded by the WITH_CINN flag?
Not necessary. Here it is declared as an excluded op with the EXCLUDES keyword of register_operators; sync_batch_norm_op has a similar definition.
&name2argument, &hold_buffers);

// prepare output variables
auto output_tensors = details::GetConstTensors(scope, Outputs(kOutputs));
I would suggest replacing the auto on this line with the actual type. When I read GetConstTensors, I thought it returned a tensor, and I didn't realize it was a map<string, const LodTensor*> until I read the code of GetConstTensors. To improve readability, I suggest spelling out the actual type instead of auto whenever you want to emphasize the type or the type may confuse readers.
Agree, I changed some auto usages to their actual types.
register_operators(EXCLUDES py_layer_op py_func_op warpctc_op dgc_op load_combine_op lstm_op run_program_op eye_op
recurrent_op save_combine_op sparse_attention_op sync_batch_norm_op spectral_op ${OP_MKL_DEPS} DEPS ${OP_HEADER_DEPS})
register_operators(EXCLUDES py_layer_op py_func_op warpctc_op dgc_op load_combine_op lstm_op run_program_op eye_op
recurrent_op save_combine_op sparse_attention_op sync_batch_norm_op spectral_op cinn_launch_op ${OP_MKL_DEPS} DEPS ${OP_HEADER_DEPS})
Should we also guard cinn_launch_op by WITH_CINN?
Both input and output of this operator are a set of variables
which are input and output of the graph respectively that will be
compiled and executed in this operator.
In addition, there is a attribute named 'compilation_key' should be
"there is a attribute" -> "there is an attribute"
Fixed.
namespace paddle {
namespace operators {

using LoDTensor = framework::LoDTensor;
I highly recommend not using a using declaration in a header file, even inside a namespace. The same comment applies to other places. See these references:
https://abseil.io/tips/119
https://stackoverflow.com/questions/6175705/scope-of-using-declaration-within-a-namespace
or you can find many other references about it online.
Agree, all using declarations in header files were moved to the source files.
LGTM
LGTM
LGTM
LGTM
LGTM for op benchmark CI. Have you considered putting the CINN-op-related code in a separate subdirectory?
if (WITH_GPU)
nv_test(cinn_launch_op_test SRCS cinn_launch_op_test.cc DEPS cinn_compiler cinn_launch_op elementwise_add_op)
endif()
Is the test GPU-only? Isn't a CPU test needed?
set necessarily to get corresponding ir::Graph object of the graph
or its computation result.

It accomplishs the computation of graph following several steps:
accomplishs --> accomplishes
const auto& cinn_runtime_program = cinn_compiled_object.runtime_program;
const auto& compiled_scope = *(cinn_compiled_object.scope.get());
Suggested change:
- const auto& cinn_runtime_program = cinn_compiled_object.runtime_program;
- const auto& compiled_scope = *(cinn_compiled_object.scope.get());
+ const auto& cinn_runtime_program = *(cinn_compiled_object.runtime_program);
+ const auto& compiled_scope = *(cinn_compiled_object.scope);
"is not equivalent, paddle is [%s] "
"but compiled result is [%s].",
pd_name, paddle_tensor->dims(), compiled_dim));
// TODO(CtfGo): check the underlying data type is equivalent
Please add this check back.
if (!paddle_tensor->IsInitialized()) {
// TODO(CtfGo): support mutable corresponding c++ type with the
// compilation type
paddle_tensor->mutable_data<float>(
Have you thought about how to handle this later part?
const auto& cinn_name = paddle2cinn_varmap.count(pd_name)
                            ? paddle2cinn_varmap.at(pd_name)
                            : pd_name;
std::unique_ptr<cinn_buffer_t> buffer_ptr(new cinn_buffer_t());
Use std::make_unique instead.
const std::vector<std::string>& input_var_names,
const std::vector<std::string>& output_var_names) {
std::unordered_set<std::string> all_paddle_names, all_cinn_names;
for_each(paddle2cinn_varmap.begin(), paddle2cinn_varmap.end(),
Shouldn't this be qualified with std:: here?
auto* paddle_tensor = var_ptr->GetMutable<LoDTensor>();
auto compiled_ddim = framework::make_ddim(cinn_tensor->shape().data());
// TODO(CtfGo): support mutable corresponding c++ type
paddle_tensor->mutable_data<float>(compiled_ddim, place);
Have you thought about how to handle other data types in this later part? You could consider using the CinnTensor::Type() method to get the corresponding C++ type.
&hold_buffers);
}
// Step 4. Launch CINN to execute the compilation runtime program
cinn_runtime_program->Execute(&name2argument);
Please confirm that Execute blocks synchronously here and only returns after the execution has finished; otherwise, once this scope exits, the memory pointed to by the unique_ptrs in hold_buffers will be freed.
PR types
New features
PR changes
OPs
Describe
Add CinnLaunchOp, which is responsible for executing the compilation result of a CINN subgraph. Key points:
1. In the subgraph-partitioning BuildCinnPass, each subgraph is replaced in the original graph by a CinnLaunchOp, which invokes CINN to compile and execute the subgraph.
2. The inputs/outputs of CinnLaunchOp are the inputs and outputs of the subgraph. A compilation_key attribute is added; it is the key used to fetch the subgraph object and its compilation result from the global cache, and it is set by BuildCinnPass when the op is created.
3. CinnLaunchOp works through the following steps:
- Fetch the subgraph object from the global cache.
- Fetch the subgraph's compilation result from the global cache, compiling just-in-time on a cache miss.
- Initialize runtime data from the variable information (data type, shape) in the compilation result, allocating host/device memory.
- Pack the runtime data into arguments and call CINN's executable runtime program to perform the computation.
- Synchronize the subgraph's results back to Paddle-side tensors through the argument pointers.