[CustomOp] Polish custom api content for performance in dygraph #32209
PR types
Performance optimization
PR changes
Others
Describe
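Currently, custom Ops do not require users to wrap a Python API themselves; the API is generated automatically. However, the auto-generated API still calls the older append_op path internally rather than core.ops. This is partly for compatibility between dynamic and static graphs, and partly because the custom Op system has no corresponding core.ops API to call. Because the append_op call stack is fairly deep, the API performs somewhat worse in dynamic graph mode, which is a previously known issue in dynamic graph mode.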
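As a rough illustration of why a deeper Python call stack costs more per call, here is a toy sketch in plain Python. It is not Paddle code: `kernel`, `_build_op_desc`, `_run_op_desc`, `api_via_layers`, `api_direct`, and `per_call_us` are hypothetical names made up for this example, and the layered path only imitates the general shape of building an op description before running it.

```python
import timeit


def kernel(x):
    # Stand-in for the real op computation.
    return x * 2


def _build_op_desc(op_type, inputs):
    # Hypothetical extra layer: package the call into a description dict.
    return {"type": op_type, "inputs": dict(inputs), "attrs": {}}


def _run_op_desc(desc):
    # Hypothetical extra layer: unpack the description and run the kernel.
    return kernel(desc["inputs"]["X"])


def api_via_layers(x):
    # Layered path: two extra Python calls and a dict build per invocation.
    desc = _build_op_desc("toy_double", {"X": x})
    return _run_op_desc(desc)


def api_direct(x):
    # Short path: call the kernel with no intermediate layers.
    return kernel(x)


def per_call_us(fn, number=100_000):
    # Average wall-clock time per call, in microseconds.
    total = timeit.timeit(lambda: fn(3), number=number)
    return total / number * 1e6


if __name__ == "__main__":
    print(f"layered: {per_call_us(api_via_layers):.3f} us/call")
    print(f"direct:  {per_call_us(api_direct):.3f} us/call")
```

Both paths return the same result; only the per-call Python overhead differs. The absolute numbers depend on the machine and interpreter, so they say nothing about the actual append_op or core.ops costs.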
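This PR simplifies the Python call stack of the auto-generated API in dynamic graph mode, in order to improve the execution performance of custom Op APIs in dynamic graph mode. A brief test follows: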
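1. Test conditions
2. Test data
After the optimization in this PR, the remaining time difference (roughly 10 us) is essentially the overhead introduced by the custom Op mechanism itself: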