Cinn schedule error #54983

Merged
merged 18 commits into from
Jul 13, 2023

Conversation

ZzSean
Contributor

@ZzSean ZzSean commented Jun 29, 2023

PR types

Others

PR changes

Others

Description

Add an error-handling mechanism for Schedule.

card-72613

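Content:

  • Add an error-handling data structure. Each schedule primitive inherits from the base class IRScheduleErrorHandler to print its own error messages.
  • Print error messages at different levels. A new ScheduleErrorMessageLevel in ScheduleImpl determines how much detail is printed when an error occurs later; it can be controlled at runtime through FLAGS.
  • Use macros to simplify the code.
  • This PR uses Split as an example to show how to use the new error-handling mechanism in code.
  • Add unit tests for the exceptions.
  • Define a custom CINN_THROW whose error message content is consistent with PADDLE_THROW in the main framework.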
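The handler structure described in the list above might look roughly like the following. This is a minimal sketch, not the code from the PR: the enumerator names, the three virtual method names, `FormatErrorMessage`, the `[Detail info]` label, and the `SplitFactorErrorHandler` subclass (which takes a plain string instead of a `ModuleExpr`) are all assumptions made for illustration. Only the names `IRScheduleErrorHandler` and `ScheduleErrorMessageLevel` come from the description.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical verbosity levels. The PR names the enum
// ScheduleErrorMessageLevel; the enumerators here are assumed.
enum class ScheduleErrorMessageLevel : std::int32_t {
  kGeneral = 0,   // short summary only
  kDetailed = 1,  // summary plus extra context
};

// Base class: each schedule primitive supplies its own messages.
// Method names are assumptions, not the PR's actual interface.
class IRScheduleErrorHandler {
 public:
  virtual ~IRScheduleErrorHandler() = default;
  virtual std::string PrimitiveName() const = 0;
  virtual std::string GeneralErrorMessage() const = 0;
  virtual std::string DetailedErrorMessage() const = 0;

  // Assemble the final text according to the requested level.
  std::string FormatErrorMessage(ScheduleErrorMessageLevel level) const {
    std::ostringstream os;
    os << "[IRScheduleError] An error occurred in the schedule primitive <"
       << PrimitiveName() << ">.\n"
       << "[Error info] " << GeneralErrorMessage();
    if (level == ScheduleErrorMessageLevel::kDetailed) {
      os << "\n[Detail info] " << DetailedErrorMessage();
    }
    return os.str();
  }
};

// Hypothetical subclass for the factor check in Split.
class SplitFactorErrorHandler : public IRScheduleErrorHandler {
 public:
  explicit SplitFactorErrorHandler(std::string loop_repr)
      : loop_repr_(std::move(loop_repr)) {}

  std::string PrimitiveName() const override { return "split"; }
  std::string GeneralErrorMessage() const override {
    return "The params in factors of Split should not be less than -1 "
           "or have more than one -1!";
  }
  std::string DetailedErrorMessage() const override {
    return "Offending loop: " + loop_repr_;
  }

 private:
  std::string loop_repr_;  // textual stand-in for the IR being scheduled
};
```

In this sketch the call site would construct the handler for the failing primitive and pass the level read from the runtime flag to `FormatErrorMessage`.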

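Effect:

  • Previously, a schedule error caused a core dump directly; this can be reproduced by running the schedule unit tests.
  • With the new exception-handling mechanism, an exception is thrown without a core dump, and both the error message level and the error message itself can be controlled.
  • The error message looks like this: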
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 cinn::ir::IRSchedule::Split(cinn::ir::Expr const&, std::vector<int, std::allocator<int> > const&)
1 cinn::ir::IRSchedule::Split(cinn::ir::Expr const&, std::vector<cinn::ir::Expr, std::allocator<cinn::ir::Expr> > const&)
2 cinn::ir::enforce::EnforceNotMet::EnforceNotMet(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, int)

----------------------
Error Message Summary:
----------------------
[IRScheduleError] An error occurred in the scheduel primitive <split>. 
[Error info] The params in factors of Split should not be less than -1 or have more than one -1!
(at /root/paddlejob/workspace/work/zhangzheng/Paddle/paddle/cinn/ir/ir_schedule.cc : 179)
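The check behind the message above can be sketched as follows. This is an assumption-laden illustration rather than the code in `ir_schedule.cc`: the helper name `ValidateSplitFactors` is hypothetical, it throws `std::invalid_argument` instead of the PR's `enforce::EnforceNotMet`, and it implements only the two conditions that the message states literally (a factor below -1, or more than one -1).

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns normally when the factors pass the two
// conditions named in the error message, and throws otherwise.
void ValidateSplitFactors(const std::vector<int>& factors) {
  const std::string msg =
      "The params in factors of Split should not be less than -1 "
      "or have more than one -1!";
  int num_inferred = 0;  // how many factors are the placeholder -1
  for (int f : factors) {
    if (f < -1) {
      throw std::invalid_argument(msg);
    }
    if (f == -1) {
      ++num_inferred;
    }
  }
  if (num_inferred > 1) {
    throw std::invalid_argument(msg);
  }
}
```

Under these assumptions, `{4, -1, 8}` is accepted, while `{-1, -1}` and `{-2, 4}` are rejected with an exception instead of a crash.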

@paddle-bot

paddle-bot bot commented Jun 29, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

zhangbo9674
zhangbo9674 previously approved these changes Jul 7, 2023
Contributor

@zhangbo9674 zhangbo9674 left a comment

LGTM for std::cout

paddle/cinn/ir/ir_schedule_error.h (outdated)
#define CINN_THROW(...) \
do { \
try { \
throw enforce::EnforceNotMet(__VA_ARGS__, __FILE__, __LINE__); \

Member

Is this the same as Paddle's enforce? As I remember, we discussed offline that we would use PADDLE_THROW when its definition is available and otherwise define our own, so that it works for both CINN-only and Paddle-CINN builds.

Does this implementation also handle both CINN-only and Paddle-CINN? Do we reuse Paddle code?

Contributor Author

@ZzSean ZzSean Jul 10, 2023

PADDLE_THROW is more complicated and has many redundant functions. To reuse the Paddle code we would have to include more Paddle header files, and then we could not build CINN-only.
Using a CINN_THROW that we define ourselves may be a good solution for now, because PADDLE_IR has IR_THROW to deal with a similar problem.

Member

OK, not best but fine to me now.
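A self-contained throw macro of the kind discussed in this thread could be sketched like this. It is only an illustration under stated assumptions: the `sketch` namespace and the macro name `SKETCH_CINN_THROW` are hypothetical, the exception class is a stand-in whose constructor arguments mirror the `EnforceNotMet(std::string const&, char const*, int)` signature visible in the traceback, and the inner `try` from the truncated diff excerpt is not reproduced because its `catch` clause is not shown.

```cpp
#include <exception>
#include <sstream>
#include <string>

namespace sketch {
namespace enforce {

// Stand-in for an EnforceNotMet-style exception that records where the
// error was raised. It depends on no framework headers, which is the
// property that keeps a CINN-only build possible.
class EnforceNotMet : public std::exception {
 public:
  EnforceNotMet(const std::string& msg, const char* file, int line) {
    std::ostringstream os;
    os << msg << "\n(at " << file << " : " << line << ")";
    what_ = os.str();
  }
  const char* what() const noexcept override { return what_.c_str(); }

 private:
  std::string what_;
};

}  // namespace enforce
}  // namespace sketch

// Hypothetical macro: appends the call site's file and line to the message.
#define SKETCH_CINN_THROW(...)                                            \
  do {                                                                    \
    throw ::sketch::enforce::EnforceNotMet(__VA_ARGS__, __FILE__, __LINE__); \
  } while (0)
```

A caller would write `SKETCH_CINN_THROW("message");` and catch `sketch::enforce::EnforceNotMet` (or `std::exception`) further up the stack.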

/**
* \brief Indicates the level of printing error message in the current Schedule
*/
enum class ScheduleErrorMessageLevel : int32_t {

Member

Can the ErrorMessage be used in other places? If so, should we remove Schedule from the name? Then we can use the code you implemented all over CINN!

Contributor Author

Technically there is no problem. But its namespace is cinn::ir now, and I don't know whether it is convenient or necessary to use it in modules at the same level as ir, such as hlir, optim, etc.

Contributor Author

What about changing the name and the location of this file when we have the need later?

Member

If so, can we change the namespace and the file location in this PR? We would like to reuse it in other places.

paddle/cinn/ir/ir_schedule_error.h (outdated)
@@ -160,6 +160,11 @@ DEFINE_int32(cinn_profiler_state,
"Specify the ProfilerState by Int in CINN, 0 for kDisabled, 1 for "
"kCPU, 2 for kCUDA, 3 for kAll, default 0.");

DEFINE_int32(cinn_schedule_error_message_level,

Member

Same as my other comment: can the ErrorMessage be used in other places? If so, should we remove the "schedule" text from the flag? Then we can use the code you implemented all over CINN!


class InferFactorErrorHandler : public IRScheduleErrorHandler {
public:
explicit InferFactorErrorHandler(const ModuleExpr& module_expr)

Member

Will it be complex if we have to implement subclasses of IRScheduleErrorHandler wherever we want to report an error?

I thought we would have an ErrorHandler class that can set a general error message and a detailed message, so that we just pass in strings where we want to report an error.

Contributor Author

@ZzSean ZzSean Jul 10, 2023

Two reasons:

  • Different kinds of schedule primitives require different error messages, so an abstract class with subclasses lets each primitive handle its own special needs.
  • Implementing the error message inside each subclass makes it easier to use macros to simplify the code.
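For contrast, the string-based alternative raised in the review comment could be sketched as a single concrete class. Everything here is hypothetical (the class name `ErrorHandler` is taken from the reviewer's wording, but the constructor, the `Format` method, and the `[Detail info]` label are assumptions), and the sketch does not claim to match what was eventually merged.

```cpp
#include <string>
#include <utility>

// Hypothetical single handler: the call site passes the primitive name
// and both messages as strings, so no subclass per primitive is needed.
class ErrorHandler {
 public:
  ErrorHandler(std::string primitive, std::string general, std::string detailed)
      : primitive_(std::move(primitive)),
        general_(std::move(general)),
        detailed_(std::move(detailed)) {}

  // Build the message; the detailed part is appended only on request.
  std::string Format(bool with_detail) const {
    std::string msg =
        "[IRScheduleError] An error occurred in the schedule primitive <" +
        primitive_ + ">.\n[Error info] " + general_;
    if (with_detail) {
      msg += "\n[Detail info] " + detailed_;
    }
    return msg;
  }

 private:
  std::string primitive_;
  std::string general_;
  std::string detailed_;
};
```

The trade-off in this sketch is that the call site stays short, while any primitive-specific formatting logic has to be written inline where the error is reported rather than inside a dedicated subclass.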

Contributor

@zhangbo9674 zhangbo9674 left a comment

LGTM for cout

@zhhsplendid
Member

Discussed offline: we approve this PR now, and the unresolved comments above will be addressed in future PRs.

Member

@zhhsplendid zhhsplendid left a comment

LGTM

@ZzSean ZzSean merged commit 5f05b22 into PaddlePaddle:develop Jul 13, 2023
cqulilujia pushed a commit to cqulilujia/Paddle that referenced this pull request Jul 24, 2023
* [CINN] Schedule error message optimization

* format code style

* add test

* fix format

* using CINN_THROW and using flags

* optimize error msg

* do not use abstract class of error handler

* fix header
wz1qqx pushed a commit to wz1qqx/Paddle that referenced this pull request Jul 31, 2023
* [CINN] Schedule error message optimization

* format code style

* add test

* fix format

* using CINN_THROW and using flags

* optimize error msg

* do not use abstract class of error handler

* fix header