-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CINN] Add ElinimateCommonFactorOfLocalIndex
pass in OptimizeExprGPU
#62207
[CINN] Add ElinimateCommonFactorOfLocalIndex
pass in OptimizeExprGPU
#62207
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
} | ||
} | ||
|
||
int ExtractNumberFromExpr(const ir::Expr& expr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems some size, index can be int64, is int
enough for cases here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function only handles the factor of ir::Expr, and the ir::Expr is the index of the local variable, so we will rarely encounter int64 cases.
void operator()(ir::Expr* expr) { ir::IRMutator<>::Visit(expr, expr); } | ||
|
||
const std::unordered_map<std::string, std::vector<std::vector<ir::Expr>>>& | ||
local_var_to_indexes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't change the member, should we also mark a suffix const
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
"should at least load and store once."; | ||
for (std::size_t i = 1; i < indexes.size(); ++i) { | ||
CHECK_EQ(indexes[0].size(), indexes[i].size()) | ||
<< "We should guarantee all index vector have the same size."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grammar
vector -> vectors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks~
namespace cinn::utils { | ||
|
||
static const std::unordered_set<std::string> | ||
kProhibitScheduleExternalFuncNames = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
变量不要定义在头文件里
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thank~
|
||
const std::unordered_set<std::string>& GetProhibitScheduleExternalFuncNames() { | ||
static const std::unordered_set<std::string> | ||
kProhibitScheduleExternalFuncNames = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
names
只有常量才弄kXXX命名
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks
…teCommonFactorOfLocalIndex
PR types
New features
PR changes
Others
Description
pcard-76996
问题背景
对 for 循环做 Reorder 操作时,某些局部变量的读取顺序并非连续的,根据下标索引计算局部变量的空间大小时,部分下标没有充分化简,导致分析出的空间变大,本 PR 通过提供
ElinimateCommonFactorOfLocalIndex
解决此问题LocalAxisVisitor 之后得到的表达式:
由于
(32 * k)
未充分化简,生成的 kernel 代码会变成:解决思路
从 Tensor 的视角出发:
最终效果