[CINN] Add ElinimateCommonFactorOfLocalIndex pass in OptimizeExprGPU #62207

Merged

@jiahy0825 (Contributor) commented Feb 28, 2024

PR types

New features

PR changes

Others

Description

pcard-76996

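Problem background

When a Reorder is applied to a for loop, some local variables are no longer read in contiguous order. When the buffer size of a local variable is computed from its index expressions, some indices are not fully simplified, so the analyzed buffer size comes out larger than necessary. This PR addresses the problem by adding ElinimateCommonFactorOfLocalIndex.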

Expression obtained after LocalAxisVisitor:
[Image: expression after LocalAxisVisitor]

Because (32 * k) is not fully simplified, the generated kernel code becomes:
[Image: generated kernel code]

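Approach

From the Tensor's point of view:

  1. Collect the index values that the same local variable uses in its Load and Store operations
  2. Extract the greatest common divisor of these indices
  3. Divide each index by the greatest common divisor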
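
The three steps can be sketched on a simplified model. This is only an illustration, not the code in this PR: it assumes each index is reduced to a single constant integer factor (for example, `32 * k` has factor 32), whereas the real pass works on `ir::Expr`. The names `IndexFactors`, `CommonFactors`, and `EliminateCommonFactors` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical simplified model: one constant factor per index dimension.
// The real pass operates on ir::Expr, which is not modeled here.
using IndexFactors = std::vector<std::int64_t>;

// Step 2: per-dimension GCD over all Load/Store accesses of one local variable.
// Step 1 (collecting the accesses) is assumed to have produced `accesses`.
IndexFactors CommonFactors(const std::vector<IndexFactors>& accesses) {
  assert(!accesses.empty());  // should load or store at least once
  IndexFactors gcds = accesses[0];
  for (std::size_t i = 1; i < accesses.size(); ++i) {
    assert(accesses[i].size() == gcds.size());  // same rank for every access
    for (std::size_t d = 0; d < gcds.size(); ++d) {
      gcds[d] = std::gcd(gcds[d], accesses[i][d]);
    }
  }
  return gcds;
}

// Step 3: divide every index factor by the common factor of its dimension.
std::vector<IndexFactors> EliminateCommonFactors(
    std::vector<IndexFactors> accesses) {
  const IndexFactors gcds = CommonFactors(accesses);
  for (auto& access : accesses) {
    for (std::size_t d = 0; d < gcds.size(); ++d) {
      if (gcds[d] > 1) access[d] /= gcds[d];  // skip 0 and 1: nothing to remove
    }
  }
  return accesses;
}
```

In this model, two accesses with factors {32, 1} and {64, 1} share the common factors {32, 1}, so the first dimension shrinks to 1 and 2 while the second is left alone.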

Final result

[Images: two screenshots of the result]

paddle-bot bot commented Feb 28, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

tc20042008 previously approved these changes Feb 29, 2024
}
}

int ExtractNumberFromExpr(const ir::Expr& expr) {
Member

It seems some sizes and indices can be int64. Is int enough for the cases here?

Contributor Author

@jiahy0825 Feb 29, 2024

This function only handles the factor of ir::Expr, and the ir::Expr is the index of the local variable, so we will rarely encounter int64 cases.

void operator()(ir::Expr* expr) { ir::IRMutator<>::Visit(expr, expr); }

const std::unordered_map<std::string, std::vector<std::vector<ir::Expr>>>&
local_var_to_indexes() {
Member

This function doesn't modify any member. Should we also mark it with a trailing const?

Contributor Author

Done, thanks!
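
A minimal sketch of the const-qualified accessor the reviewer asked for. It is an illustration under stated assumptions, not the merged code: `Expr` is a `std::string` stand-in for `ir::Expr`, and the class name `LocalIndexCollector` and its `Record` method are hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for ir::Expr so the sketch compiles without CINN headers.
using Expr = std::string;
using IndexTable =
    std::unordered_map<std::string, std::vector<std::vector<Expr>>>;

// Hypothetical collector; only the accessor shape mirrors the reviewed code.
class LocalIndexCollector {
 public:
  // Records one Load/Store access (a list of index expressions) for `var`.
  void Record(const std::string& var, std::vector<Expr> indexes) {
    local_var_to_indexes_[var].push_back(std::move(indexes));
  }

  // Trailing const: the accessor promises not to modify the object,
  // so it can be called through a const reference.
  const IndexTable& local_var_to_indexes() const {
    return local_var_to_indexes_;
  }

 private:
  IndexTable local_var_to_indexes_;
};
```

With the trailing `const`, callers that hold only a `const LocalIndexCollector&` can still read the collected table.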

"should at least load and store once.";
for (std::size_t i = 1; i < indexes.size(); ++i) {
CHECK_EQ(indexes[0].size(), indexes[i].size())
<< "We should guarantee all index vector have the same size.";
Member

Grammar
vector -> vectors

Contributor Author

Done, thanks~

namespace cinn::utils {

static const std::unordered_set<std::string>
kProhibitScheduleExternalFuncNames = {
Contributor

Don't define variables in header files.

Contributor Author

Done, thanks~


const std::unordered_set<std::string>& GetProhibitScheduleExternalFuncNames() {
static const std::unordered_set<std::string>
kProhibitScheduleExternalFuncNames = {
Contributor

names
Only constants should use the kXXX naming style.

Contributor Author

@jiahy0825 Mar 4, 2024

Done, thanks
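
Taken together, the two review comments above point toward an accessor with a function-local static, defined in a source file and using a plain local name. The following is a hedged sketch of that shape, not the merged code: the set entries are placeholders because the real list is not shown in this thread, and the local name `names` is my reading of the reviewer's suggestion.

```cpp
#include <string>
#include <unordered_set>

// Intended to live in a .cc file; the header would only declare the function.
const std::unordered_set<std::string>& GetProhibitScheduleExternalFuncNames() {
  // Placeholder entries (hypothetical): the actual names are not in this thread.
  static const std::unordered_set<std::string> names = {
      "hypothetical_extern_func_a", "hypothetical_extern_func_b"};
  return names;  // initialized once, on first call
}
```

Because the set is a function-local static, every call returns a reference to the same object, and no variable is defined at namespace scope in a header.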

tc20042008 previously approved these changes Mar 4, 2024
@jiahy0825 jiahy0825 merged commit b57a28c into PaddlePaddle:develop Mar 5, 2024
30 checks passed