Are the last two sorts and the gather at the end of the Fast_ACV_plus model necessary? #12

Open
surifans opened this issue Jun 16, 2023 · 1 comment

Comments

@surifans

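First of all, thank you for open-sourcing your code!
I noticed that in your Fast_ACV_plus.py, the two sorts and the gather at the end of the model might be removable to save time. The relevant code is: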
att_weights_prob = F.softmax(att_weights, dim=2)
_, ind = att_weights_prob.sort(2, True)
k = 24
ind_k = ind[:, :, :k]
ind_k = ind_k.sort(2, False)[0]
att_topk = torch.gather(att_weights_prob, 2, ind_k)

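The code above first sorts att_weights_prob to get the index tensor ind, then sorts a slice of that index tensor, ind_k, in ascending order, so the values of ind_k along that dimension become 0, 1, 2, 3, 4, 5, ..., and finally uses gather to pick the corresponding values out of att_weights_prob. If k were equal to the size of dimension 2 of att_weights_prob instead of 24, att_topk would simply equal att_weights_prob, right? So if k is set to the size of att_weights along dim=2, are the two sorts and the gather completely unnecessary?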
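The claim above can be checked on a small random tensor. The following is a minimal sketch, not code from the repository: the helper name `topk_in_index_order` and the tensor shape `[B, 1, D, H, W]` with `D = 8` are assumptions made for illustration. It suggests that the indices become exactly 0, 1, 2, ... only when k equals the full size of dim 2; for a smaller k they are an ascending subset of the indices.

```python
import torch
import torch.nn.functional as F


def topk_in_index_order(att_weights_prob, k):
    """Same three steps as the quoted snippet, wrapped in a hypothetical helper."""
    # Indices along dim 2, ordered by descending probability.
    _, ind = att_weights_prob.sort(2, True)
    # Keep the k most probable indices, then put them back in ascending order.
    ind_k = ind[:, :, :k]
    ind_k = ind_k.sort(2, False)[0]
    # Pick the probabilities at those indices.
    return torch.gather(att_weights_prob, 2, ind_k), ind_k


torch.manual_seed(0)
att_weights = torch.randn(1, 1, 8, 4, 5)  # assumed layout [B, 1, D, H, W], D = 8
prob = F.softmax(att_weights, dim=2)

# k equal to the full size of dim 2: the result is the input itself.
full, ind_full = topk_in_index_order(prob, k=8)
print(torch.equal(full, prob))  # True

# k smaller than D: only k entries per pixel survive, in ascending index order.
part, ind_part = topk_in_index_order(prob, k=3)
print(part.shape)  # torch.Size([1, 1, 3, 4, 5])
```

With k equal to the full size, the sorted indices are a permutation of 0..D-1 at every pixel, so sorting them in ascending order yields the identity and the gather is a no-op.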

@gangweiX
Owner

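If you set k to the size of att_weights along dim=2 (that is, 48), then the two sorts and the gather are indeed unnecessary. The sort and gather here are mainly there to build a top-k attention volume, i.e. a sparse cost volume, which reduces the computational cost of the subsequent cost aggregation.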
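To make the size reduction concrete, here is a rough sketch of one possible way to build such a top-k volume. It is not the repository's implementation: it swaps the full descending sort for `torch.topk`, the function name `sparse_attention_topk` is hypothetical, and the input shape `[B, 1, 48, H, W]` is an assumption based on the value 48 mentioned above. Whether it is actually faster has not been measured here.

```python
import torch
import torch.nn.functional as F


def sparse_attention_topk(att_weights, k):
    """Keep the k most probable entries along dim 2, in ascending index order.

    Hypothetical helper for illustration; assumes layout [B, 1, D, H, W].
    """
    prob = F.softmax(att_weights, dim=2)
    # torch.topk selects the k largest entries without sorting all D of them.
    _, ind_k = torch.topk(prob, k, dim=2)
    # Restore ascending index order so neighbouring slots stay ordered.
    ind_k = ind_k.sort(dim=2).values
    att_topk = torch.gather(prob, 2, ind_k)
    return att_topk, ind_k


torch.manual_seed(0)
att_weights = torch.randn(2, 1, 48, 6, 8)  # assumed shape, D = 48
att_topk, ind_k = sparse_attention_topk(att_weights, k=24)
print(att_topk.shape)  # torch.Size([2, 1, 24, 6, 8])
```

With k = 24 and D = 48, the later stages see half as many candidates per pixel. The indices in `ind_k` record which of the original 48 slots each kept entry came from.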
