Are the final two sorts and one gather in the Fast_ACV_plus model necessary? #12

surifans · 2023-06-16T02:18:15Z

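First of all, thank you for open-sourcing the code!
I noticed that Fast_ACV_plus.py ends the model with two sorts and one gather. Could they be dropped to save time? The code in question is: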
att_weights_prob = F.softmax(att_weights, dim=2)
_, ind = att_weights_prob.sort(2, True)
k = 24
ind_k = ind[:, :, :k]
ind_k = ind_k.sort(2, False)[0]
att_topk = torch.gather(att_weights_prob, 2, ind_k)

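The code above first sorts att_weights_prob to obtain the index tensor ind, then sorts a slice of it, ind_k, in ascending order, and finally uses gather to pick the corresponding values out of att_weights_prob. If k were equal to the size of dimension 2 of att_weights_prob instead of 24, the values of ind_k along that dimension would become 0, 1, 2, 3, 4, 5, ..., so att_topk would simply equal att_weights_prob, right? In that case, with k set to the dim=2 size of att_weights, are the two sorts and the gather completely unnecessary?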
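The claim in this question can be checked on a small random tensor. The following is a minimal sketch, not code from the repository: the tensor layout (B, 1, D, H, W) with D = 48 and the helper name `topk_in_disparity_order` are assumptions made for illustration; only the sort/gather sequence itself is taken from the snippet quoted above.

```python
import torch
import torch.nn.functional as F


def topk_in_disparity_order(att_weights, k):
    """Hypothetical wrapper around the sort/gather sequence quoted above."""
    att_weights_prob = F.softmax(att_weights, dim=2)
    _, ind = att_weights_prob.sort(2, True)   # indices by descending probability
    ind_k = ind[:, :, :k]                     # indices of the k largest entries
    ind_k = ind_k.sort(2, False)[0]           # back to ascending index order
    att_topk = torch.gather(att_weights_prob, 2, ind_k)
    return att_weights_prob, ind_k, att_topk


torch.manual_seed(0)
att_weights = torch.randn(1, 1, 48, 4, 5)  # assumed layout: (B, 1, D, H, W)

# k == D: every index is kept, ind_k is 0..D-1 at each pixel,
# and gather reproduces its input.
prob, ind_full, att_full = topk_in_disparity_order(att_weights, k=48)
full_is_identity = torch.equal(att_full, prob)

# k < D: only the 24 most probable entries per pixel survive.
_, ind_24, att_24 = topk_in_disparity_order(att_weights, k=24)
print(full_is_identity, att_24.shape)
```

With k equal to the full size of dimension 2, the sorted indices are a permutation of 0..D-1, so re-sorting them in ascending order makes the gather an identity operation. With k = 24 the output keeps half of the entries along dimension 2.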

gangweiX · 2023-06-16T15:04:29Z

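If you set k to the size of dim=2 of att_weights (that is, 48), then the two sorts and the gather are indeed unnecessary. Their main purpose here is to build a top-k attention volume, i.e. a sparse cost volume, which reduces the computational cost of the subsequent cost aggregation.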
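To make the top-k idea concrete, here is a hedged sketch of how the sparse volume shrinks with k = 24 out of D = 48. The tensor shape is an assumption, and the `torch.topk` variant is an alternative shown only for comparison, not the repository's code. Whether it runs faster than the full sort is not measured here and would depend on hardware and tensor size.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
D, k = 48, 24
att_weights = torch.randn(2, 1, D, 8, 16)  # assumed layout: (B, 1, D, H, W)
att_weights_prob = F.softmax(att_weights, dim=2)

# Sequence from the issue: full descending sort, slice, re-sort indices, gather.
_, ind = att_weights_prob.sort(2, True)
ind_k = ind[:, :, :k].sort(2, False)[0]
att_topk = torch.gather(att_weights_prob, 2, ind_k)

# Alternative (an assumption, not the repository's code): torch.topk
# expresses the selection of the k largest entries directly.
_, ind_k_alt = torch.topk(att_weights_prob, k, dim=2)
ind_k_alt = ind_k_alt.sort(2, False)[0]
att_topk_alt = torch.gather(att_weights_prob, 2, ind_k_alt)

same_result = torch.equal(att_topk, att_topk_alt)
kept_fraction = att_topk.numel() / att_weights_prob.numel()  # equals k / D
print(same_result, kept_fraction)
```

The index tensor `ind_k` records which entries along dimension 2 were kept at each pixel, so later stages can operate on k entries per pixel instead of D.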
