[mlir][linalg] unfold projected permutation. #114704

Merged · 7 commits · Nov 9, 2024
Changes from 1 commit
@@ -90,6 +90,9 @@ struct DecomposeProjectedPermutation : public OpRewritePattern<GenericOp> {
std::pair<SmallVector<int64_t>, SmallVector<int64_t>>
computeTransposeBroadcast(AffineMap &map) {
assert(map.isProjectedPermutation(false) && "not a projection");

+// As the map is a projection it likely operates on a smaller set of
+// dimensions as far as the transpose is concerned (rest are broadcast).
int64_t minorSize = map.getNumResults();

SmallVector<int64_t> minorResult;
Contributor:

What does minor mean in this context? Apologies if this is obvious.

Contributor Author:

added comments

@@ -116,13 +119,13 @@ computeTransposeBroadcast(AffineMap &map) {

SmallVector<int64_t> permutation;
if (hasTranspose) {
-/// Consider an operand `x : tensor<7x8x9>` of a genericOp that has
-/// affine map `affine_map<(d0, d1, d2, d3, d4) -> (d2, d3, d1)>`
-/// `x`s access is both transposed and brodcast. But when specifying
-/// the `linalg.transpose(x : tensor<7x8x9>)` the dimensions need to be
-/// specified as `affine_map<(d0,d1,d2) -> (d1, d2, d0)` instead of
-/// refering to d3, d4. Therefore, re-base the transpose dimensions so
-/// that they start from d0.
+// Consider an operand `x : tensor<7x8x9>` of a genericOp that has
+// affine map `affine_map<(d0, d1, d2, d3, d4) -> (d2, d3, d1)>`:
+// `x`'s access is both transposed and broadcast. But when specifying
+// the `linalg.transpose(x : tensor<7x8x9>)` the dimensions need to be
+// specified as `affine_map<(d0, d1, d2) -> (d1, d2, d0)>` instead of
+// referring to d3, d4. Therefore, re-base the transpose dimensions so
+// that they start from d0.
permutation.resize(minorSize);
std::map<int64_t, int64_t> minorMap;
for (int64_t i = 0; i < minorSize; ++i)
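To make the re-basing concrete, here is a hand-worked sketch (my own illustration, not output of this pass) of what the decomposition would produce for the operand `x` from the comment above, assuming the 5-d iteration space has shape 5x9x7x8x10, i.e. d1=9, d2=7, d3=8:

  // Hypothetical decomposition of x's access map (d2, d3, d1); values hand-derived.
  %e0 = tensor.empty() : tensor<9x7x8xf32>
  %t = linalg.transpose ins(%x : tensor<7x8x9xf32>)
         outs(%e0 : tensor<9x7x8xf32>) permutation = [2, 0, 1]
  %e1 = tensor.empty() : tensor<5x9x7x8x10xf32>
  %b = linalg.broadcast ins(%t : tensor<9x7x8xf32>)
         outs(%e1 : tensor<5x9x7x8x10xf32>) dimensions = [0, 4]

The transpose reorders `x` into the sorted order of its minor dimensions (d1, d2, d3); the broadcast then materializes the remaining iteration dimensions d0 and d4.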
@@ -147,14 +150,19 @@ LogicalResult DecomposeProjectedPermutation::matchAndRewrite(
op.isSingleYieldOp() || !op.isAllParallelLoops())
return failure();

-// All maps need to be projected permutations.
+// If the map of an operand is not a `projected permutation` then
+// it cannot be decomposed into a mere transpose and broadcast.
+// The requirement that all maps be `projected permutation` may be
+// over-restrictive, but since we need to determine the shape of the
+// iteration space as well, reject if any map violates the assumption.
for (auto &opOperand : op->getOpOperands()) {
auto map = op.getMatchingIndexingMap(&opOperand);
if (!map.isProjectedPermutation(false))
return failure();
}

-// Currently we handle only static shapes.
+// Decomposing linalg.generic involves creating `tensor.empty`
+// which cannot have dynamic shapes.
Contributor:
tensor.empty with dynamic shapes is allowed; see e.g.:

%0 = tensor.empty(%sz) : tensor<5x?x6xf32, "foo">

What's a bit "tricky" is computing the required sizes. You could leave that as a TODO.

Contributor Author (@javedabsar1, Nov 9, 2024):
The tensor.empty needs to be passed a runtime value for '?'. For our case I don't see how to derive it easily from the linalg.generic. I guess it could be done by getting the tensor.dim of some operand (not all)... quite a rabbit-hole.
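For illustration only (this follows the reviewer's suggested direction and is not part of the patch): a dynamic extent could in principle be recovered with tensor.dim from an operand whose indexing map covers that loop dimension, then passed to tensor.empty. A minimal sketch, with a hypothetical operand %y whose d1 extent is dynamic:

  // Hypothetical: derive the runtime extent of the dynamic dim from an operand.
  %c1 = arith.constant 1 : index
  %d1 = tensor.dim %y, %c1 : tensor<5x?x7x8x10xf32>
  %init = tensor.empty(%d1) : tensor<5x?x7x8x10xf32>

As the author notes, the hard part is choosing, per dynamic dimension, an operand that actually carries it.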


for (auto &operand : op->getOpOperands()) {
auto opType = cast<RankedTensorType>(operand.get().getType());
for (auto size : opType.getShape())
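For reference, a few sketched maps (my own illustrations, not from the patch) showing what the isProjectedPermutation(false) check earlier in this function accepts and rejects:

  // Accepted: every result is a distinct input dimension; dims may be dropped.
  affine_map<(d0, d1, d2, d3) -> (d2, d0)>
  // Rejected: a result that is not a plain dimension expression.
  affine_map<(d0, d1) -> (d0 + d1)>
  // Rejected: a dimension repeated in the results.
  affine_map<(d0, d1) -> (d0, d0)>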
@@ -5,7 +5,7 @@

func.func @transpose_and_broadcast(%x : tensor<7x8x9xf32>, %y: tensor<5x9x7x8x10xf32>, %z : tensor<5x9x7x8x10xf32>) -> tensor<5x9x7x8x10xf32> {
%res = linalg.generic
{ indexing_maps = [#projection, #identity, #identity], iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel"]}
ins(%x, %y : tensor<7x8x9xf32>, tensor<5x9x7x8x10xf32>) outs(%z : tensor<5x9x7x8x10xf32>) {
^bb0(%in: f32, %in_1: f32, %out: f32):
%div = arith.divf %in, %in_1 : f32