Got compilation working on Windows. #4420

Closed
wants to merge 3 commits into from
@@ -3,6 +3,7 @@

#include <memory>
#include <optional>
#include <string>

namespace mlir {

27 changes: 24 additions & 3 deletions lib/Analysis/Utility.cpp
@@ -21,8 +21,29 @@
namespace mlir {
namespace {

-using namespace triton;
-using namespace triton::gpu;
+// We only "import" the symbols that we need to avoid name conflicts.
+using triton::AdvanceOp;
+using triton::LinearLayout;
+using triton::MakeTensorPtrOp;
+using triton::gpu::getCTALayout;
+using triton::gpu::getCTAsPerCGA;
+using triton::gpu::getCTASplitNum;
+using triton::gpu::MmaEncodingTrait;
+using triton::gpu::getNumCTAs;
+using triton::gpu::getOrder;
+using triton::gpu::getShapePerCTA;
+using triton::gpu::getThreadsPerWarp;
+using triton::gpu::getThreadsPerWarpWithUniqueData;
+using triton::gpu::getUniqueContigPerThread;
+using triton::gpu::getWarpsPerCTA;
+using triton::gpu::getWarpsPerCTAWithUniqueData;
+using triton::gpu::toLinearLayout;
+using triton::gpu::AMDMfmaEncodingAttr;
+using triton::gpu::BlockedEncodingAttr;
+using triton::gpu::DotOperandEncodingAttr;
+using triton::gpu::NvidiaMmaEncodingAttr;
+using triton::gpu::SliceEncodingAttr;
+using triton::gpu::TritonGPUDialect;

int getParentAxis(Attribute layout, int axis) {
if (auto sliceEncoding = dyn_cast<SliceEncodingAttr>(layout)) {
@@ -514,7 +535,7 @@ bool supportMMA(triton::DotOp op, int version) {
}
}
if (aElemTy.isF32() && bElemTy.isF32()) {
-return op.getInputPrecision() == InputPrecision::TF32 && version >= 2;
+return op.getInputPrecision() == triton::InputPrecision::TF32 && version >= 2;
}
return supportMMA(op.getA(), version) && supportMMA(op.getB(), version);
}
14 changes: 10 additions & 4 deletions lib/Dialect/TritonGPU/Transforms/Utility.cpp
@@ -21,7 +21,13 @@

namespace mlir {

-using namespace triton;
+// We only "import" the symbols that we need to avoid name conflicts.
+using triton::AxisInfo;
+using triton::DialectInferLayoutInterface;
+using triton::JoinOp;
+using triton::ModuleAxisInfoAnalysis;
+using triton::PointerType;
+using triton::SplitOp;
Collaborator

What do those names conflict with? I would assume that `using namespace triton;` would be enough.

Contributor Author

@eaplatanios Jul 30, 2024

These names do not conflict. It is other symbols in the triton namespace that conflict. For example, iirc both mlir::detail and mlir::triton::detail exist, and conflicts like that were causing the build to fail.

Collaborator

What would it look like to make those explicit instead? (I don't see detail being used)

Contributor Author

Hmm, I'm not sure I understand, but I'm also not very familiar with C++, so maybe I'm missing something. My understanding of the issue I ran into is the following. The code previously looked something like this:

namespace mlir {
  // At this point the `mlir` namespace already contains a symbol named `detail` coming from the MLIR codebase via the headers.

  // The following line brings all symbols from within `mlir::triton` in the current scope. That namespace also contains
  // a symbol named `detail`. That symbol now conflicts with `mlir::detail` that is already in scope. Further down in the
  // code when the compiler tries to prove a type expression in a SFINAE bound, it fails because of this ambiguity.
  using namespace triton;
}

In this PR, instead of importing all symbols from the mlir::triton namespace, I switched to only importing the ones that are used in this file, thus avoiding the name conflict/ambiguity. I'm actually not sure why this does not raise an error with Clang, but then again I've noticed how C++ compilers tend to not be super consistent with the standards sometimes and I'm not actually sure what the correct behavior would be here. This change makes it work for both compilers.
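
To make the distinction between a using-directive and a using-declaration concrete, here is a minimal sketch. All names in it (`outer`, `inner`, `helper`, `Widget`) are hypothetical stand-ins for `mlir`, `mlir::triton`, and their contents, not code from this PR. Note that in this reduced form the ambiguity is unconditional and should be rejected by any conforming compiler once the commented-out directive is enabled; it does not reproduce the MSVC-only failure described above, which involved template code.

```cpp
namespace outer {
namespace detail {
inline int helper() { return 1; }
} // namespace detail

namespace inner {
namespace detail {
inline int helper() { return 2; }
} // namespace detail
struct Widget {
  int id = 7;
};
} // namespace inner

// A using-directive makes every name in `inner` visible as if it were
// declared in `outer`. With it enabled, the unqualified name `detail` in
// callHelper() below would refer to both outer::detail and
// outer::inner::detail, and the lookup would be ambiguous.
// using namespace inner;

// A using-declaration imports exactly one name, so `detail` stays unambiguous.
using inner::Widget;

inline int callHelper() { return detail::helper(); } // outer::detail::helper
inline int widgetId() { return Widget{}.id; }        // outer::inner::Widget
} // namespace outer
```

The alternative discussed later in the thread, dropping the `using` lines and writing `inner::Widget` at each use site, avoids the ambiguity in the same way.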

Collaborator

What I'm asking is: can we make the conflicting names explicit instead? There would be a conflict only when trying to use a symbol that exists in both namespaces. I'm not sure I understand where that happens and why we cannot just make those names explicit.

Collaborator

Is there any way to repro without MSVC? I'm willing to take a look but don't have MSVC locally :')

Collaborator

I am taking a look and noticing that we already use fully qualified names in some places despite the `using namespace`, and a lot of these symbols have a single use, so `using triton::symbol` is definitely too bulky. Would you mind modifying this PR to remove the `using triton::` lines and use fully qualified names where appropriate?

Contributor Author

Oh yeah definitely. I was thinking of doing that originally but wasn't sure what you'd prefer.

Contributor Author

Regarding reproducing without MSVC, I looked a bit into whether there's any clang flag or something we could flip to reproduce the stricter behavior but I couldn't find anything. :/

Contributor Author

@ptillet I replaced all references with fully qualified names and removed the using statements.


SmallVector<unsigned, 3> mmaVersionToInstrShape(int version,
const ArrayRef<int64_t> &shape,
@@ -443,7 +449,7 @@ std::optional<Attribute> inferSrcEncoding(Operation *op, Attribute encoding) {
op->hasTrait<mlir::OpTrait::SameLoadStoreOperandsAndResultEncoding>() ||
op->hasTrait<mlir::OpTrait::Elementwise>() ||
isa<scf::WhileOp, scf::YieldOp, scf::ConditionOp,
-nvidia_gpu::WarpGroupDotWaitOp>(op)) {
+triton::nvidia_gpu::WarpGroupDotWaitOp>(op)) {
return encoding;
}

@@ -472,7 +478,7 @@ std::optional<Attribute> inferDstEncoding(Operation *op, Attribute encoding) {
op->hasTrait<mlir::OpTrait::SameLoadStoreOperandsAndResultEncoding>() ||
op->hasTrait<mlir::OpTrait::Elementwise>() ||
isa<scf::WhileOp, scf::ForOp, scf::YieldOp, scf::ConditionOp,
-nvidia_gpu::WarpGroupDotWaitOp>(op))
+triton::nvidia_gpu::WarpGroupDotWaitOp>(op))
return encoding;
if (auto reduceOp = dyn_cast<triton::ReduceOp>(op))
return inferDstEncoding(reduceOp, encoding);
@@ -824,7 +830,7 @@ Value linearize(OpBuilder &b, Location loc, ArrayRef<Value> multiDim,
}

bool isPureUnaryInlineAsm(Operation *op) {
-auto inlineAsmOp = dyn_cast<ElementwiseInlineAsmOp>(op);
+auto inlineAsmOp = dyn_cast<triton::ElementwiseInlineAsmOp>(op);
if (!inlineAsmOp)
return false;
return op->getNumOperands() == 1 && op->getNumResults() == 1 &&
5 changes: 5 additions & 0 deletions lib/Tools/LinearLayout.cpp
@@ -429,7 +429,12 @@ int32_t LinearLayout::getNumConsecutiveInOut() const {
}
}
}

#if defined(_MSC_VER)
int32_t trailingZeros = otherBits != 0 ? _tzcnt_u32(otherBits) : 31;
#else
int32_t trailingZeros = otherBits != 0 ? __builtin_ctz(otherBits) : 31;
#endif

return 1 << std::min(consec, trailingZeros);
}
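
As an aside on the compiler-specific branch above, the trailing-zero count could also be isolated in a small helper so the `#if` does not sit inside the function body. The sketch below is one possible shape, not part of this PR: the name `countTrailingZeros32` is hypothetical, and the MSVC branch uses `_BitScanForward` from `<intrin.h>` rather than `_tzcnt_u32`, which is an assumption about what would be acceptable here. It keeps the same fallback of 31 for a zero input as the diff.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper (not from the PR): number of trailing zero bits in
// `value`, returning 31 for zero to mirror the fallback used in the diff.
inline int32_t countTrailingZeros32(uint32_t value) {
  if (value == 0)
    return 31;
#if defined(_MSC_VER)
  // _BitScanForward writes the index of the lowest set bit into `index`.
  unsigned long index = 0;
  _BitScanForward(&index, value);
  return static_cast<int32_t>(index);
#else
  // GCC and Clang builtin; its result is undefined for 0, which is handled above.
  return __builtin_ctz(value);
#endif
}
```

With such a helper, the call site would reduce to a single line, for example `int32_t trailingZeros = countTrailingZeros32(otherBits);`.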