#12207: Port moreh_dot to ttnn #12249
Conversation
py::arg("input_tensor_b"), | ||
py::kw_only(), | ||
py::arg("output_dtype") = ttnn::bfloat16, | ||
py::arg("output_mem_config") = operation::DEFAULT_OUTPUT_MEMORY_CONFIG}); |
py::arg("output_mem_config") = operation::DEFAULT_OUTPUT_MEMORY_CONFIG}); | |
py::arg("output_memory_config") = std::nullopt; |
DEFAULT_OUTPUT_MEMORY_CONFIG is an old style that refers to DRAM and interleaved memory. In the latest version, the default value is nullopt, meaning that the input and output will use the same memory_config.
Most occurrences of output_memory_config still show up as DEFAULT_OUTPUT_MEMORY_CONFIG, but that's mainly because they haven't been refactored yet.
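For context, a minimal sketch of the nullopt convention described above, using a hypothetical resolve_memory_config helper (Tensor::memory_config() is assumed from the tensor API):

```cpp
#include <optional>

// Hypothetical helper: when the caller passes std::nullopt, reuse the
// input tensor's memory config for the output.
inline MemoryConfig resolve_memory_config(
    const Tensor& input, const std::optional<MemoryConfig>& output_memory_config) {
    return output_memory_config.value_or(input.memory_config());
}
```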
namespace ttnn {
constexpr auto moreh_dot =
    ttnn::register_operation<"ttnn::moreh_dot", ttnn::operations::moreh::moreh_dot::MorehDot>();
Suggested change:
- ttnn::register_operation<"ttnn::moreh_dot", ttnn::operations::moreh::moreh_dot::MorehDot>();
+ ttnn::register_operation_with_auto_launch_op<"ttnn::moreh_dot", ttnn::operations::moreh::moreh_dot::MorehDot>();
There seems to be an issue related to the queue in register_operation at the moment. Please use register_operation_with_auto_launch_op instead.
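Either way, the registered symbol is callable like a free function from C++. A hedged caller-side sketch; the two-tensor argument list is an assumption based on the pybind arguments shown above:

```cpp
// Usage sketch: the registered op is invoked like an ordinary function.
ttnn::Tensor run_moreh_dot(const ttnn::Tensor& a, const ttnn::Tensor& b) {
    return ttnn::moreh_dot(a, b);
}
```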
auto dst_buffer = output_tensor.buffer();
float scaler = 1.0f;

tt::tt_metal::Program program{};
Suggested change:
- tt::tt_metal::Program program{};
+ Program program{};
Since using namespace tt::tt_metal is declared at the top, I think it should be fine to make this change.
namespace tt {
namespace operations {
namespace primary {
Suggested change:
- namespace tt {
- namespace operations {
- namespace primary {
+ namespace ttnn {
+ namespace operations {
The reason the namespace of moreh_helper was primary is that, in the past, all moreh ops used the primary namespace. Since the moreh ops are no longer in the primary namespace, moving them to ttnn::operations would allow us to reduce unnecessary namespace code, as shown below:

(std::uint32_t)tt::operations::primary::is_dram(src0_buffer)
->
(std::uint32_t)(is_dram(src0_buffer))

However, if changing the namespace causes a lot of unexpected additional work, it's not necessary to make this change in this PR.
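A hedged sketch of the proposed move; the is_dram body is an assumption shown for illustration, and only the namespace placement is the point:

```cpp
// After moving moreh_helper into ttnn::operations, call sites in the same
// namespace can drop the tt::operations::primary:: qualification.
namespace ttnn {
namespace operations {

// Assumed helper body, for illustration only.
inline bool is_dram(const Buffer* buffer) {
    return buffer->buffer_type() == BufferType::DRAM;
}

}  // namespace operations
}  // namespace ttnn
```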
const auto& input_tensor_a = tensor_args.input_tensor_a; | ||
const auto& input_tensor_b = tensor_args.input_tensor_b; | ||
|
||
TT_ASSERT(tt::operations::primary::is_1d_tensor(input_tensor_a)); |
Suggested change:
- TT_ASSERT(tt::operations::primary::is_1d_tensor(input_tensor_a));
+ TT_FATAL(tt::operations::primary::is_1d_tensor(input_tensor_a));
I prefer using TT_FATAL because TT_ASSERT only works in debug mode. Actually, since validation is not currently executed in Release mode, it doesn't matter at the moment, but we don't know how things might change in the future.
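A hedged sketch of the validate step with TT_FATAL; validate_inputs and the message form of the macro are assumptions, and tensor_args_t follows the struct referenced in the quoted code above:

```cpp
// Sketch: TT_FATAL fires in both Debug and Release builds, so the checks
// survive even if Release-mode validation is enabled later.
void MorehDotOperation::validate_inputs(const tensor_args_t& tensor_args) {
    const auto& input_tensor_a = tensor_args.input_tensor_a;
    const auto& input_tensor_b = tensor_args.input_tensor_b;
    TT_FATAL(tt::operations::primary::is_1d_tensor(input_tensor_a),
             "moreh_dot: input_tensor_a must be a 1D tensor");
    TT_FATAL(tt::operations::primary::is_1d_tensor(input_tensor_b),
             "moreh_dot: input_tensor_b must be a 1D tensor");
}
```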
output_grad = tt_output_grad = torch_output_grad = tt_input_grad = tt_other_grad = None
if require_input_grad or require_other_grad:
    output_grad = torch.randint(-2, 3, output_shape, dtype=cpu_dtype)
    # tt_output_grad = ttnn.Tensor(output_grad, npu_dtype).pad_to_tile(float("nan")).to(cpu_layout).to(device)
This code seems to have been mistakenly committed during debugging. Please delete it.
const std::vector<Tensor> input_tensors = {input_tensor_a, input_tensor_b};

auto override_runtime_arguments_callback = [reader_kernel_id, writer_kernel_id, compute_kernel_id](
Please move this code to MorehDotOperation::SingleCore::override_runtime_arguments and delete it here. I have created a new PR here.
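A hedged sketch of the requested structure, following the ttnn device-operation program-factory convention; the shared_variables member names are assumptions based on the captured kernel ids above:

```cpp
// Sketch: runtime-arg updates move out of the lambda and into the program
// factory. The signature follows the ttnn device-operation convention.
void MorehDotOperation::SingleCore::override_runtime_arguments(
    cached_program_t& cached_program,
    const operation_attributes_t& operation_attributes,
    const tensor_args_t& tensor_args,
    tensor_return_value_t& output) {
    auto& program = cached_program.program;
    auto& reader_kernel_id = cached_program.shared_variables.reader_kernel_id;
    auto& writer_kernel_id = cached_program.shared_variables.writer_kernel_id;

    const auto* src_buffer_a = tensor_args.input_tensor_a.buffer();
    const auto* src_buffer_b = tensor_args.input_tensor_b.buffer();
    const auto* dst_buffer = output.buffer();

    // Re-point the kernels at the current buffer addresses on cache hits.
    CoreCoord core{0, 0};
    {
        auto& runtime_args = GetRuntimeArgs(program, reader_kernel_id, core);
        runtime_args[0] = src_buffer_a->address();
        runtime_args[1] = src_buffer_b->address();
    }
    {
        auto& runtime_args = GetRuntimeArgs(program, writer_kernel_id, core);
        runtime_args[0] = dst_buffer->address();
    }
}
```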
Ticket
#12207 https://github.com/tenstorrent/tt-metal/issues/12207
Problem description
moreh_dot was deprecated alongside tt_dnn. This PR ports it to ttnn using its operation format.
What's changed
- Move device code to ttnn
- Create new wrapper code for ttnn with new modules
Checklist