#12207: Port moreh_dot to ttnn #12249

Closed

Conversation

@thanhnguyen-moreh (Contributor) commented Sep 5, 2024

Ticket

#12207 https://github.com/tenstorrent/tt-metal/issues/12207

Problem description

moreh_dot was deprecated along with tt_dnn. This PR ports it to ttnn, rewriting it in the ttnn operation format.

What's changed

  • Move device code to ttnn
  • Create new wrapper code for ttnn with new modules

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

py::arg("input_tensor_b"),
py::kw_only(),
py::arg("output_dtype") = ttnn::bfloat16,
py::arg("output_mem_config") = operation::DEFAULT_OUTPUT_MEMORY_CONFIG});

Suggested change
py::arg("output_mem_config") = operation::DEFAULT_OUTPUT_MEMORY_CONFIG});
py::arg("output_memory_config") = std::nullopt;

DEFAULT_OUTPUT_MEMORY_CONFIG is an old style that refers to DRAM, interleaved memory. In the latest version the default value is std::nullopt, meaning that the output will use the same memory_config as the input.
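To illustrate the semantics, here is a minimal sketch (not code from this PR; resolve_memory_config is a hypothetical helper, and Tensor/MemoryConfig stand in for the ttnn types):

#include <optional>

// Hedged sketch: when the caller passes std::nullopt, fall back to the
// input tensor's own memory_config, so output placement matches the input.
MemoryConfig resolve_memory_config(
    const std::optional<MemoryConfig>& requested, const Tensor& input) {
    return requested.value_or(input.memory_config());
}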


Most occurrences of output_memory_config still show up as DEFAULT_OUTPUT_MEMORY_CONFIG, but that's mainly because they haven't been refactored yet.


namespace ttnn {
constexpr auto moreh_dot =
ttnn::register_operation<"ttnn::moreh_dot", ttnn::operations::moreh::moreh_dot::MorehDot>();

Suggested change
ttnn::register_operation<"ttnn::moreh_dot", ttnn::operations::moreh::moreh_dot::MorehDot>();
ttnn::register_operation_with_auto_launch_op<"ttnn::moreh_dot", ttnn::operations::moreh::moreh_dot::MorehDot>();

There seems to be an issue with the queue in register_operation at the moment. Please use register_operation_with_auto_launch_op instead.
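For context, a hedged sketch of how the registered symbol is then called; the call shape is assumed from other ttnn ops, not taken verbatim from this PR:

// ttnn::moreh_dot becomes a callable object after registration; invoking it
// launches the underlying MorehDot device operation with default kwargs.
ttnn::Tensor out = ttnn::moreh_dot(input_tensor_a, input_tensor_b);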

auto dst_buffer = output_tensor.buffer();
float scaler = 1.0f;

tt::tt_metal::Program program{};

Suggested change
tt::tt_metal::Program program{};
Program program{};

Since using namespace tt::tt_metal is declared at the top, I think it should be fine to make this change.
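A small sketch of the pattern being described (file layout assumed, not verbatim from this PR):

using namespace tt::tt_metal;  // assumed to appear near the top of the .cpp

// ... later, inside the program factory, the qualifier is redundant:
Program program{};  // equivalent to tt::tt_metal::Program program{};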

Comment on lines +15 to +17
namespace tt {
namespace operations {
namespace primary {

Suggested change
namespace tt {
namespace operations {
namespace primary {
namespace ttnn {
namespace operations {

The reason the namespace of moreh_helper was primary is that, in the past, all moreh ops used the primary namespace.
Since the moreh ops are no longer in the primary namespace, moving them to ttnn::operations would allow us to reduce unnecessary namespace code, as shown below

(std::uint32_t)tt::operations::primary::is_dram(src0_buffer) -> (std::uint32_t)(is_dram(src0_buffer))

However, if changing the namespace causes a lot of unexpected additional work, it's not necessary to make this change in this PR.
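A hedged sketch of the effect on a call site, assuming the helpers move into ttnn::operations (build_reader_args is a hypothetical function used only for illustration):

namespace ttnn::operations {

// is_dram is assumed to be the helper from moreh_helper_functions; with it
// living in this namespace, the call site needs no qualification.
void build_reader_args(const Buffer* src0_buffer, std::vector<uint32_t>& args) {
    args.push_back(static_cast<uint32_t>(is_dram(src0_buffer)));
}

}  // namespace ttnn::operations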

const auto& input_tensor_a = tensor_args.input_tensor_a;
const auto& input_tensor_b = tensor_args.input_tensor_b;

TT_ASSERT(tt::operations::primary::is_1d_tensor(input_tensor_a));

Suggested change
TT_ASSERT(tt::operations::primary::is_1d_tensor(input_tensor_a));
TT_FATAL(tt::operations::primary::is_1d_tensor(input_tensor_a));

I prefer TT_FATAL because TT_ASSERT is only active in Debug builds.
Admittedly, since validation is not currently executed in Release mode anyway, it doesn't matter right now, but we don't know how things might change in the future.
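To make the distinction concrete (the comments reflect the description above; the message overload of TT_FATAL is the commonly used form, though treat the exact per-build behavior as an assumption):

// TT_ASSERT: compiled out in Release builds, so the check runs only in Debug.
TT_ASSERT(tt::operations::primary::is_1d_tensor(input_tensor_a));

// TT_FATAL: evaluated in every build type, aborting with a message on failure.
TT_FATAL(tt::operations::primary::is_1d_tensor(input_tensor_a),
         "moreh_dot: input_tensor_a must be 1D");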

output_grad = tt_output_grad = torch_output_grad = tt_input_grad = tt_other_grad = None
if require_input_grad or require_other_grad:
output_grad = torch.randint(-2, 3, output_shape, dtype=cpu_dtype)
# tt_output_grad = ttnn.Tensor(output_grad, npu_dtype).pad_to_tile(float("nan")).to(cpu_layout).to(device)

This code seems to have been committed by mistake during debugging. Please delete it.


const std::vector<Tensor> input_tensors = {input_tensor_a, input_tensor_b};

auto override_runtime_arguments_callback = [reader_kernel_id, writer_kernel_id, compute_kernel_id](

Please move this code into MorehDotOperation::SingleCore::override_runtime_arguments and delete it here.
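For reference, a hedged skeleton of that hook; the signature and argument indices are approximated from other ttnn device operations and may differ in detail:

void MorehDotOperation::SingleCore::override_runtime_arguments(
    cached_program_t& cached_program,
    const operation_attributes_t& operation_attributes,
    const tensor_args_t& tensor_args,
    tensor_return_value_t& output) {
    auto& shared = cached_program.shared_variables;
    auto& program = cached_program.program;

    // Refresh the buffer addresses captured when the program was first built.
    auto& reader_args = GetRuntimeArgs(program, shared.reader_kernel_id, CoreCoord{0, 0});
    reader_args[0] = tensor_args.input_tensor_a.buffer()->address();
    reader_args[1] = tensor_args.input_tensor_b.buffer()->address();

    auto& writer_args = GetRuntimeArgs(program, shared.writer_kernel_id, CoreCoord{0, 0});
    writer_args[0] = output.buffer()->address();
}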

@thanhnguyen-moreh (Contributor Author)

I have created a new PR here: #12265. I will make the changes there.
