
#12355: Support vector of optional tensor and example #12356

Merged
merged 2 commits on Sep 22, 2024

Conversation

hschoi4448
Contributor

Ticket

#12355

Problem description

There is currently an issue in ttnn: if you create an op that returns a vector of optional tensors and register it with register_operation_with_auto_launch_op, the build fails.

The problematic part is in decorator.hpp, where only the Tensor, Tensors, and tuple cases are handled, so the static_assert fails for a vector of optional tensors.

What's changed

Add handling for a vector of optional tensors and create an example that returns a vector of optional tensors.
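
To make the fix concrete, here is a simplified, self-contained sketch of the kind of branch added to the return-type dispatch. It is not the actual decorator.hpp code; the names are approximate and Tensor is a stand-in type. A vector of optional tensors is flattened into plain tensors, with missing outputs replaced by default-constructed tensors.

```cpp
// Simplified sketch (not the actual decorator.hpp code) of the kind of branch
// this change adds to the return-type dispatch. The real dispatch also
// handles tuples and other cases.
#include <optional>
#include <type_traits>
#include <vector>

struct Tensor {};  // stand-in for ttnn's Tensor

template <typename T>
inline constexpr bool always_false_v = false;

template <typename ReturnType>
std::vector<Tensor> map_return_value_to_output_tensors(const ReturnType& value) {
    if constexpr (std::is_same_v<ReturnType, Tensor>) {
        return {value};
    } else if constexpr (std::is_same_v<ReturnType, std::vector<Tensor>>) {
        return value;
    } else if constexpr (std::is_same_v<ReturnType, std::vector<std::optional<Tensor>>>) {
        // New case: flatten optional outputs; a missing output becomes a
        // default-constructed (empty) Tensor rather than std::nullopt.
        std::vector<Tensor> output_tensors;
        output_tensors.reserve(value.size());
        auto dummy_tensor = Tensor();
        for (const auto& maybe_tensor : value) {
            output_tensors.push_back(maybe_tensor.value_or(dummy_tensor));
        }
        return output_tensors;
    } else {
        static_assert(always_false_v<ReturnType>, "Unsupported return type");
    }
}
```

As the review comments below point out, the substitution of an empty Tensor for std::nullopt is where the contract concern comes from.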

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

@hschoi4448 added the moreh moreh contribution label Sep 7, 2024
@hschoi4448 marked this pull request as ready for review September 7, 2024 04:33
@tenstorrent locked and limited conversation to collaborators Sep 7, 2024
@tenstorrent unlocked this conversation Sep 7, 2024
auto size = value.size();
output_tensors.reserve(size);

auto dummy_tensor = Tensor();  // default-constructed tensor used where the op produced no output
Member


Yeah, this part does not look good to me.

@dmakoviichuk-tt I remember you added support for returning a vector of optional tensors ~2 months ago. Can you please take a look? This blocks Moreh and I'd appreciate your feedback here.

Contributor


I added support for launch_op and for a few more cases.
It seems the decorators have changed a lot since then.

@ayerofieiev-tt
Member

This change basically means you return a vector, not a vector<optional>, right?
It kind of violates the API contract: the API says that if an output is missing, there will be a nullopt.
But this version returns a vector in which every element is at least an empty tensor, which I believe breaks that contract.

I think for vector<optional> to work properly with register_operation_with_auto_launch_op we need to do the following:

  1. Handle vector in create_async_output_tensors
  2. Update map_execute_on_worker_thread_return_to_launch_op_return
  3. Handle the outputs processing in invoke_composite

The infra needs to know which outputs have to be created, which requires those methods to accept more arguments.
Also, it is unlikely that a single method universal across all ops can know how many elements the vector has and which of them should be nullopt. That means the implementation of create_async_output_tensors will have to live in the operation itself; a hypothetical sketch follows.
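
A minimal sketch of what such a per-op hook could look like, assuming the infra passes the op extra information about which outputs are requested. The struct name and the are_required_outputs parameter are assumptions, not an existing ttnn signature, and Tensor is again a stand-in type.

```cpp
// Hypothetical per-op hook along the lines described above: the op itself
// decides how many output slots exist and which of them stay std::nullopt.
#include <optional>
#include <vector>

struct Tensor {};  // stand-in for ttnn's Tensor

struct ExampleMultipleReturnOp {
    static std::vector<std::optional<Tensor>> create_async_output_tensors(
        const std::vector<bool>& are_required_outputs) {  // assumed extra argument
        std::vector<std::optional<Tensor>> outputs;
        outputs.reserve(are_required_outputs.size());
        for (bool required : are_required_outputs) {
            // Only allocate a placeholder for outputs the caller actually asked for;
            // the rest stay nullopt, preserving the vector<optional<Tensor>> contract.
            outputs.push_back(required ? std::optional<Tensor>(Tensor{}) : std::nullopt);
        }
        return outputs;
    }
};
```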

@ayerofieiev-tt
Member

In other words:

If the op returns vector<Tensor>:

The infra does not know how many items to create.
You need to implement create_async_output_tensors in the op.

If the op returns vector<optional<Tensor>>:

The infra not only does not know how many tensors to create, it also has no idea which ones to create and which to leave as nullopt.
You'd need to implement create_async_output_tensors in the op, but with the current arguments you won't be able to determine which tensors to create and which to skip.

@ayerofieiev-tt
Member

In other words:

Your solution makes things compile, but it changes the op type from the expected vector<optional<Tensor>> op(args) to vector<Tensor> op(args), where some tensors may be empty instead of being nullopt. That, in my opinion, breaks the op contract.
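
For illustration only, two hypothetical declarations (argument lists elided, Tensor a stand-in) contrasting the contract the API declares with what the workaround effectively returns:

```cpp
#include <optional>
#include <vector>

struct Tensor {};  // stand-in for ttnn's Tensor

// Contract declared by the API: a missing output comes back as std::nullopt.
std::vector<std::optional<Tensor>> op_as_declared(/*args*/);

// What the workaround effectively produces: a missing output is an empty Tensor.
std::vector<Tensor> op_as_returned(/*args*/);
```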

@ayerofieiev-tt
Member

Conclusion

We can't provide a better alternative at the moment.

In most of the migrated PRs you use register_operation, which requires a manual call to launch_op; otherwise the operation's host code and dispatch will run in the main thread. That would likely be a regression in behavior and performance for you.

Unless you have time to implement the more proper solution I outlined above, I am OK with this being merged.

This change should enable you to use register_operation_with_auto_launch_op in all cases.
Leaving an approval so as not to block the migration.

@hschoi4448
Contributor Author

Thank you for the detailed explanation.
I also recognize that there is an issue with this PR, but I don't have time to properly address it right now.

I plan to merge this PR as it is for now, and I would appreciate it if you could make the proper corrections when you have time.

@ayerofieiev-tt @razorback3

Labels
moreh moreh contribution P1
4 participants