Mesh Virtualization #12719

Merged
merged 2 commits into from
Sep 19, 2024

@cfjchu cfjchu commented Sep 16, 2024

Ticket

#10419, #10608, #12479

Problem description

  • This PR addresses the issues in the tickets above together. Its main goal is a coherent model of what it means to work with multiple devices through MeshDevice.
  • Currently, the way devices in a mesh are opened differs across supported systems (N300, T3000, TG, TGG), and it is not robust to changes in the physical device-id assignment within the mesh.

What's changed

  • Introduce a SystemMesh abstraction layer that provides a logical 2D mesh virtualization over physically connected devices. It gives an abstract representation of the system topology, so users of MeshDevice do not need to deal with physical device IDs or ethernet coordinates.
  • This significantly simplifies and unifies our T3000/TG MeshDevice initialization.
  • This work helps unlock:
    • Migrating all T3000 tests over to Galaxy
    • Stamping multiple MeshDevices (2x4) onto SystemMesh
    • MeshOp/MeshProgram infra work
    • CCL operations with Line topology on T3000 MeshDevice
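
To make the logical-to-physical idea concrete, here is a minimal Python sketch of a translation table and a lookup over it. It is an illustration only, not the code in this PR: the class name `SystemMeshSketch`, its methods, and the device ids in `LOGICAL_TO_PHYSICAL_2X4` are all hypothetical, and the real per-system maps may look quite different.

```python
from dataclasses import dataclass

# Hypothetical translation table: logical (row, col) -> physical device id.
# The ids are made up for illustration; this is NOT the real T3000 mapping.
LOGICAL_TO_PHYSICAL_2X4 = {
    (0, 0): 0, (0, 1): 4, (0, 2): 5, (0, 3): 1,
    (1, 0): 3, (1, 1): 7, (1, 2): 6, (1, 3): 2,
}


@dataclass(frozen=True)
class SystemMeshSketch:
    """Hypothetical stand-in for a logical 2D view over physical devices."""
    rows: int
    cols: int
    logical_to_physical: dict

    def physical_id(self, row, col):
        # Callers only see logical coordinates; the table hides physical ids.
        return self.logical_to_physical[(row, col)]

    def submesh_ids(self, rows, cols, offset=(0, 0)):
        # "Stamp" a rows x cols sub-mesh at a logical offset and return the
        # physical ids it covers, in row-major order.
        r0, c0 = offset
        if r0 < 0 or c0 < 0 or r0 + rows > self.rows or c0 + cols > self.cols:
            raise ValueError("requested submesh does not fit in the system mesh")
        return [
            [self.physical_id(r0 + r, c0 + c) for c in range(cols)]
            for r in range(rows)
        ]


mesh = SystemMeshSketch(rows=2, cols=4, logical_to_physical=LOGICAL_TO_PHYSICAL_2X4)
line = mesh.submesh_ids(1, 4, offset=(1, 0))   # one logical row of 4 devices
block = mesh.submesh_ids(2, 2, offset=(0, 2))  # a 2x2 block on the right side
```

Under this assumption, supporting a new system or a changed physical device-id assignment would only require supplying a different table; code that addresses devices by logical coordinates stays unchanged.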


Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

@cfjchu cfjchu force-pushed the jchu/device-virtualization branch 2 times, most recently from 8a4b3c5 to c32bc74 Compare September 19, 2024 07:41
- This fixes #10608, #10419 by adding a logical 2D mesh to remap the
physical mesh onto a flattened 2d grid.
- A logical to physical coordinate translation map is now supplied
per supported system.
@cfjchu cfjchu merged commit 7738225 into main Sep 19, 2024
6 checks passed
@cfjchu cfjchu deleted the jchu/device-virtualization branch September 19, 2024 08:13
tt-aho added a commit that referenced this pull request Sep 21, 2024
tt-aho added a commit that referenced this pull request Sep 23, 2024
tt-aho pushed a commit that referenced this pull request Sep 24, 2024
* #10608: add method to fetch ethernet coordinates from cluster

* #10608: add device virtualization btw MeshDevice and physical system

- This fixes #10608, #10419 by adding a logical 2D mesh to remap the
physical mesh onto a flattened 2d grid.
- A logical to physical coordinate translation map is now supplied
per supported system.
8 participants