Contiguous pages support in Reduce Scatter read/write #12477
Please make sure to run nightly!
@@ -127,24 +127,30 @@ inline void advance_worker_global_page_interleaved (
     coord_t const &tensor_shape, // full tensor shape
-    bool &last_page_of_worker
+    bool &last_page_of_worker,
+    const uint32_t stride=1
Since this appears to be called in only one place, can we remove the default value and also keep last_page_of_worker as the last parameter? It's a bit of a nitpick, but since last_page_of_worker is purely for debug, it would be nice to keep it separate.
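A minimal sketch of the parameter ordering suggested above, assuming a heavily simplified signature. The name advance_page_sketch, the two-field coord_t stand-in, and the wrap-around logic are all hypothetical and only illustrate the ordering; they are not the actual function from this PR.

```cpp
#include <cstdint>

// Simplified stand-in for the real coord_t (assumption: only x and y matter here).
struct coord_t {
    uint32_t x;
    uint32_t y;
};

// Hypothetical, reduced version of the advance function.
// `stride` has no default value, so the single call site must pass it explicitly,
// and the debug-only output `last_page_of_worker` stays last in the list.
inline void advance_page_sketch(
    coord_t &offset,              // current page offset, updated in place
    coord_t const &tensor_shape,  // full tensor shape, in pages
    const uint32_t stride,        // number of pages to advance by
    bool &last_page_of_worker) {  // debug-only flag
    offset.x += stride;
    if (offset.x >= tensor_shape.x) {
        // Simplification: wrap to the start of the next row and drop any overshoot.
        offset.x = 0;
        offset.y += 1;
    }
    last_page_of_worker = offset.y >= tensor_shape.y;
}
```

With this ordering, a call reads as advance_page_sketch(offset, tensor_shape, stride, last_page_of_worker), with the debug flag visibly separated from the functional arguments.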
…educe scatter read/write wrapped functions.
Ticket
Problem description
Currently, the read/write chunk functions used in reduce scatter transfer pages (tiles) one at a time. An optimization is to instead read or write n contiguous pages at once, up to the end of the row, where the end of the row is bounded by the dimensions of the shard, the tensor slice, and the worker slice.
What's changed
The read/write chunk functions have been updated to use the optimized shard tensor address generators, which return the number of contiguous pages until the end of the shard. Using this count, we can now read/write contiguously until the end of the row.
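The idea can be sketched roughly as follows. This is an illustration only: contiguous_pages_in_row and split_into_runs are hypothetical names, and a single row_width stands in for the tightest of the shard, tensor slice, and worker slice bounds. The real address generators and kernel read/write calls are not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: how many pages can be transferred in one go starting at
// column x, limited by the end of the row and by the pages left in the chunk.
inline uint32_t contiguous_pages_in_row(uint32_t x, uint32_t row_width, uint32_t pages_left) {
    return std::min(row_width - x, pages_left);
}

// Split a chunk of num_pages pages, starting at column start_x, into runs of
// contiguous pages. Each run would map to a single read or write.
inline std::vector<uint32_t> split_into_runs(uint32_t start_x, uint32_t row_width, uint32_t num_pages) {
    assert(row_width > 0 && start_x < row_width);
    std::vector<uint32_t> runs;
    uint32_t x = start_x;
    while (num_pages > 0) {
        const uint32_t n = contiguous_pages_in_row(x, row_width, num_pages);
        runs.push_back(n);
        num_pages -= n;
        x = (x + n) % row_width;  // a run that reaches the row end wraps to column 0
    }
    return runs;
}
```

For a row 4 pages wide, a 7-page chunk starting at column 2 is split into runs of 2, 4, and 1 pages, so three transfers replace seven single-page ones.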
Checklist