[SYCL][L0] Add an env var to control buffer migration in context with single root-device (#7178)

Changed the default to perform regular migration, as opposed to having a
single root-device allocation serve all [sub-]devices.
Signed-off-by: Sergey V Maslov <sergey.v.maslov@intel.com>

smaslov-intel authored Oct 25, 2022
1 parent dcbed11 commit bd03e0d
Showing 2 changed files with 21 additions and 1 deletion.
1 change: 1 addition & 0 deletions sycl/doc/EnvironmentVariables.md
100644 → 100755
@@ -238,6 +238,7 @@ variables in production code.</span>
| `SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS` | Integer | When set to a positive value enables use of Level Zero immediate commandlists, which means there is no batching and all commands are immediately submitted for execution. Default is 0. Note: When immediate commandlist usage is enabled it is necessary to also set SYCL_PI_LEVEL_ZERO_DEVICE_SCOPE_EVENTS to either 0 or 1. |
| `SYCL_PI_LEVEL_ZERO_USE_MULTIPLE_COMMANDLIST_BARRIERS` | Integer | When set to a positive value enables use of multiple Level Zero commandlists when submitting barriers. Default is 1. |
| `SYCL_PI_LEVEL_ZERO_USE_COPY_ENGINE_FOR_FILL` | Integer | When set to a positive value enables use of a copy engine for memory fill operations. Default is 0. |
+| `SYCL_PI_LEVEL_ZERO_SINGLE_ROOT_DEVICE_BUFFER_MIGRATION` | Integer | When set to 0, uses a single root-device allocation for all devices in a context where all devices have the same root. Otherwise, performs regular buffer migration. Default is 1. |

## Debugging variables for CUDA Plugin

21 changes: 20 additions & 1 deletion sycl/plugins/level_zero/pi_level_zero.cpp
@@ -8812,10 +8812,29 @@ pi_result _pi_buffer::getZeHandle(char *&ZeHandle, access_mode_t AccessMode,
       LastDeviceWithValidAllocation = Device;
       return PI_SUCCESS;
   }
+  // Reads user setting on how to deal with buffers in contexts where
+  // all devices have the same root-device. Returns "true" if the
+  // preference is to allocate on each [sub-]device and migrate
+  // normally (copy) to other sub-devices as needed. Returns "false"
+  // if the preference is to have single root-device allocations
+  // serve the needs of all [sub-]devices, meaning potentially more
+  // cross-tile traffic.
+  //
+  static const bool SingleRootDeviceBufferMigration = [] {
+    const char *EnvStr =
+        std::getenv("SYCL_PI_LEVEL_ZERO_SINGLE_ROOT_DEVICE_BUFFER_MIGRATION");
+    if (EnvStr)
+      return (std::stoi(EnvStr) != 0);
+    // The default is to migrate normally, which may not always be the
+    // best option (depends on buffer access patterns), but is an
+    // overall win on the set of the available benchmarks.
+    return true;
+  }();

   // Perform actual device allocation as needed.
   if (!Allocation.ZeHandle) {
-    if (Context->SingleRootDevice && Context->SingleRootDevice != Device) {
+    if (!SingleRootDeviceBufferMigration && Context->SingleRootDevice &&
+        Context->SingleRootDevice != Device) {
       // If all devices in the context are sub-devices of the same device
       // then we reuse root-device allocation by all sub-devices in the
       // context.
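
The effect of the new setting on the guard above can be checked in isolation. The following is a minimal standalone sketch, not code from the plugin: `parseMigrationSetting` and `reusesRootAllocation` are hypothetical helper names, and the plain `bool` parameters stand in for the `Context->SingleRootDevice` checks. One deliberate difference from the commit is labeled in the comments: the sketch falls back to the default when the value cannot be parsed, whereas a bare `std::stoi` call throws on non-numeric input.

```cpp
#include <exception>
#include <string>

// Hypothetical helper (not part of the plugin). Interprets the raw value of
// SYCL_PI_LEVEL_ZERO_SINGLE_ROOT_DEVICE_BUFFER_MIGRATION.
// - nullptr (variable unset): default to regular migration (true).
// - numeric text: zero disables migration, any other integer enables it.
// - unparsable text: ASSUMPTION of this sketch, fall back to the default
//   rather than letting std::stoi's exception escape.
bool parseMigrationSetting(const char *EnvStr) {
  if (!EnvStr)
    return true;
  try {
    return std::stoi(EnvStr) != 0;
  } catch (const std::exception &) {
    return true;
  }
}

// Hypothetical stand-in for the condition in the diff.
// HasSingleRoot models "Context->SingleRootDevice is non-null" and
// IsRootDevice models "Context->SingleRootDevice == Device".
// Returns true when a sub-device should reuse the root-device allocation.
bool reusesRootAllocation(bool Migrate, bool HasSingleRoot, bool IsRootDevice) {
  return !Migrate && HasSingleRoot && !IsRootDevice;
}
```

Under these assumptions, passing the result of `std::getenv("SYCL_PI_LEVEL_ZERO_SINGLE_ROOT_DEVICE_BUFFER_MIGRATION")` to `parseMigrationSetting` yields `true` unless the variable is set to `0`, so root-device reuse only happens when the user opts in by setting it to `0`.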