Enhance the default scheduling mechanism of the CPU backend #3750
Labels: advanced optimization, discussion, feature request, welcome contribution
Concisely describe the proposed feature
Currently, the scheduling mechanism of the CPU backend is similar to schedule(dynamic, chunk) in OpenMP, where chunk is set to block_dim in Taichi. Although users can manually specify block_dim to get the desired behavior, newcomers tend to rely on Taichi's default behavior.

block_dim currently defaults to 32 in the CPU backend. However, as shown in #3734, this is not always a good choice for performance. To avoid misleading users about the performance of Taichi, we hope to enhance the default behavior: adaptively determine block_dim as a heuristic function of the number of threads, the number of loop iterations, and the estimated workload of a single iteration.

Discussions and contributions are welcome!
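To make the proposal concrete, here is one possible shape such a heuristic could take, written as a plain Python sketch. It is not Taichi's implementation: the function name choose_block_dim, the abstract cost units, and the constants chunks_per_thread, min_chunk_cost, and max_block_dim are all hypothetical and would need tuning against real benchmarks. The idea is to make each chunk large enough to amortize scheduling overhead, but small enough that every thread still receives several chunks for load balancing.

```python
import math


def choose_block_dim(num_threads, num_iterations, est_cost_per_iter,
                     chunks_per_thread=8, min_chunk_cost=1000.0,
                     max_block_dim=1024):
    """Pick a chunk size for dynamic scheduling (illustrative sketch only).

    All default constants are hypothetical placeholders, and
    est_cost_per_iter is in arbitrary cost units (same units as
    min_chunk_cost).
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    if num_iterations < 0:
        raise ValueError("num_iterations must be non-negative")
    if est_cost_per_iter <= 0:
        raise ValueError("est_cost_per_iter must be positive")

    # Load-balance bound: keep chunks small enough that each thread
    # gets roughly `chunks_per_thread` chunks to pick up dynamically.
    balance_bound = max(
        1, math.ceil(num_iterations / (num_threads * chunks_per_thread)))

    # Overhead bound: make each chunk carry at least `min_chunk_cost`
    # units of work so per-chunk scheduling cost is amortized.
    overhead_bound = max(1, math.ceil(min_chunk_cost / est_cost_per_iter))

    # Prefer amortizing overhead, but never exceed the load-balance
    # bound or the hard cap.
    return max(1, min(overhead_bound, balance_bound, max_block_dim))


if __name__ == "__main__":
    # Many cheap iterations: larger chunks.
    print(choose_block_dim(8, 1_000_000, 1.0))
    # Many expensive iterations: fine-grained chunks.
    print(choose_block_dim(8, 1_000_000, 1e6))
    # Few iterations: chunk size limited by load balancing.
    print(choose_block_dim(8, 100, 1.0))
```

In this sketch, when the two bounds conflict (few iterations, each very cheap), load balancing wins. Whether that is the right trade-off, and how the per-iteration workload would actually be estimated, are open questions for this discussion.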