Shows the effects of oversubscription and ways to fix it.
Install TBB and SMP modules for Python in order to evaluate composable multithreading:
conda install -c intel tbb4py smp
If you are not sure how to do this, run set_python_envs.sh to set up a conda environment with all the necessary components of Intel Distribution for Python, then follow the instructions for environment activation, e.g. conda activate intel3.
The effects are visible on a big enough machine with 32 or more cores. For example, run the following modes:
python collab_filt.py
python dask_bench2.py
python -m tbb collab_filt.py
python -m tbb dask_bench2.py
python -m smp collab_filt.py
python -m smp dask_bench2.py
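To compare the modes side by side, the six commands above can be generated and timed from a small driver. This is a minimal sketch, not part of the benchmark itself: the helper names build_command, time_command and run_all are hypothetical, and it assumes the two scripts are in the current directory and that the tbb and smp modules are installed in the active environment.

```python
import subprocess
import sys
import time

SCRIPTS = ["collab_filt.py", "dask_bench2.py"]
# None means plain `python script.py`; otherwise the module passed to `-m`.
MODES = [None, "tbb", "smp"]


def build_command(script, mode=None, python=sys.executable):
    """Return the argv list for running `script` under the given mode."""
    cmd = [python]
    if mode is not None:
        cmd += ["-m", mode]
    cmd.append(script)
    return cmd


def time_command(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def run_all(scripts=SCRIPTS, modes=MODES):
    """Time every script under every mode; returns {(script, mode): seconds}."""
    results = {}
    for script in scripts:
        for mode in modes:
            cmd = build_command(script, mode)
            results[(script, mode or "default")] = time_command(cmd)
    return results


# Print the command lines only; nothing is executed here.
for script in SCRIPTS:
    for mode in MODES:
        print(" ".join(build_command(script, mode, python="python")))
```

Calling run_all() executes the benchmarks and returns the timings. A single run per mode is noisy, so repeating each command a few times and comparing the best or median time is likely to give a fairer picture.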
There are the following composability modes for testing:
Enables TBB threading for MKL, Numpy, Dask, Python's multiprocessing.ThreadPool
Same as -m tbb but also enables interprocess coordination for multiprocessing applications.
Statically allocates CPU resources between the nested parallel regions using affinity masks and OpenMP API. Supports both multithreading and multiprocessing parallelism.
Enables KMP_COMPOSABILITY=mode=counting for Intel OpenMP runtime when parallel regions are ordered using a semaphore. Supports both multithreading and multiprocessing parallelism.
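The arithmetic behind oversubscription, and the idea of statically dividing CPUs between nested parallel regions, can be illustrated with a short sketch. It is only a conceptual illustration and not how the smp module is implemented: total_threads and partition_cpus are hypothetical helpers, the 32-core machine size is an assumption taken from the note above, and no real affinity masks are set.

```python
def total_threads(outer_workers, inner_threads):
    """Threads created when every outer worker opens its own inner parallel region."""
    return outer_workers * inner_threads


def partition_cpus(cpu_ids, outer_workers):
    """Split cpu_ids into outer_workers contiguous, non-overlapping CPU sets."""
    cpu_ids = list(cpu_ids)
    base, extra = divmod(len(cpu_ids), outer_workers)
    masks, start = [], 0
    for worker in range(outer_workers):
        # The first `extra` workers take one additional CPU each.
        size = base + (1 if worker < extra else 0)
        masks.append(cpu_ids[start:start + size])
        start += size
    return masks


cores = 32  # assumed machine size

# Unmanaged nesting: 32 outer workers, each starting a 32-thread inner region.
print(total_threads(cores, cores), "threads competing for", cores, "cores")

# Static partitioning: 8 outer workers, each limited to 4 dedicated CPUs.
for worker, mask in enumerate(partition_cpus(range(cores), 8)):
    print("worker", worker, "-> CPUs", mask)
```

With partitioning, the total number of running threads stays equal to the number of cores as long as each inner region uses only the CPUs in its own set. If there are more outer workers than CPUs, some sets in this sketch come back empty, which a real implementation would have to handle.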
Paper: "Composable Multi-Threading and Multi-Processing for Numeric Libraries" by Anton Malakhov, David Liu, Anton Gorshkov, Terry Wilmarth. Proceedings of the 17th Python in Science Conference (SciPy 2018), Austin, Texas (July 9 - 15, 2018). DOI 10.25080/Majora-4af1f417-003