Commit
Details:
- This is an "omnibus" commit, consisting of multiple medium-sized
  commits that affect non-trivial aspects of BLIS. The major highlights:
  - Relocated the pba, sba pool (from the rntm_t), and mem_t (from the
    cntl_t) to the thrinfo_t object. This allows the rntm_t to be
    effectively const (although it is sometimes copied internally and
    modified to reflect different ways of parallelism). Moving the mem_t
    sets the stage for sharing a global control tree amongst all threads.
  - De-templatized the macrokernels for gemmt, trmm, and trsm to match
    the macrokernel for gemm, which has been de-templatized since
    54fa28b.
  - Reimplemented bli_l3_determine_kc() by separating out the logic for
    adjusting KC based on MR/NR for triangular A and/or B into a new
    function, bli_l3_adjust_kc(). For now, this function is still called
    from bli_l3_determine_kc(), but in the future we plan to have it
    called once when constructing the control tree.
  - Refactored the level-3 thread decorator into two parts:
    - One part deals only with launching threads, each one calling a
      generic thread entry function. This code resides in frame/thread
      and constitutes the definition of bli_thread_launch(). Note that
      it is specific to the threading implementation (OpenMP, pthreads,
      single, etc.)
    - The other part deals with passing the matrix operands and related
      information into bli_thread_launch(). This is the "l3 decorator"
      and now resides in frame/3. It is agnostic to the threading
      implementation.
  - Modified the "level" of the thread control tree passed in at each
    operation. Previously, each operation (e.g. bli_gemm_blk_var1()) was
    passed in a communicator representing the active thread teams which
    would share the available work. Now, the *parent* thread comm is
    passed in. The operation then grabs the child comm and uses it to
    partition the work. The difference is in bli_trsm_blk_var1(), where
    there are now two children nodes for this single operation (i.e. the
    thread control tree is split one level above where the control tree
    is). The sub-prenode is used for the trsm subproblem while the
    normal sub-node is used for the gemm part. Importantly, the parent
    comm is used for the barrier between them.
  - Removed cntl_t* arguments from bli_*_front() functions. These will
    be added back in the future when the control tree's creation is
    moved so that it happens much sooner (provided that bli_*_front()
    have not been absorbed into their respective bli_*_ex() functions).
  - Renamed various bli_thread_*() query functions to bli_thrinfo_*(),
    for consistency. This includes _num_threads(), _thread_id(),
    _n_way(), _work_id(), _sba_pool(), _pba(), _mem(), _barrier(),
    _broadcast(), and _am_chief().
  - Removed extraneous barrier from _blk_var3() of gemm and trsm.
  - Fixed a typo in bli_type_defs.h where BLIS_BLAS_INT_TYPE_SIZE was
    misspelled.
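
The KC refactoring described above (a blocksize query that delegates to a separate adjustment step) might look roughly like the following sketch. The names align_dim_to_mult(), adjust_kc(), and determine_kc() are hypothetical stand-ins, not the BLIS functions or their signatures, and rounding KC up to a multiple of MR or NR is an assumption; the commit message only says KC is adjusted "based on MR/NR" for triangular A and/or B.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round dim up to the nearest multiple of mult (mult must be positive). */
size_t align_dim_to_mult(size_t dim, size_t mult)
{
    assert(mult > 0);
    return ((dim + mult - 1) / mult) * mult;
}

/* Hypothetical adjustment step, separated from the blocksize query.
 * ASSUMPTION: a triangular A aligns KC to MR and a triangular B aligns
 * KC to NR, by rounding up. The commit does not spell out the rule. */
size_t adjust_kc(size_t kc, bool a_is_triangular, bool b_is_triangular,
                 size_t mr, size_t nr)
{
    if (a_is_triangular) kc = align_dim_to_mult(kc, mr);
    if (b_is_triangular) kc = align_dim_to_mult(kc, nr);
    return kc;
}

/* Hypothetical blocksize query. For now it still calls the adjustment
 * itself, mirroring the "still called from" arrangement in the commit. */
size_t determine_kc(size_t kc_default, bool a_is_triangular,
                    bool b_is_triangular, size_t mr, size_t nr)
{
    return adjust_kc(kc_default, a_is_triangular, b_is_triangular, mr, nr);
}
```

Keeping the adjustment in its own function means a caller could later apply it once, for example while building a control tree, instead of on every blocksize query.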
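
The two-part decorator split described above can be illustrated with a minimal sketch. Everything here is hypothetical: thread_launch(), scale_decorator(), and scale_params_t are illustrative names rather than BLIS APIs, a vector scaling stands in for a level-3 operation, and the launcher runs the "threads" one after another, loosely in the spirit of a single-threaded backend, so that the example needs no threading library.

```c
#include <assert.h>
#include <stddef.h>

/* Generic thread entry function type: knows nothing about the operation. */
typedef void (*thread_func_t)(size_t tid, size_t n_threads, const void *params);

/* Part 1: the launcher. This is the only piece that would change per
 * threading implementation. ASSUMPTION: a sequential loop stands in
 * for real threads here. */
void thread_launch(size_t n_threads, thread_func_t func, const void *params)
{
    assert(func != NULL);
    for (size_t tid = 0; tid < n_threads; ++tid)
        func(tid, n_threads, params);
}

/* Part 2: the operation-specific side. It packs the operands and hands
 * them to the launcher, without knowing how threads are created. */
typedef struct {
    const double *x;
    double       *y;
    size_t        n;
    double        alpha;
} scale_params_t;

/* Each "thread" computes y = alpha * x on its own contiguous slice. */
static void scale_thread_entry(size_t tid, size_t n_threads, const void *params)
{
    const scale_params_t *p = params;
    size_t begin = (p->n * tid) / n_threads;
    size_t end   = (p->n * (tid + 1)) / n_threads;
    for (size_t i = begin; i < end; ++i)
        p->y[i] = p->alpha * p->x[i];
}

void scale_decorator(size_t n_threads, double alpha,
                     const double *x, double *y, size_t n)
{
    scale_params_t params = { x, y, n, alpha };
    thread_launch(n_threads, scale_thread_entry, &params);
}
```

With this shape, swapping the body of thread_launch() for an OpenMP or pthreads version would leave scale_decorator() untouched, which is the separation the commit message describes.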
1 parent c803b03, commit aeb5f0c
Showing 206 changed files with 5,013 additions and 11,035 deletions.