Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed subtle barrier_fpa bug in bli_thrcomm.c. #690

Merged
merged 1 commit into from
Nov 11, 2022
Merged

Conversation

fgvanzee
Copy link
Member

Details:

  • In bli_thrcomm.c, correctly initialize the BLIS_OPENMP element of the barrier function pointer array (barrier_fpa) to NULL when BLIS_ENABLE_OPENMP is not defined. Similarly, initialize the BLIS_POSIX element of barrier_fpa to NULL when BLIS_ENABLE_PTHREADS is not defined. This bug was introduced in a1a5a9b and was likely the result of an incomplete edit. The effects of the bug would have likely manifested when querying a thrcomm_t that was initialized with a timpl_t value corresponding to a threading implementation that was omitted from the -t option at configure-time.

Details:
- In bli_thrcommo.c, correctly initialize the BLIS_OPENMP element of the
  barrier function pointer array (barrier_fpa) to NULL when
  BLIS_ENABLE_OPENMP is *not* defined. Similarly, initialize the
  BLIS_POSIX element of barrier_fpa to NULL when BLIS_ENABLE_PTHREADS is
  not enabled. This bug was introduced in a1a5a9b and was likely the
  result of an incomplete edit. The effects of the bug would have
  likely manifested when querying a thrcomm_t that was initialized with
  a timpl_t value corresponding to a threading implementation that was
  omitted from the -t option at configure-time.
@devinamatthews
Copy link
Member

Thx I think I also fixed this similarly in the HPX PR.

@fgvanzee fgvanzee merged commit e1ea25d into master Nov 11, 2022
@fgvanzee fgvanzee deleted the barrier_fpa_bugfix branch November 11, 2022 18:07
fgvanzee added a commit that referenced this pull request May 20, 2024
Details:
- Fixed a bug in right-sided hemm when:
  - using the 1m method,
  - #defining BLIS_DISABLE_HEMM_RIGHT in the active subconfiguration,
    and
  - the storage of C matches the gemm microkernel IO preference PRIOR to
    the right-sidedness being detected and recast in terms of the left-
    side code path.
  It turns out that bli_gemm_ind_recast_1m_params() was applying its
  optimization (recasting a complex-domain macrokernel calling a 1m
  virtual microkernel to a real-domain macrokernel calling the real-
  domain microkernel) in situations in which it should not have. The
  optimization was silently assuming that the storage of C always
  matched that of the microkernel preference, since the front-end (in
  this case, bli_hemm_front()) would have already had a chance to
  transpose the operation to bring the two into agreement. However, by
  disabling right-sided hemm, we deprive BLIS of that flexibility (as a
  transposed left-sided case would necessarily have to become a right-
  sided case), and thus the assumption was no longer holding in all
  cases. Thanks to Nisanth M P for reporting this bug in Issue #621.
- The aforementioned bug, and its bugfix, also apply to symm when
  BLIS_DISABLE_SYMM_RIGHT is defined.
- Comment updates.
- CREDITS file update.
- (cherry picked from commit 3accacf)

Fixed perf of mt sup with packing, and mt gemmlike. (#696)

Details:
- Brought the gemmsup code path up to date relative to the latest
  thrinfo_t semantics introduced in the October Omnibus commit
  (aeb5f0c). This was done by passing the prenode (instead of the
  current node) into the packm variant within bli_l3_sup_packm.c as well
  as creating the prenodes and attaching them to the thrinfo_t tree in
  bli_l3_sup_thrinfo_create(). These changes erase the performance
  degradation introduced in the omnibus when running multithreaded sup
  with optional packing enabled. Special thanks to Devin Matthews for
  sussing out this fix in short order.
- Fixed the gemmlike sandbox in a manner similar to that of sup with
  packing, described above. This also involved passing the prenode into
  the local gemmlike packm variant. (Recall that gemmlike recycles the
  use of bli_l3_sup_thrinfo_create(), so it automatically inherits that
  part of the sup fix described above.)
- Updated bls_l3_packm_var[123].c to use bli_thrinfo_n_way() and
  bli_thrinfo_work_id() instead of bli_thrinfo_num_threads() and
  bli_thrinfo_thread_id(), respectively.
- (cherry picked from 4833ba2)

Fixed _gemm_small() prototype; disabled gemm_small.

Details:
- Fixed a mismatch between the prototype for bli_gemm_small() in
  bli_gemm_front.h and the actual definition of bli_gemm_small() in
  kernels/zen/3/bli_gemm_small.c. The former was erroneously declaring
  the cntl_t* argument as 'const'. Thanks to Jeff Diamond for reporting
  this issue.
- Commented out BLIS_ENABLE_SMALL_MATRIX, BLIS_ENABLE_SMALL_MATRIX_TRSM
  macro definitions in config/zen3/bli_family_zen3.h. AMD's small matrix
  implementation should probably remain disabled in vanilla BLIS, at
  least for now.
- (cherry picked from db10dd8)

Trival whitespace/comment tweaks.

Details:
- Trivial whitespace and comment changes, most of which ideally would
  have been part of the previous commit pertaining to HPX (2b05948).
- (cherry picked from f0337b7)

blis support for hpx (#682)

- Implement threading backend via HPX.
- HPX is an asynchronous many task runtime system used in high
  performance computing applications. The runtime implements the
  ISO C++ parallelism specification and provides a user-space
  thread implementation.
- This PR provides BLIS a thread backend implementation using HPX
  and resolves feature request #681. The configuration script,
  makefiles, and testsuite have been updated to support an HPX
  build option. The addition of HPX support provides other
  developers an exemplar for integrating other C++ threading
  backends into BLIS.
- (cherry picked from 2b05948)

Fixed subtle barrier_fpa bug in bli_thrcomm.c. (#690)

Details:
- In bli_thrcommo.c, correctly initialize the BLIS_OPENMP element of the
  barrier function pointer array (barrier_fpa) to NULL when
  BLIS_ENABLE_OPENMP is *not* defined. Similarly, initialize the
  BLIS_POSIX element of barrier_fpa to NULL when BLIS_ENABLE_PTHREADS is
  not enabled. This bug was introduced in a1a5a9b and was likely the
  result of an incomplete edit. The effects of the bug would have
  likely manifested when querying a thrcomm_t that was initialized with
  a timpl_t value corresponding to a threading implementation that was
  omitted from the -t option at configure-time.
- (cherry picked from e1ea25d)

Enhance emacs formatting of C files to remove trailing whitespace and ensure
 a newline at the end of file
- (cherry picked from dc6e5f3)

Delete mpi_test garbage. (#689)

Details:
- tlrmchlsmth: "What even is this? No comments, no commit message, not
  used by anything. Trash."
- (cherry picked from 713d078)

Some decluttering of the top-level directory.

Details:
- Relocated 'mpi_test' directory to test/mpi_test.
- Relocated 'so_version' and 'version' files from top-level directory to
  'build' directory.
- Updated build/bump-version.sh script to accommodate relocation of
  'version' file to 'build' directory.
- Updated configure script to accommodate relocation of 'so_version'
  file to 'build' directory.
- Updated INSTALL file to replace pointers to blis-devel mailing list
  with a pointer to docs/Discord.md.
- Updated RELEASING file to contain a reminder to consider whether the
  so_version file should be updated prior to the release.
- (cherry picked from 8d813f7)

Fix typo in configure --help text. (#686)

Details:
- Fixed a misspelling in the --help description for the --int-size (-i)
  configure option.
- (cherry picked from 6774bf0)

Support --nosup, --sup configure options. (#684)

Details:
- Added --nosup and --sup as alternative ways of requesting that sup be
  disabled or enabled. These are analagous to --disable-sup-handling and
  --enable-sup-handling, respectively. (I got tired of typing out
  --disable-sup-handling and needed a shorthand notation.)
- Tweaked message output by configure when sup is enable/disabled for
  clarity and specificity.
- Whitespace changes.
- (cherry picked from edcc2f9)

Add mention of Wilkinson Prize to README.md. (#683)

Details:
- Added blurbs and links to Wilkinson Prize to README.md.
- Added mention of both Best Paper and Wilkinson Prizes to the top of
  README.md.
- Other minor tweaks.
- (cherry picked from 5eea6ad)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants