multi thread crash #68
Comments
The problem is in bli_gemm_blk_var1f: after b_pack_s is initialized, its buffer is 0. The following patch seems to work for me:
diff --git a/frame/base/bli_mem.c b/frame/base/bli_mem.c
@@ -243,7 +245,8 @@ siz_t bli_mem_pool_size( packbuf_t buf_type )
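For orientation, a minimal sketch of the kind of workaround described above, assuming the patch simply hands each caller a private heap allocation instead of a block from the shared internal pool. The type and function names below are illustrative only, not BLIS's actual ones, and this is not the original diff.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the structure that records a pack buffer. */
typedef struct
{
    void*  buf;
    size_t size;
} pack_mem_t;

/* Workaround sketch: bypass the shared internal pool and give every caller
   a private malloc'd buffer. Nothing is shared across threads, so any
   thread-safety problem in the pool is sidestepped, at the cost of a
   malloc/free per pack operation. */
static int acquire_pack_buffer( pack_mem_t* mem, size_t req_size )
{
    mem->buf  = malloc( req_size );
    mem->size = req_size;
    return ( mem->buf != NULL ) ? 0 : -1;
}

static void release_pack_buffer( pack_mem_t* mem )
{
    free( mem->buf );
    mem->buf  = NULL;
    mem->size = 0;
}
```

Because every thread then owns its buffer outright, this masks the crash even though the underlying pool issue (discussed below) remains.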
@songmaotian Thanks for this bug report. It appears that your patch disables use of the internal memory allocator in favor of a regular malloc(). I also noticed that in the code snippet you gave in your first post, you did not show any call to [...].
That's true, I disabled the mempool case. I compared the results and they are correct, so I didn't dig into the problem any deeper; sorry for that.
@songmaotian I'll have to look at this more closely to figure out why the memory allocator is not thread-safe. (It is supposed to be thread-safe.)
BLIS is only thread-safe right now if you compile BLIS with the same threading model you use to multithread. @songmaotian What threading model are you using, and how are you compiling BLIS?
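A hedged sketch of why the compile-time threading model matters: a library's internal locks are typically compiled in only when a threading model is selected, so an allocator that is "supposed to be thread-safe" has no effective locking in a single-threaded build and races when the application adds its own threads. The macro and function names below are illustrative, not BLIS's actual configuration macros.

```c
#include <pthread.h>

#ifdef ENABLE_PTHREADS  /* hypothetical configure-time macro, e.g. set by -t pthreads */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
  #define POOL_LOCK()   pthread_mutex_lock( &pool_lock )
  #define POOL_UNLOCK() pthread_mutex_unlock( &pool_lock )
#else
  /* No threading model selected: the "lock" compiles away, so concurrent
     callers from an externally threaded application race on the pool. */
  #define POOL_LOCK()   ( ( void )0 )
  #define POOL_UNLOCK() ( ( void )0 )
#endif

void pool_checkout_example( void )
{
    POOL_LOCK();
    /* ... hand out a block from the shared pool ... */
    POOL_UNLOCK();
}
```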
Right now I am using version a30ccbc. It seems I haven't enabled multithreading in the lib (the threading code is still behind #if 0); it's the same for x86 (auto) and armv8.
Use something like './configure -t pthreads'.
@fgvanzee Thanks for that, and thanks for all your great work. I have been so busy that I have had no time to dig deeper; sorry for that.
@tlrmchlsmth @devinamatthews Thanks for chiming in, guys.
@songmaotian I would like to close this issue. Could you confirm that enabling multithreading in BLIS solved your problem?
I am using it on Android; "./configure -t pthreads" can't be built since pthread barriers aren't implemented on Android.
@songmaotian Pthread barrier is an extension that isn't supported on Mac either. This is one workaround; you might try it. Since copying random code from the Internet is a licensing risk, someone should write a clean-sheet implementation for BLIS. The other option is to write a barrier using GCC or C11 atomics. By GCC atomics I mean the intrinsics that are supported by many compilers, including Clang, Intel, IBM, and Cray (partially; see my OpenPA fork for exceptions). However, such a barrier is likely to spin-wait, which will perform terribly in the presence of oversubscription, whereas the pthread-based approaches are likely not to.
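As a rough illustration of the second option, here is a minimal sense-reversing spin barrier built on C11 atomics. It is a sketch under the assumptions stated here, not BLIS's actual barrier, and as noted above it busy-waits, so it behaves badly when threads are oversubscribed.

```c
#include <stdatomic.h>

typedef struct
{
    int        n_threads; /* number of participating threads */
    atomic_int count;     /* how many have arrived this round */
    atomic_int sense;     /* flips each time the barrier completes */
} spin_barrier_t;

void spin_barrier_init( spin_barrier_t* b, int n_threads )
{
    b->n_threads = n_threads;
    atomic_init( &b->count, 0 );
    atomic_init( &b->sense, 0 );
}

void spin_barrier_wait( spin_barrier_t* b )
{
    int my_sense = atomic_load( &b->sense );

    if ( atomic_fetch_add( &b->count, 1 ) == b->n_threads - 1 )
    {
        /* Last thread to arrive: reset the count, then flip the sense,
           which releases everyone spinning below. */
        atomic_store( &b->count, 0 );
        atomic_store( &b->sense, !my_sense );
    }
    else
    {
        /* Spin until the last thread flips the sense. */
        while ( atomic_load( &b->sense ) == my_sense )
            ; /* busy-wait */
    }
}
```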
We have a non-pthread barrier for OpenMP; the right #ifdef's just need to get inserted. We should probably just use our own barrier all of the time, since I would not be surprised if the pthread one sucks.
@jedbrown has asserted many times that pthread barrier sucks, and I believe him, at least in the presence of general-purpose operating systems. One part of the definition that may interfere with performance is the following: "A thread that has blocked on a barrier shall not prevent any unblocked thread that is eligible to use the same processing resources from eventually making forward progress in its execution. Eligibility for processing resources shall be determined by the scheduling policy." This forward progress requirement means that a blocked thread generally has to yield the processor, i.e. a conforming implementation will context-switch rather than spin. In theory, if one has an operating system that does not oversubscribe the hardware threads, forward progress may be satisfied without a context switch, but I'm not aware of e.g. Blue Gene optimizing for this. I assume that BLIS does not use an OpenMP barrier because it is both a memory and execution barrier, and you only need the latter. Is that correct?
We use a spin-barrier. We can't use #omp barrier because we need to [...].
@jeffhammond Beyond oversubscription, pthread barrier's problem is that threads don't have identity; the barrier completes after any sufficient number of threads arrive, not a particular set of them.
OK, it sounds like we should just change it so that BLIS never uses pthread barriers unless (maybe) someone requests them specifically at configure time. Any objections? @fgvanzee?
@tlrmchlsmth Without objection. On matters such as these, I defer to those with more expertise.
+1 to no pthread barriers by default. They might be useful for functional testing.
OK, there are no pthread barriers by default anymore. I merged in #81 while I was at it.
AFAICT this is "fixed" now.
Hello,
It seems libblis is not thread safe. I have a gemm invocation, and several threads invoke the matmul; then it crashes as follows, with the parameter p changed to 0. If I change the program so that only one thread invokes the matmul, everything works fine.
#0 0x0000000000545abc in bli_spackm_6xk_ref (conja=BLIS_NO_CONJUGATE, n=25, kappa=0x7fffec000dc0, a=0x7fffb8091540, inca=25, lda=1, p=0x0, ldp=6)
#1 0x00000000004ee461 in bli_spackm_cxk (conja=BLIS_NO_CONJUGATE, panel_dim=6, panel_len=25, kappa=0x7fffec000dc0, a=0x7fffb8091540, inca=25, lda=1, p=0x0, ldp=6,
#2 0x00000000004c4806 in bli_spackm_struc_cxk (strucc=BLIS_GENERAL, diagoffc=0, diagc=BLIS_NONUNIT_DIAG, uploc=BLIS_DENSE, conjc=BLIS_NO_CONJUGATE,
#3 0x00000000004bbbe6 in bli_spackm_blk_var1 (strucc=BLIS_GENERAL, diagoffc=0, diagc=BLIS_NONUNIT_DIAG, uploc=BLIS_DENSE, transc=BLIS_NO_TRANSPOSE,
#4 0x00000000004bb133 in bli_packm_blk_var1 (c=0x7fffd2479e50, p=0x7fffd24798e0, cntx=0x7fffd247c190, t=0x7fffb8007600) at frame/1m/packm/bli_packm_blk_var1.c:234
#5 0x00000000004aed11 in bli_packm_int (a=0x7fffd2479e50, p=0x7fffd24798e0, cntx=0x7fffd247c190, cntl=0x7fffec002c00, thread=0x7fffb8007600)
#6 0x00000000004b23ca in bli_gemm_blk_var1f (a=0x7fffd2479d80, b=0x7fffd2479e50, c=0x7fffd2479f20, cntx=0x7fffd247c190, cntl=0x7fffec002d00, thread=0x7fffb8007660)
#7 0x00000000004488b2 in bli_gemm_int (alpha=0x7c66a0 <BLIS_ONE>, a=0x7fffd247a160, b=0x7fffd247a230, beta=0x7c66a0 <BLIS_ONE>, c=0x7fffd247a090, cntx=0x7fffd247c190,
#8 0x00000000004b304b in bli_gemm_blk_var3f (a=0x7fffd247a530, b=0x7fffd247a600, c=0x7fffd247a6d0, cntx=0x7fffd247c190, cntl=0x7fffec002da0, thread=0x7fffb80c0ae0)
#9 0x00000000004488b2 in bli_gemm_int (alpha=0x7c66a0 <BLIS_ONE>, a=0x7fffd247a840, b=0x7fffd247a910, beta=0x7c66a0 <BLIS_ONE>, c=0x7fffd247a9e0, cntx=0x7fffd247c190,
#10 0x00000000004b2b28 in bli_gemm_blk_var2f (a=0x7fffd247ace0, b=0x7fffd247adb0, c=0x7fffd247ae80, cntx=0x7fffd247c190, cntl=0x7fffec002e40, thread=0x7fffb80c0ca0)
#11 0x00000000004488b2 in bli_gemm_int (alpha=0x7fffd247bcf0, a=0x7fffd247b0c0, b=0x7fffd247b190, beta=0x7fffd247bf60, c=0x7fffd247b260, cntx=0x7fffd247c190,
#12 0x0000000000423d60 in bli_level3_thread_decorator (n_threads=1, func=0x447f75 <bli_gemm_int>, alpha=0x7fffd247bcf0, a=0x7fffd247b0c0, b=0x7fffd247b190,
#13 0x0000000000447f5a in bli_gemm_front (alpha=0x7fffd247bcf0, a=0x7fffd247bdc0, b=0x7fffd247be90, beta=0x7fffd247bf60, c=0x7fffd247c030, cntx=0x7fffd247c190,
#14 0x0000000000429be5 in bli_gemmnat (alpha=0x7fffd247bcf0, a=0x7fffd247bdc0, b=0x7fffd247be90, beta=0x7fffd247bf60, c=0x7fffd247c030, cntx=0x7fffd247c190)
#15 0x000000000049242b in bli_gemmind (alpha=0x7fffd247bcf0, a=0x7fffd247bdc0, b=0x7fffd247be90, beta=0x7fffd247bf60, c=0x7fffd247c030, cntx=0x7fffd247c190)
#16 0x000000000044701e in bli_gemm_ex (alpha=0x7fffd247bcf0, a=0x7fffd247bdc0, b=0x7fffd247be90, beta=0x7fffd247bf60, c=0x7fffd247c030, cntx=0x7fffd247c190)
#17 0x0000000000419e3c in bli_sgemm (transa=BLIS_NO_TRANSPOSE, transb=BLIS_NO_TRANSPOSE, m=1936, n=16, k=25, alpha=0x7fffd247c188, a=0x7fffb8091540, rs_a=25, cs_a=1,
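For reference, a hedged sketch of the call pattern the report describes: several application threads each invoking bli_sgemm independently, with the dimensions taken from the backtrace (m=1936, n=16, k=25, row-major operands). The original snippet from the report is not reproduced above, so the thread count, matrix contents, and alpha/beta values here are assumptions.

```c
#include <pthread.h>
#include <stdlib.h>
#include "blis.h"

#define M 1936
#define N 16
#define K 25
#define NUM_THREADS 4 /* assumption; the report only says "several" */

static void* worker( void* arg )
{
    float  alpha = 1.0f, beta = 0.0f;
    float* a = calloc( ( size_t )M * K, sizeof( float ) );
    float* b = calloc( ( size_t )K * N, sizeof( float ) );
    float* c = calloc( ( size_t )M * N, sizeof( float ) );

    /* Row-major operands: row stride = number of columns, column stride = 1,
       matching rs_a=25, cs_a=1 in frame #17 of the backtrace. */
    bli_sgemm( BLIS_NO_TRANSPOSE, BLIS_NO_TRANSPOSE, M, N, K,
               &alpha, a, K, 1,
                       b, N, 1,
               &beta,  c, N, 1 );

    free( a ); free( b ); free( c );
    return NULL;
}

int main( void )
{
    pthread_t t[ NUM_THREADS ];

    bli_init(); /* older BLIS versions expect explicit initialization */

    for ( int i = 0; i < NUM_THREADS; i++ )
        pthread_create( &t[ i ], NULL, worker, NULL );
    for ( int i = 0; i < NUM_THREADS; i++ )
        pthread_join( t[ i ], NULL );

    bli_finalize();
    return 0;
}
```

As the discussion above concludes, this pattern is only expected to be safe when BLIS itself is built with a matching threading model (e.g. './configure -t pthreads').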