Commit

metal : switch to execution barriers + fix one of the barriers
ggerganov committed Dec 13, 2023
1 parent 109e7aa commit e1241d9
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions ggml-metal.metal
@@ -385,8 +385,11 @@ kernel void kernel_soft_max(
         pdst[i00] = exp_psrc0;
     }
 
+    // This barrier fixes a failing test
+    // ref: https://github.com/ggerganov/ggml/pull/621#discussion_r1425156335
+    threadgroup_barrier(mem_flags::mem_none);
+
     float sum = simd_sum(lsum);
-    threadgroup_barrier(mem_flags::mem_threadgroup);
 
     if (ntg > N_SIMDWIDTH) {
         if (sgitg == 0) {
@@ -470,9 +473,13 @@ kernel void kernel_soft_max_4(
     }
 
     const float lsum = lsum4[0] + lsum4[1] + lsum4[2] + lsum4[3];
-    threadgroup_barrier(mem_flags::mem_threadgroup);
+
+    // This barrier fixes a failing test
+    // ref: https://github.com/ggerganov/ggml/pull/621#discussion_r1425156335
+    threadgroup_barrier(mem_flags::mem_none);
+
     float sum = simd_sum(lsum);
 
     if (ntg > N_SIMDWIDTH) {
         if (sgitg == 0) {
             buf[tiisg] = 0.0f;