diff --git a/v0.7.4/index.html b/v0.7.4/index.html
index d1eb0da..1509788 100644
--- a/v0.7.4/index.html
+++ b/v0.7.4/index.html
@@ -1,4 +1,4 @@
 Home · Polyester.jl

Polyester

Polyester.@batch Macro
@batch for i in Iter; ...; end

Evaluate the loop on multiple threads.

@batch minbatch=N for i in Iter; ...; end

Evaluate at least N iterations per thread. Will use at most length(Iter) ÷ N threads.

@batch threadlocal=init() for i in Iter; ...; end

Create thread-local storage for use in the loop. The init function will be called at the start on each thread. Inside the loop, threadlocal refers to the storage local to the current thread. At the end of the loop, threadlocal will be available as a vector containing all the thread-local values. A type can be specified with threadlocal=init()::Type.

@batch per=core for i in Iter; ...; end
 @batch per=thread for i in Iter; ...; end

Use at most 1 thread per physical core, or 1 thread per CPU thread, respectively. One thread per core means fewer threads competing for the cache. Conversely, if there are (for example) two hardware threads per physical core, then using every hardware thread gives two independent instruction streams feeding the CPU's execution units. When one of these streams isn't enough to make the most of out-of-order execution, this could increase total throughput.

Which option performs better depends on the workload, so if you're not sure, it may be worth benchmarking both.

LoopVectorization.jl currently only uses up to 1 thread per physical core. Because there is some overhead to switching the number of threads used, per=core is @batch's default, so that Polyester.@batch and LoopVectorization.@tturbo work well together by default.

Threads are not pinned to a given CPU core, and the total number of available threads is still governed by --threads or JULIA_NUM_THREADS.

You can pass both the per=(core/thread) and minbatch=N options at the same time, e.g.

@batch per=thread minbatch=2000 for i in Iter; ...; end
-@batch minbatch=5000 per=core   for i in Iter; ...; end
source
+@batch minbatch=5000 per=core for i in Iter; ...; end
source
diff --git a/v0.7.4/search/index.html b/v0.7.4/search/index.html
index bd9fb05..b8b400d 100644
--- a/v0.7.4/search/index.html
+++ b/v0.7.4/search/index.html
@@ -1,2 +1,2 @@
-Search · Polyester.jl

Loading search...

    +Search · Polyester.jl

    Loading search...