Parallel rns fgemv #273

ZHG2017 · 2019-06-27T14:28:21Z

Implementation of fgemv for rns with corresponding helpers

…on into pfgemv which will no more require to be labeled with PAR_BLOCK

…gemv-mp has been restructured for different parameter values

…r benchmarking fgemv in the rns field

…ervable speedup for rns fgemv

…ble::reduce)

Breush · 2019-07-01T08:37:33Z

@ClementPernet Looks like all Travis builds are failing even on master. Have there been any updates in the configuration recently?

ClementPernet · 2019-07-26T09:32:17Z

configure.ac

@@ -158,7 +158,7 @@ AC_PROG_LIBTOOL
 AC_PROG_EGREP
 AC_PROG_SED
 # newer libtool...
-LT_PREREQ([2.4.3])
+LT_PREREQ([2.4.2])


You should not have to change this

I had to do so because Travis once failed with an error suggesting that the libtool version was not correct. After lowering the required version, the Travis build passed without that strange failure.

ClementPernet · 2019-07-26T09:35:13Z

fflas-ffpack/fflas/fflas_bounds.inl

+                                {
+                                    for (size_t j=0; j<N; ++j) {
+                                        const Givaro::Integer & x(A[i*lda+j]);
+                                        if (Givaro::absCompare(x,vmax[i])>0){ vmax[i] = x;}


Updating vmax[i] in each parallel task creates contention on every operation. This code is likely not parallel at all.

Yes, it is not correctly parallelized, but timing showed a small speedup. I cannot see any easier way to find the local max value for each thread and then search for the global max value outside the parallel region.
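The pattern discussed in this thread (a private maximum per task, then a global maximum outside the parallel region) can be sketched as below. This is only an illustration under stated assumptions, not the code of this PR: `max_abs_entry` is a hypothetical helper, `std::int64_t` stands in for `Givaro::Integer`, a plain comparison stands in for `Givaro::absCompare`, and an OpenMP pragma stands in for the FFLAS-FFPACK parallel loop macros. Without `-fopenmp` the pragma is ignored and the loop runs sequentially with the same result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of FFLAS-FFPACK): largest absolute value among
// the entries of an M x N row-major matrix stored with leading dimension lda.
// Assumes no entry equals INT64_MIN, so that negation cannot overflow.
std::int64_t max_abs_entry(const std::vector<std::int64_t>& A,
                           std::size_t M, std::size_t N, std::size_t lda)
{
    if (M == 0 || N == 0) return 0;
    assert(lda >= N && A.size() >= (M - 1) * lda + N);

    std::vector<std::int64_t> vmax(M, 0);  // one result slot per row

    // Each iteration owns exactly one row and one slot of vmax.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(M); ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        std::int64_t local = 0;  // private accumulator, not shared
        for (std::size_t j = 0; j < N; ++j) {
            const std::int64_t x = A[row * lda + j];
            const std::int64_t ax = x < 0 ? -x : x;
            if (ax > local) local = ax;
        }
        vmax[row] = local;  // single write to shared memory per row
    }

    // Global maximum, computed outside the parallel region.
    return *std::max_element(vmax.begin(), vmax.end());
}
```

Accumulating into `local` and writing `vmax[row]` once per row keeps the inner loop free of writes to shared memory; whether this gives a measurable speedup for multiprecision integers would still have to be benchmarked.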

ClementPernet · 2019-07-26T09:36:01Z

fflas-ffpack/fflas/fflas_fgemm.inl

@@ -410,7 +410,9 @@ namespace FFLAS {
            else if (!std::is_same<Field,Givaro::ModularBalanced<float> >::value){
                if (F.characteristic() < DOUBLE_TO_FLOAT_CROSSOVER)
                    return Protected::fgemm_convert<Givaro::ModularBalanced<float>,Field>(F,ta,tb,m,n,k,alpha,A,lda,B,ldb,beta,C,ldc,H);
-                else if (!std::is_same<Field,Givaro::ModularBalanced<double> >::value && 16*F.cardinality() < Givaro::ModularBalanced<double>::maxCardinality())
+                else if (!std::is_same<Field,Givaro::ModularBalanced<double> >::value &&
+			 !std::is_same<Field,Givaro::ModularBalanced<double> >::value &&


why is this line duplicated?

If I remember correctly, we did this together to fix a compile-time error, but I cannot remember the exact reason.

ZHG added 28 commits May 3, 2019 17:01

rns fgemv with parseqhelper adapted but still need to wrap the functi…

294f146

…on into pfgemv which will no more require to be labeled with PAR_BLOCK

rns fgemv with parseqhelper adapted and its corresponding benchmark-f…

545645e

…gemv-mp has been restructured for different parameter values

cleaned up for code review

3d38bed

Rolled back the benchmark-fgemv-mp and adopted benchmark-fgemv-rns fo…

3663c83

…r benchmarking fgemv in the rns field

Ready for rns benchmark

f632d38

Instant backup for code review

27dd46c

Check if it is required impl

c8928bb

rns for fgemv implemented but no obvious speedup can be found

59326f1

Having used SYNCH_GROUP to avoid segmentation fault but still no obs…

7bc53b9

…ervable speedup for rns fgemv

add benchmark-fgemv-rns

93cda61

Instant backup before code review

d194b9a

detailed timing

3bfb919

Instant backup before code review

5913444

Use Givaro::absCompare in InfNorm() to compute bound on the output

ba60d52

Ready for benchmarks on a server

4db380d

Merge branch 'master' into parallel-rns-fgemv

636a6d8

Got ready for benchmark on a server

3b12ce7

Fallback to PARFOR1D in the rsn-double::init

14a3dea

Instant backup before code review

45eb9fa

Further improved to validate on hpac

34a1277

Check if freduce is speeded up

58d0706

Corrected the wrong parallelization of freduce

63f157b

Instant backup for code review

e601bb4

Merge branch 'master' into parallel-rns-fgemv

83ce47e

Fixed templated fgemv confusion for Givaro::Modular<RecInt::ruint<K>>

f1db3bc

Used FOR1D instead of FOR1DBLOCK for both InfNorm and freduce(rns_dou…

af2f7e1

…ble::reduce)

Cleaned up for code review before PR

10c90d9

Corrected bug in InfNorm by falling back to FOR1DBLOCK

02d1d85

ZHG2017 requested a review from ClementPernet June 27, 2019 14:28

ZHG added 8 commits July 1, 2019 15:06

Trying to add test for rns fgemv

ba96e19

Updated test-fgemv-rns

8f17bd4

Merge branch 'master' into parallel-rns-fgemv

418b010

Fall back to 3.3.3 version for the libtool in the configure.ac

18d2d14

Merged with Master

cd29283

Changed LT_PREREQ to 2.4.2

6808a05

Changed back LT_PREREQ to 2.4.3

03f1430

Corrected Makefile.am typo error for test-fgemv-rns

a30ca0b

ClementPernet reviewed Jul 26, 2019
