#9 Test refactoring #1021

mzuzek · 2021-06-15T16:19:37Z

Part of #9

Running all samples: random and market matrices.
Running multiple mode/alpha/beta variants for each sample.
Table/CSV output for easy spreadsheet processing.

Sample output:

| no. | name | size | block | nnz | mode | alpha | beta | error | maxNorm | crsTime | crsAvg | crsGFlops | bcrsTime | bcrsAvg | bcrsGFlops | ratio | remarks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10:1/14 | rand-10 | 415520 | 10 | 20694400 | N | 1 | 0 | 0 | 21.5914 | 8.84966 | 0.00864225 | 2.39456 | 7.27528 | 0.00710477 | 2.91275 | 0.822098 | good |
| 10:2/14 | ^ | ^ | ^ | ^ | T | 1 | 0 | 1.42109e-14 | 20.5859 | 15.6083 | 0.0152425 | 1.35768 | 15.7031 | 0.015335 | 1.34948 | 1.00607 | NOT_faster |
| 10:3/14 | ^ | ^ | ^ | ^ | N | -1 | -1 | 0 | 20.1678 | 8.91246 | 0.00870358 | 2.37769 | 7.38538 | 0.00721229 | 2.86932 | 0.828658 | good |
| 10:4/14 | ^ | ^ | ^ | ^ | T | 3.14159 | 0.25 | 5.68434e-14 | 62.8597 | 14.7627 | 0.0144167 | 1.43545 | 15.2628 | 0.0149051 | 1.38841 | 1.03388 | NOT_faster |
| 10:5/14 | ^ | ^ | ^ | ^ | N | 0 | 0 | 0 | 0 | 0.0243548 | 2.3784e-05 | 870.098 | 0.0182034 | 1.77768e-05 | 1164.13 | 0.747426 | good |
| 10:6/14 | ^ | ^ | ^ | ^ | N | 0 | 1 | 0 | 0 | 0.124868 | 0.000121942 | 169.707 | 0.120228 | 0.00011741 | 176.257 | 0.962838 | good |
| 11:1/14 | rand-11 | 457072 | 11 | 25040224 | N | 1 | 0 | 0 | 22.0194 | 10.4566 | 0.0102116 | 2.45215 | 8.22974 | 0.00803685 | 3.11567 | 0.787035 | good |
| 11:2/14 | ^ | ^ | ^ | ^ | T | 1 | 0 | 1.95399e-14 | 21.5445 | 17.8651 | 0.0174464 | 1.43526 | 18.1238 | 0.017699 | 1.41478 | 1.01448 | NOT_faster |
| 11:3/14 | ^ | ^ | ^ | ^ | N | -1 | -1 | 0 | 22.17 | 10.7214 | 0.0104701 | 2.3916 | 8.3023 | 0.00810771 | 3.08844 | 0.774369 | good |
| 11:4/14 | ^ | ^ | ^ | ^ | T | 3.14159 | 0.25 | 5.68434e-14 | 69.9826 | 17.7584 | 0.0173422 | 1.44389 | 18.7383 | 0.0182991 | 1.36838 | 1.05518 | NOT_faster |
| 11:5/14 | ^ | ^ | ^ | ^ | N | 0 | 0 | 0 | 0 | 0.0237583 | 2.32015e-05 | 1079.25 | 0.0236734 | 2.31186e-05 | 1083.12 | 0.996427 | good |
| 11:6/14 | ^ | ^ | ^ | ^ | N | 0 | 1 | 0 | 0 | 0.142616 | 0.000139274 | 179.791 | 0.135705 | 0.000132524 | 188.949 | 0.951536 | good |
| 12:1/14 | rand-12 | 498624 | 12 | 29799936 | N | 1 | 0 | 0 | 23.4407 | 12.6426 | 0.0123462 | 2.41368 | 9.88527 | 0.00965358 | 3.08693 | 0.781904 | good |
| 12:2/14 | ^ | ^ | ^ | ^ | T | 1 | 0 | 2.13163e-14 | 25.2746 | 21.667 | 0.0211592 | 1.40837 | 22.2227 | 0.0217019 | 1.37315 | 1.02565 | NOT_faster |
| 12:3/14 | ^ | ^ | ^ | ^ | N | -1 | -1 | 0 | 24.0409 | 13.365 | 0.0130518 | 2.28321 | 10.0778 | 0.00984157 | 3.02797 | 0.754041 | good |
| 12:4/14 | ^ | ^ | ^ | ^ | T | 3.14159 | 0.25 | 5.68434e-14 | 74.2295 | 21.2639 | 0.0207655 | 1.43507 | 21.4885 | 0.0209848 | 1.42007 | 1.01056 | NOT_faster |
| 12:5/14 | ^ | ^ | ^ | ^ | N | 0 | 0 | 0 | 0 | 0.0210595 | 2.05659e-05 | 1449 | 0.0429506 | 4.19439e-05 | 710.471 | 2.03949 | NOT_faster |
| 12:6/14 | ^ | ^ | ^ | ^ | N | 0 | 1 | 0 | 0 | 0.148766 | 0.00014528 | 205.121 | 0.145052 | 0.000141652 | 210.374 | 0.975029 | good |
| 13:1/14 | GT01R.mtx | 7980 | 5 | 430909 | N | 1 | 0 | 0 | 36366.2 | 0.12108 | 4.03601e-05 | 10.6766 | 0.0826729 | 2.75576e-05 | 15.6366 | 0.682794 | good |
| 13:2/14 | ^ | ^ | ^ | ^ | T | 1 | 0 | 1.45519e-11 | 54679 | 1.59771 | 0.000532572 | 0.80911 | 1.30716 | 0.000435721 | 0.988956 | 0.818145 | good |
| 13:3/14 | ^ | ^ | ^ | ^ | N | -1 | -1 | 0 | 75688.9 | 0.127964 | 4.26547e-05 | 10.1023 | 0.38661 | 0.00012887 | 3.34375 | 3.02124 | NOT_faster |
| 13:4/14 | ^ | ^ | ^ | ^ | T | 3.14159 | 0.25 | 9.45874e-11 | 147812 | 1.18387 | 0.000394623 | 1.09195 | 1.28176 | 0.000427253 | 1.00856 | 1.08269 | NOT_faster |
| 13:5/14 | ^ | ^ | ^ | ^ | N | 0 | 0 | 0 | 0 | 0.0184308 | 6.1436e-06 | 70.1395 | 0.0080156 | 2.67187e-06 | 161.276 | 0.434902 | good |
| 13:6/14 | ^ | ^ | ^ | ^ | N | 0 | 1 | 0 | 0 | 0.0071068 | 2.36893e-06 | 181.9 | 0.0069288 | 2.3096e-06 | 186.573 | 0.974954 | good |
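
The `ratio` and `remarks` columns appear to follow directly from the two timing columns: in the sample rows, `ratio` matches `bcrsTime / crsTime`, and a row is marked `good` when that ratio is below 1. The sketch below reproduces that relationship. It is inferred from the sample output only, and the helper names `time_ratio` and `remark` are hypothetical, not taken from the test code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: BlockCrs time divided by Crs time, as inferred
// from the sample rows (7.27528 / 8.84966 is about 0.822098).
inline double time_ratio(double crsTime, double bcrsTime) {
  assert(crsTime > 0.0);
  return bcrsTime / crsTime;
}

// Hypothetical helper: label a row "good" when BlockCrs is faster
// than Crs, "NOT_faster" otherwise.
inline std::string remark(double ratio) {
  return ratio < 1.0 ? "good" : "NOT_faster";
}
```

With the first row of the table, `time_ratio(8.84966, 7.27528)` falls just under 0.8221, which `remark` labels `good`.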

With this approach we reduce the complexity of the CMake logic. We also remove all the backend .cpp files but one. Two downsides exist with this approach:

1. All new tests need to be explicitly added to Test_Component.hpp.
2. Since all the headers are gathered in a single source file, developers need to be more careful about redefining functions and classes between tests.

On the other hand, this should also promote the creation of utility headers that gather implementations of common features useful for multiple tests, instead of copy-pasting code, which makes maintenance harder.

… backends Using the new infrastructure style for all backends to make the code cleaner and remove a load of .cpp files. Changes were needed to implement more logic to avoid running some tests on the Cuda backend when UVM is not enabled.

and uniformly adding pre-processor macro naming in each file. After this new round of changes the unit-tests are more uniform and are responsible for enabling themselves only when the configuration and compilation unit allow them to compile and run correctly. Also made some changes to the vanillaGEMM code to avoid future issues if someone reorders tests before we make further changes to use the sharedVanillaGEMM defined in test_common/KokkosKernels_TestUtils.hpp.

Multiple tests define the exact same macro. This was not an issue so far because each definition was included in only a single .cpp file. Now that all the .hpp files are included in the same .cpp, we avoid the clash by undefining the macro at the end of each test header: `#undef EXECUTE_TEST`
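
A minimal sketch of the pattern, with the contents of two test headers placed in one translation unit. The header names, the `registry` helper, and the macro bodies are hypothetical stand-ins; only the `EXECUTE_TEST` name and the trailing `#undef` come from the commit message. Without the `#undef`, the second `#define` would be a redefinition with a different body, which compilers diagnose.

```cpp
#include <string>
#include <vector>

// Collected test names; stands in for a real test registry (hypothetical).
inline std::vector<std::string>& registry() {
  static std::vector<std::string> names;
  return names;
}

// ---- contents of a first test header (hypothetical Test_A.hpp) ----
#define EXECUTE_TEST(SCALAR) \
  inline void test_a_##SCALAR() { registry().push_back("a_" #SCALAR); }
EXECUTE_TEST(double)
EXECUTE_TEST(float)
#undef EXECUTE_TEST  // leave no macro behind for the next header

// ---- contents of a second test header (hypothetical Test_B.hpp) ----
#define EXECUTE_TEST(SCALAR) \
  inline void test_b_##SCALAR() { registry().push_back("b_" #SCALAR); }
EXECUTE_TEST(double)
#undef EXECUTE_TEST

// Run every test generated above, in declaration order.
inline void run_all() {
  test_a_double();
  test_a_float();
  test_b_double();
}
```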

… too To have Kokkos Kernels in line with Kokkos flags we should be able to build without unused-parameter warnings. These changes make this possible using clang-12. More testing on other platforms should probably be performed with this warning turned on.

Updating the list of warnings for each compiler in the test scripts. This should trigger the warnings in the auto-tester. This has already been tested manually on Blake, but this change will trigger the testing on White and Kokkos-dev2 automatically.

dalg24 and others added 30 commits March 9, 2021 14:42

Configure with -DCMAKE_CXX_EXTENSIONS=OFF in HIP nightly builds

43c3e1d

spadd tpl driver for performance testing

d715f4e

(cusparse/mkl)

WIP: speeding up SpAdd

cb2004a

Make read_kokkos_crst_matrix get columns when known

0ef1c29

MatrixMarket files include the number of columns, so use that to construct the CrsMatrix. Previously, would compute #cols as max entry plus 1, but this may not be correct if there are empty columns.
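
To illustrate why the two column counts can differ, the sketch below reads a small Matrix Market coordinate file from a string and reports both the column count on the size line and the largest column index that actually appears. It is a simplified standalone parser, not the Kokkos Kernels reader; `ColumnCounts` and `count_columns` are hypothetical names. Matrix Market indices are 1-based, so the largest index seen corresponds to "max entry plus 1" in 0-based terms.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

struct ColumnCounts {
  int fromHeader;   // column count stated on the size line
  int fromEntries;  // largest 1-based column index among the entries
};

// Hypothetical helper: parse a coordinate-format Matrix Market text
// with real values and return both column counts.
inline ColumnCounts count_columns(const std::string& mtxText) {
  std::istringstream in(mtxText);
  std::string line;
  // Skip the banner and comment lines (they start with '%') and any
  // blank lines; stop at the size line.
  while (std::getline(in, line)) {
    if (!line.empty() && line[0] != '%') break;
  }
  std::istringstream sizeLine(line);
  int rows = 0, cols = 0, nnz = 0;
  sizeLine >> rows >> cols >> nnz;

  int maxCol = 0;
  for (int k = 0; k < nnz; ++k) {
    int i = 0, j = 0;
    double v = 0.0;
    if (!(in >> i >> j >> v)) break;  // stop on malformed input
    maxCol = std::max(maxCol, j);
  }
  return {cols, maxCol};
}
```

For a 3 x 4 matrix whose last two columns are empty, the header reports 4 columns while the entries alone would suggest 2.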

SpAdd: inst TeamPol SortedCountEntries for GPU only

d33691b

to reduce code size.

Updated spadd perf test help text

0c8d300

spadd perftest: don't allow outputting c with mkl

1ad5273

Fix unused var warnings

98f2aab

Improve TPL error checking (spadd perftest)

f8917fa

Fix mkl error checking

f53f18b

unit-test: automatically generate Test_Serial_Common_*.cpp

aa1f7b7

This is a proof of concept that works for the combination of Serial/OpenMP and Common tests. It could be extended further, but we should review this before more time is invested down that road...

Fixing issues with name clashing in batched unit-tests

8b21749

unit-tests: fix logic to avoid including tests that need UVM

24bf352

unit-test: re-writing tests for HIP backend using new framework

5316907

This is the last piece of the unit-tests refactor. Compiled and tested on Tulip, this works just fine.

unit-test: turning back off some HIP tests after testing them

038dd44

unit-tests: add pre-processor guard in blas gesv tests for TPL MAGMA

c9e6b21

Re-introducing necessary guard in gesv test, now the CUDA CI test should compile correctly.

Prevent redundant spmv kernel instantiations

c2b7bc6

CPUs always use RangePolicy, GPUs always use TeamPolicy. With a runtime branch, both paths are instantiated even though only one is ever taken. Replace with enable_if.
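
The idea can be sketched without Kokkos: select the implementation at compile time so that only the overload matching the execution space is instantiated. The tag types `HostSpace` and `DeviceSpace`, the `is_gpu` trait, and `choose_policy` are hypothetical stand-ins for illustration; the actual spmv code is not shown in this PR description.

```cpp
#include <string>
#include <type_traits>

// Stand-in execution space tags (hypothetical).
struct HostSpace {};
struct DeviceSpace {};

// Hypothetical trait: true only for GPU-like spaces.
template <class Space>
struct is_gpu : std::false_type {};
template <>
struct is_gpu<DeviceSpace> : std::true_type {};

// Selected for CPU-like spaces; the GPU overload is removed from
// overload resolution and never instantiated for them.
template <class Space>
typename std::enable_if<!is_gpu<Space>::value, std::string>::type
choose_policy() {
  return "RangePolicy";
}

// Selected for GPU-like spaces only.
template <class Space>
typename std::enable_if<is_gpu<Space>::value, std::string>::type
choose_policy() {
  return "TeamPolicy";
}
```

A runtime `if` inside a single function template would compile both bodies for every space, which is the redundant instantiation the commit removes.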

Added recoloring changes from Zoltan2's distributed coloring

a89832c

Add fast two-level mode N GEMV

a8e4ef5

Very close to cublas in performance, with LayoutLeft or LayoutRight. (~100 gflops for 1e5 * 1e5 double matrix on V100)

Replace View[] with View(), count madd as 2 flops

48664f4

Counting flops that way is consistent with the rest of Kokkos Kernels.
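
As a small worked example of that convention: a dense GEMV on an m x n matrix performs one multiply-add per matrix entry, so it is counted as 2*m*n flops. The helper names below are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>

// Each multiply-add counts as 2 flops, so y = A*x with an m x n
// matrix is counted as 2*m*n flops.
inline double gemv_flops(double m, double n) { return 2.0 * m * n; }

// Convert a flop count and an elapsed time in seconds to GFlop/s.
inline double gflops(double flops, double seconds) {
  assert(seconds > 0.0);
  return flops / seconds / 1e9;
}
```

For instance, a 1000 x 1000 GEMV is counted as 2e6 flops, so completing it in 1e-4 seconds corresponds to 20 GFlop/s.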

Merge pull request kokkos#907 from dalg24/disable_cxx_extensions

80226e6

Fixup HIP nightly builds

Merge pull request kokkos#937 from brian-kelley/SpmvCodeReduction

3cd285e

Prevent redundant spmv kernel instantiations (reduce library size)

Merge pull request kokkos#906 from lucbv/unit_test_refactor

fc55227

unit-test: refactor infrastructure to remove most *.cpp

OpenMPTarget: adding ETI and CMake logic for OpenMPTarget backend

a2fc040

This only turns on ETI, not tests yet. The next step will be to add support in utility functions and macros to choose between host and device code branches.

openmp_target: fixing some CMake syntax that got scrambled during

5486f95

rebase Nothing too big, but adding obviously missing ENDIF() controls that got removed during a rebase...

unit-tests: adding logic specific to openmptarget backend

c060a70

Adding the openmptarget folder that contains the source files for the backend's unit-tests. Ignoring Vector batched tests that only run on host. Also commenting out the batched tests as they do not build correctly at the moment.

lucbv and others added 26 commits May 5, 2021 09:05

deprecation: a deprecated function is called in the SpADD perf_test

077faae

Replacing the call to the offending function with the new function call that is not from the Impl namespace.

Merge pull request kokkos#954 from lucbv/Fix_deprecated_call

24764eb

Deprecation: a deprecated function is called in the SpADD perf_test

Remove unused variable ORDINAL_MAX

1f1acb5

Merge pull request kokkos#955 from ndellingwood/fix-werror

74c71c5

Remove unused variable ORDINAL_MAX

Fix invalid mem accesses in new GEMV kernel

4c3c090

This was silently succeeding on Volta but failed on Pascal. Running under cuda-memcheck on Volta showed the issue.

Add a space between pointer * and C-style comment

710d66d

This was confusing Github syntax highlighting

Make CRS sorting utils work with unmanaged

15315a7

(Fixing kokkos#960)

Fix -Werror=deprecated errors with c++20 standard

6ee0205

Removes "error: implicit capture of ‘this’ via ‘[=]’ is deprecated in C++20"
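
For context, C++20 deprecates capturing `this` implicitly through `[=]`, so a lambda in a member function has to say how it gets at the object. The sketch below shows two common ways to avoid the diagnostic. The `Scaler` type is hypothetical, and the PR description does not say which fix the commit actually applied.

```cpp
#include <functional>

struct Scaler {
  double factor;

  // Deprecated in C++20: [=] here would capture `this` implicitly.
  //   return [=](double x) { return factor * x; };

  // Option 1: name `this` explicitly; valid in C++11 and later.
  std::function<double(double)> by_this() {
    return [this](double x) { return factor * x; };
  }

  // Option 2: copy the member into a local so the lambda does not
  // depend on the lifetime of the object.
  std::function<double(double)> by_copy() const {
    const double f = factor;
    return [f](double x) { return f * x; };
  }
};
```

Option 2 is the safer choice when the lambda may outlive the object, since it holds a copy of the value and not a pointer to the object.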

Warnings: fixing spaces and indentation

b2b2ead

I just ran clang-format on the badly formatted files. I removed function arguments from `main(int, argv**)` as they are really not justified and also added space between pointers and comments to make github happy.

Merge pull request kokkos#964 from ndellingwood/fix-cpp20

1ac152b

Fix -Werror=deprecated errors with c++20 standard

Fix gemv beta=0: overwrite NaNs

3b01af5

Warnings: fix unused parameters for CUDA only code branches

eeb2d79

This finishes fixing the warnings related to unused parameters in CUDA-specific code branches.

Warnings: fixing a small problem introduced by previous commit

464eaf2

Small update: since the parameter is commented out, it should no longer be referred to in the non-CUDA path...

HIP: enabling tests to check behavior with latest software stack

2a8ed48

Merge pull request kokkos#968 from lucbv/HIP_enable_tests

d5adc57

HIP: enabling all unit tests

Merge pull request kokkos#962 from lucbv/remove_unused_parameters

6239511

Warnings: remove -Wunused-parameter warnings in Kokkos Kernels

Cleanup/consistency for GEMV and its test

2fdca40

- Handle beta zero vs nonzero branch consistently in gemv
- Move "vanilla GEMV" out of test and into TestUtils, since vanilla GEMM is there
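
The beta handling can be illustrated with a small reference GEMV. This is a sketch under the usual BLAS convention that `beta == 0` means `y` is overwritten, not scaled, so NaN or Inf values already in `y` do not propagate. It matches the intent of the earlier "Fix gemv beta=0: overwrite NaNs" commit. The function name and the row-major `std::vector` storage are assumptions for illustration, not the TestUtils implementation.

```cpp
#include <cstddef>
#include <vector>

// Reference y = alpha*A*x + beta*y for a row-major m x n matrix
// (mode N). Hypothetical stand-in, not the Kokkos Kernels routine.
inline void vanilla_gemv(double alpha, const std::vector<double>& A,
                         const std::vector<double>& x, double beta,
                         std::vector<double>& y) {
  const std::size_t m = y.size();
  const std::size_t n = x.size();
  for (std::size_t i = 0; i < m; ++i) {
    double sum = 0.0;
    for (std::size_t j = 0; j < n; ++j) sum += A[i * n + j] * x[j];
    // With beta == 0, overwrite y: computing 0 * y[i] would keep a
    // NaN that was already stored in y.
    y[i] = (beta == 0.0) ? alpha * sum : alpha * sum + beta * y[i];
  }
}
```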

Merge pull request kokkos#961 from brian-kelley/FixGEMV

40a1bcc

Fix invalid mem accesses in new GEMV kernel

Merge pull request kokkos#963 from brian-kelley/FixSortingUnmanaged

ebbba8e

Make CRS sorting utils work with unmanaged

Create new interface for spmv and BlockCrsMatrix. Separate implementa…

97d29ae

…tion in new file. Add example.

Merge pull request #13 from fnrizzi/nga/10-interface-spmv

eca5398

Create new interface for spmv

Disabled failing market samples

fc81a91

#9 Tests refactored + CSV output

a551148

#9 Add alpha/beta test cases

a9a92a4

mzuzek closed this Jun 15, 2021

mzuzek deleted the nga/9-test-refactoring branch June 15, 2021 16:21

mzuzek restored the nga/9-test-refactoring branch June 15, 2021 16:24

kokkos-devops-admin mentioned this pull request May 22, 2024

Fix no return warnings #2203

Closed
