DESCRIPTION
These are the benchmarks from the IEEE paper "Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations".
The paper is in the main directory (ieee2017gpustencil.pdf).
The package contains:
a. the GPU benchmarks generated by StencilGen
b. scripts for benchmarking on a Pascal device
DEPENDENCIES
We tested the framework on Ubuntu 16.04 and Red Hat Enterprise Linux Server release 6.7 using
Pascal P100 and Volta V100 cards, with GCC 5.3.0 and NVCC 9.0. The following are software requirements
for the framework:
1. cmake >= 3.8
2. GCC >= 4.9
3. NVCC 9.0
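The requirements above can be checked before building. The following is a minimal sketch, not part of the package: `version_ge`, `report`, and `check_toolchain` are hypothetical helpers, the comparison relies on GNU `sort -V`, and treating NVCC 9.0 as a minimum is an assumption (the list above names 9.0 only as the tested version).

```shell
#!/usr/bin/env bash
# Sketch: compare installed tool versions against the listed requirements.
# All function names here are hypothetical, not part of the package.

# Succeed if version $1 >= version $2 (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print one status line for a tool: name, detected version, required minimum.
report() {
    local name="$1" found="$2" min="$3"
    if [ -z "$found" ]; then
        echo "$name: MISSING (need >= $min)"
    elif version_ge "$found" "$min"; then
        echo "$name: OK ($found >= $min)"
    else
        echo "$name: TOO OLD ($found < $min)"
    fi
}

check_toolchain() {
    report cmake "$(cmake --version 2>/dev/null | head -n 1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)" 3.8
    report gcc "$(gcc -dumpversion 2>/dev/null)" 4.9
    # ASSUMPTION: 9.0 is treated as a minimum; only 9.0 itself was tested.
    report nvcc "$(nvcc --version 2>/dev/null | grep -oE 'release [0-9]+\.[0-9]+' | awk '{print $2}')" 9.0
}

check_toolchain
```

A tool that is not on the PATH is reported as MISSING rather than aborting the check.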
STEPS TO INSTALL
1. We present single- and double-precision benchmarks for Pascal P100. The single-precision benchmarks are in folder 'dsl-float', and the double-precision benchmarks are in folder 'dsl-double'.
2. Go to either 'dsl-float' or 'dsl-double', and
2a. set the paths to NVCC and NVPROF in setup.sh
2b. run './nvcc-run.sh' in the directory. This invokes the 'run.sh' script within each benchmark.
3. To run the code on Volta, change the compute capability to 'sm_70' in each run.sh.
We plan to simplify the scripts in the future.
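Step 3 can be applied to all benchmarks at once rather than by editing each run.sh by hand. The following is a minimal sketch under a stated assumption: each run.sh names the Pascal target as the literal string 'sm_60' (the P100's compute capability). `retarget_volta` is a hypothetical helper, not part of the package, so inspect one run.sh first to confirm how the target is spelled.

```shell
#!/usr/bin/env bash
# Sketch: retarget every benchmark's run.sh from Pascal to Volta.
# ASSUMPTION: each run.sh contains the literal string 'sm_60'.
# Other spellings (e.g. 'compute_60') are left untouched.

# retarget_volta is a hypothetical helper, not part of the package.
# Usage: retarget_volta <benchmark-root>   (e.g. dsl-float or dsl-double)
retarget_volta() {
    local root="$1"
    # Edit in place and keep a run.sh.bak copy of each original script.
    find "$root" -name run.sh -type f -exec sed -i.bak 's/sm_60/sm_70/g' {} +
}
```

For example, `retarget_volta dsl-float` followed by `./nvcc-run.sh` inside 'dsl-float' would build for Volta. The run.sh.bak copies allow reverting to the Pascal settings.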
MORE INFORMATION
For more information or questions, contact the authors at <rawat.15@osu.edu> or <vaidya.56@osu.edu>.