Commit

Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
khuck committed Aug 5, 2021
2 parents 88dbe49 + 0ca7285 commit 25313ca
Showing 15 changed files with 217 additions and 95 deletions.
22 changes: 16 additions & 6 deletions README.md
@@ -51,18 +51,18 @@ APEX provides a mechanism for dynamic runtime behavior, either for autotuning or
Documentation
=============

Full user documentation is available here: http://khuck.github.io/xpress-apex.
Full user documentation is available here: http://uo-oaciss.github.io/apex.

The source code is instrumented with Doxygen comments, and the API reference manual can be generated by executing `make doc` in the build directory, after CMake configuration. [A fairly recent version of the API reference documentation is also available here] (http://www.nic.uoregon.edu/~khuck/apex_docs/doc/html/index.html).

Installation
============

[Full installation documentation is available here] (http://khuck.github.io/xpress-apex). Below is a quickstart for the impatient...
[Full installation documentation is available here] (http://uo-oaciss.github.io/apex). Below is a quickstart for the impatient...

Please Note:
------------
*These instructions are for building the stand-alone APEX library. For instructions on building APEX with HPX, please see [http://khuck.github.io/xpress-apex/usage](http://khuck.github.io/xpress-apex/usage)*
*These instructions are for building the stand-alone APEX library. For instructions on building APEX with HPX, please see [http://uo-oaciss.github.io/apex/usage](http://uo-oaciss.github.io/apex/usage)*


To build APEX stand-alone (to use with OpenMP, OpenACC, CUDA, Kokkos, TBB, C++ threads, etc.) do the following:
@@ -141,6 +141,11 @@ HPX5 (Indiana University)

HPX-5 (High Performance ParalleX) is a second implementation of the ParalleX model. Developed and maintained by the CREST Group at Indiana University, HPX-5 is implemented in C. For more information, see [https://hpx.crest.iu.edu](https://hpx.crest.iu.edu).

Pthreads / C++ Threads
----------------------

POSIX.1 specifies a set of interfaces (functions, header files) for threaded programming commonly known as POSIX threads, or Pthreads. A single process can contain multiple threads, all of which are executing the same program. These threads share the same global memory (data and heap segments), but each thread has its own stack (automatic variables). C++ threads are a language portable abstraction on top of native threading implementations. APEX supports pthreads by wrapping and capturing the `pthread_create` function call. For more information, see [https://man7.org/linux/man-pages/man7/pthreads.7.html](https://man7.org/linux/man-pages/man7/pthreads.7.html) and [https://www.cplusplus.com/reference/thread/thread/](https://www.cplusplus.com/reference/thread/thread/).
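
As a minimal sketch of the memory model described above (not APEX code): the example below uses `std::thread`, which common Linux standard libraries implement on top of `pthread_create`, so it is the kind of call a `pthread_create` wrapper would observe. The names `shared_counter`, `worker`, and `run_workers` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Global (data segment): visible to every thread in the process.
std::atomic<int> shared_counter{0};

// Each thread runs this function on its own stack, so `local`
// is private to the calling thread.
void worker(int iterations) {
    int local = 0;  // automatic variable, lives on this thread's stack
    for (int i = 0; i < iterations; ++i) {
        ++local;
    }
    shared_counter.fetch_add(local);  // publish the result into shared memory
}

// Launch `num_threads` workers, wait for all of them, and return the total.
int run_workers(int num_threads, int iterations) {
    shared_counter.store(0);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back(worker, iterations);
    }
    for (auto& th : threads) {
        th.join();
    }
    return shared_counter.load();
}
```

Calling `run_workers(4, 1000)` returns 4000: each thread counts privately on its own stack and then adds its result to the one shared global. On most toolchains this needs to be compiled with thread support enabled (for example `-pthread` with GCC or Clang).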

OpenMP
------

@@ -149,17 +154,22 @@ The OpenMP API supports multi-platform shared-memory parallel programming in C/C
OpenACC
-------

OpenACC is a user-driven directive-based performance-portable parallel programming model. It is designed for scientists and engineers interested in porting their codes to a wide-variety of heterogeneous HPC hardware platforms and architectures with significantly less programming effort than required with a low-level model. The OpenACC specification supports C, C++, Fortran programming languages and multiple hardware architectures including X86 & POWER CPUs, and NVIDIA GPUs.
OpenACC is a user-driven directive-based performance-portable parallel programming model. It is designed for scientists and engineers interested in porting their codes to a wide-variety of heterogeneous HPC hardware platforms and architectures with significantly less programming effort than required with a low-level model. The OpenACC specification supports C, C++, Fortran programming languages and multiple hardware architectures including X86 & POWER CPUs, and NVIDIA GPUs. For more information, see [https://www.openacc.org](https://www.openacc.org).

Kokkos
------

Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources. It currently can use CUDA, HPX, OpenMP and Pthreads as backend programming models with several other backends in development.
Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources. It currently can use CUDA, HIP, HPX, OpenMP and Pthreads as backend programming models with several other backends in development. For more information, see [https://kokkos.org](https://kokkos.org).

CUDA
----

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs. APEX uses the CUPTI and NVML libraries provided by NVIDIA to gather performance information from the GPUs. For more information, see [https://developer.nvidia.com/cupti](https://developer.nvidia.com/cupti) and [https://developer.nvidia.com/nvidia-management-library-nvml](https://developer.nvidia.com/nvidia-management-library-nvml).

HIP/ROCm
--------

Heterogeneous-Computing Interface for Portability (HIP) is a C++ dialect from AMD designed to ease conversion of CUDA applications to portable C++ code. It provides a C-style API and a C++ kernel language. The C++ interface can use templates and classes across the host/kernel boundary. APEX uses the roctracer library to gather performance information from the GPUs. For more information, see [https://github.com/ROCm-Developer-Tools/roctracer](https://github.com/ROCm-Developer-Tools/roctracer).

References
==========
10 changes: 5 additions & 5 deletions cmake/Modules/FindACTIVEHARMONY.cmake
@@ -31,7 +31,7 @@ mark_as_advanced(ACTIVEHARMONY_INCLUDE_DIR ACTIVEHARMONY_LIBRARY)

# --------- DOWNLOAD AND BUILD THE EXTERNAL PROJECT! ------------ #
if((APEX_BUILD_ACTIVEHARMONY OR (NOT ACTIVEHARMONY_FOUND)) AND NOT APPLE)
set(CACHE ACTIVEHARMONY_ROOT ${CMAKE_INSTALL_PREFIX} STRING "Active Harmony Root directory")
set(CACHE ACTIVEHARMONY_ROOT ${CMAKE_INSTALL_PREFIX}/ah STRING "Active Harmony Root directory")
message("Attention: Downloading and Building ActiveHarmony as external project!")
message(INFO " A working internet connection is required!")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fPIC")
@@ -41,17 +41,17 @@ if((APEX_BUILD_ACTIVEHARMONY OR (NOT ACTIVEHARMONY_FOUND)) AND NOT APPLE)
PREFIX ${CMAKE_CURRENT_BINARY_DIR}/activeharmony-4.6.0
CONFIGURE_COMMAND ""
BUILD_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/activeharmony-4.6.0/src/project_activeharmony && make MPICC=mpicc_disabled CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_C_FLAGS}
INSTALL_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/activeharmony-4.6.0/src/project_activeharmony && make MPICC=mpicc_disabled CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_C_FLAGS} install prefix=${CMAKE_INSTALL_PREFIX}
INSTALL_DIR ${CMAKE_INSTALL_PREFIX}
INSTALL_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/activeharmony-4.6.0/src/project_activeharmony && make MPICC=mpicc_disabled CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_C_FLAGS} install prefix=${CMAKE_INSTALL_PREFIX}/ah
INSTALL_DIR ${CMAKE_INSTALL_PREFIX}/ah
LOG_DOWNLOAD 1
# LOG_CONFIGURE 1
# LOG_BUILD 1
# LOG_INSTALL 1
)
set(ACTIVEHARMONY_ROOT ${CMAKE_INSTALL_PREFIX})
set(ACTIVEHARMONY_ROOT ${CMAKE_INSTALL_PREFIX}/ah)
#ExternalProject_Get_Property(project_activeharmony install_dir)
add_library(harmony STATIC IMPORTED)
set_property(TARGET harmony PROPERTY IMPORTED_LOCATION ${CMAKE_INSTALL_PREFIX}/lib/libharmony.a)
set_property(TARGET harmony PROPERTY IMPORTED_LOCATION ${CMAKE_INSTALL_PREFIX}/ah/lib/libharmony.a)
set(ACTIVEHARMONY_INCLUDE_DIR "${ACTIVEHARMONY_ROOT}/include")
set(ACTIVEHARMONY_LIBRARY "${ACTIVEHARMONY_ROOT}/lib/libharmony.a")
# handle the QUIETLY and REQUIRED arguments and set ACTIVEHARMONY_FOUND to TRUE
20 changes: 13 additions & 7 deletions cmake/Modules/FindBFD.cmake
@@ -18,8 +18,8 @@ pkg_check_modules(PC_BFD QUIET BFD)
set(BFD_DEFINITIONS ${PC_BFD_CFLAGS_OTHER})

find_path(BFD_INCLUDE_DIR bfd.h
HINTS ${BFD_ROOT}/include
${PC_BFD_INCLUDEDIR}
HINTS ${BFD_ROOT}/include
${PC_BFD_INCLUDEDIR}
${PC_BFD_INCLUDE_DIRS}
PATH_SUFFIXES BFD )

@@ -29,8 +29,8 @@ if ($TMP_PATH)
endif()
find_library(BFD_LIBRARY NAMES bfd
HINTS ${BFD_ROOT}/lib ${BFD_ROOT}/lib64
${PC_BFD_LIBDIR}
${PC_BFD_LIBRARY_DIRS}
${PC_BFD_LIBDIR}
${PC_BFD_LIBRARY_DIRS}
${LD_LIBRARY_PATH_STR})

include(FindPackageHandleStandardArgs)
@@ -49,7 +49,7 @@ if((APEX_BUILD_BFD OR (NOT BFD_FOUND)) AND NOT APPLE)
ExternalProject_Add(project_binutils
URL "http://ftp.gnu.org/gnu/binutils/binutils-2.25.tar.bz2"
URL_HASH SHA256=22defc65cfa3ef2a3395faaea75d6331c6e62ea5dfacfed3e2ec17b08c882923
CONFIGURE_COMMAND <SOURCE_DIR>/configure CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_EXE_LINKER_FLAGS} --prefix=${CMAKE_INSTALL_PREFIX} --disable-dependency-tracking --enable-interwork --disable-multilib --enable-shared --enable-64-bit-bfd --target=${TARGET_ARCH} --enable-install-libiberty
CONFIGURE_COMMAND <SOURCE_DIR>/configure CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_EXE_LINKER_FLAGS} --prefix=${CMAKE_INSTALL_PREFIX}/binutils --disable-dependency-tracking --enable-interwork --disable-multilib --enable-shared --enable-64-bit-bfd --target=${TARGET_ARCH} --enable-install-libiberty
BUILD_COMMAND make MAKEINFO=true -j${MAKEJOBS}
INSTALL_COMMAND make MAKEINFO=true install
LOG_DOWNLOAD 1
@@ -59,11 +59,17 @@ if((APEX_BUILD_BFD OR (NOT BFD_FOUND)) AND NOT APPLE)
)
ExternalProject_Add_Step(project_binutils basedirs
DEPENDEES install
COMMAND cp <SOURCE_DIR>/include/demangle.h ${CMAKE_INSTALL_PREFIX}/include/.
COMMAND cp <SOURCE_DIR>/include/demangle.h ${CMAKE_INSTALL_PREFIX}/binutils/include/.
COMMENT "Copying additional headers"
)
ExternalProject_Add_Step(project_binutils basedirs2
DEPENDEES install
COMMAND cp <SOURCE_DIR>/include/demangle.h ${CMAKE_INSTALL_PREFIX}/binutils/include/.
COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_INSTALL_PREFIX}/binutils/lib ${CMAKE_INSTALL_PREFIX}/binutils/lib64
COMMENT "Adding lib64 simlink"
)

set(BFD_ROOT ${CMAKE_INSTALL_PREFIX})
set(BFD_ROOT ${CMAKE_INSTALL_PREFIX}/binutils)
ExternalProject_Get_Property(project_binutils install_dir)
add_library(bfd STATIC IMPORTED)
set_property(TARGET bfd PROPERTY IMPORTED_LOCATION ${install_dir}/lib/libbfd.so)
9 changes: 4 additions & 5 deletions cmake/Modules/FindOMPT.cmake
@@ -66,7 +66,7 @@ endif()

# --------- DOWNLOAD AND BUILD THE EXTERNAL PROJECT! ------------ #
if(APEX_BUILD_OMPT OR (NOT OMPT_FOUND))
set(CACHE OMPT_ROOT ${CMAKE_INSTALL_PREFIX} STRING "OMPT Root directory")
set(CACHE OMPT_ROOT ${CMAKE_INSTALL_PREFIX}/ompt STRING "OMPT Root directory")
message("Attention: Downloading and Building OMPT as external project!")
message(INFO " A working internet connection is required!")
include(ExternalProject)
@@ -75,19 +75,18 @@ if(APEX_BUILD_OMPT OR (NOT OMPT_FOUND))
#URL http://www.cs.uoregon.edu/research/paracomp/tau/tauprofile/dist/LLVM-openmp-2021-05-14.tar.gz
URL http://tau.uoregon.edu/LLVM-openmp-2021-05-14.tar.gz
PREFIX ${CMAKE_CURRENT_BINARY_DIR}/LLVM-ompt-5.0
CONFIGURE_COMMAND cmake -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER} -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER} -DCMAKE_INSTALL_PREFIX=${CMAKE_INSTALL_PREFIX} -DCMAKE_BUILD_TYPE=Release ${APEX_OMPT_EXTRA_CONFIG} ../project_ompt
CONFIGURE_COMMAND cmake -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER} -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER} -DCMAKE_INSTALL_PREFIX=${OMPT_ROOT} -DCMAKE_BUILD_TYPE=Release ${APEX_OMPT_EXTRA_CONFIG} ../project_ompt
BUILD_COMMAND make libomp-needed-headers all
INSTALL_COMMAND make install
INSTALL_DIR ${CMAKE_INSTALL_PREFIX}
INSTALL_DIR ${OMPT_ROOT}
LOG_DOWNLOAD 1
LOG_CONFIGURE 1
LOG_BUILD 1
LOG_INSTALL 1
)
set(OMPT_ROOT ${CMAKE_INSTALL_PREFIX})
#ExternalProject_Get_Property(project_ompt install_dir)
add_library(omp SHARED IMPORTED)
set_property(TARGET omp PROPERTY IMPORTED_LOCATION ${CMAKE_INSTALL_PREFIX}/lib/libomp.so)
set_property(TARGET omp PROPERTY IMPORTED_LOCATION ${OMPT_ROOT}/lib/libomp.so)
set(OMPT_INCLUDE_DIR "${OMPT_ROOT}/include")
set(OMPT_LIBRARY "${OMPT_ROOT}/lib/libomp.so")
# handle the QUIETLY and REQUIRED arguments and set OMPT_FOUND to TRUE
12 changes: 6 additions & 6 deletions cmake/Modules/FindOTF2.cmake
@@ -33,28 +33,28 @@ mark_as_advanced(OTF2_INCLUDE_DIR OTF2_LIBRARY)

# --------- DOWNLOAD AND BUILD THE EXTERNAL PROJECT! ------------ #
if(APEX_BUILD_OTF2 OR (NOT OTF2_FOUND))
set(CACHE OTF2_ROOT ${CMAKE_INSTALL_PREFIX} STRING "OTF2 Root directory")
set(CACHE OTF2_ROOT ${CMAKE_INSTALL_PREFIX}/otf2 STRING "OTF2 Root directory")
message("Attention: Downloading and Building OTF2 as external project!")
message(INFO " A working internet connection is required!")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fPIC")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fPIC")
include(ExternalProject)
ExternalProject_Add(project_otf2
URL http://www.vi-hps.org/upload/packages/otf2/otf2-2.0.tar.gz
URL https://www.vi-hps.org/cms/upload/packages/otf2/otf2-2.0.tar.gz
PREFIX ${CMAKE_CURRENT_BINARY_DIR}/otf2-2.0
CONFIGURE_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/otf2-2.0/src/project_otf2 && ./configure CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_EXE_LINKER_FLAGS} --prefix=${CMAKE_INSTALL_PREFIX} --enable-shared
CONFIGURE_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/otf2-2.0/src/project_otf2 && ./configure CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER} CFLAGS=${CMAKE_C_FLAGS} CXXFLAGS=${CMAKE_CXX_FLAGS} LDFLAGS=${CMAKE_EXE_LINKER_FLAGS} --prefix=${CMAKE_INSTALL_PREFIX}/otf2 --enable-shared
BUILD_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/otf2-2.0/src/project_otf2 && make
INSTALL_COMMAND cd ${CMAKE_CURRENT_BINARY_DIR}/otf2-2.0/src/project_otf2 && make install
INSTALL_DIR ${CMAKE_INSTALL_PREFIX}
INSTALL_DIR ${CMAKE_INSTALL_PREFIX}/otf2
LOG_DOWNLOAD 1
LOG_CONFIGURE 1
LOG_BUILD 1
LOG_INSTALL 1
)
set(OTF2_ROOT ${CMAKE_INSTALL_PREFIX})
set(OTF2_ROOT ${CMAKE_INSTALL_PREFIX}/otf2)
#ExternalProject_Get_Property(project_otf2 install_dir)
add_library(otf2 STATIC IMPORTED)
set_property(TARGET otf2 PROPERTY IMPORTED_LOCATION ${CMAKE_INSTALL_PREFIX}/lib/libotf2.a)
set_property(TARGET otf2 PROPERTY IMPORTED_LOCATION ${CMAKE_INSTALL_PREFIX}/otf2/lib/libotf2.a)
set(OTF2_INCLUDE_DIR "${OTF2_ROOT}/include")
set(OTF2_LIBRARY "${OTF2_ROOT}/lib/libotf2.a")
# handle the QUIETLY and REQUIRED arguments and set OTF2_FOUND to TRUE
2 changes: 1 addition & 1 deletion src/apex/Kokkos_Profiling_C_Interface.h
@@ -54,7 +54,7 @@
#include <stdbool.h>
#endif

#define KOKKOSP_INTERFACE_VERSION 20210225
#define KOKKOSP_INTERFACE_VERSION 20210623

// Profiling

9 changes: 9 additions & 0 deletions src/apex/apex_kokkos.cpp
@@ -27,6 +27,7 @@
#include <unordered_map>
#include <stdlib.h>
#include "apex.hpp"
#include "Kokkos_Profiling_C_Interface.h"

/*
static std::mutex memory_mtx;
@@ -74,6 +75,14 @@ void kokkosp_finalize_library() {
apex::finalize();
}

/* This is a new function to tell Kokkos to not fence */
void kokkosp_request_tool_settings(int num_actions,
struct Kokkos_Tools_ToolSettings *settings) {
if ((num_actions > 0) && (settings != nullptr)) {
settings->requires_global_fencing = apex::apex_options::use_kokkos_profiling_fences();
}
}
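
A stand-alone sketch of the guard logic in the callback above, for reading without the Kokkos or APEX headers. `ToolSettingsSketch` is a hypothetical stand-in for `Kokkos_Tools_ToolSettings` that models only the `requires_global_fencing` field used in this commit, and `use_fences_option` is a hypothetical replacement for `apex::apex_options::use_kokkos_profiling_fences()`. Neither is the real Kokkos or APEX definition, and the default value of the field is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for Kokkos_Tools_ToolSettings; only the field
// touched by the commit is modeled. The default of `true` is an assumption.
struct ToolSettingsSketch {
    bool requires_global_fencing = true;
};

// Hypothetical stand-in for the APEX runtime option that the commit reads.
static bool use_fences_option = false;

// Mirrors the guard in the commit: write to the settings only when the
// caller passed at least one action and a non-null pointer.
void request_tool_settings_sketch(int num_actions, ToolSettingsSketch* settings) {
    if ((num_actions > 0) && (settings != nullptr)) {
        settings->requires_global_fencing = use_fences_option;
    }
}
```

With `use_fences_option` set to `false`, a call with `num_actions = 1` and a valid pointer clears `requires_global_fencing`. A call with `num_actions = 0` or a null pointer leaves the settings untouched, which is the case the guard protects against.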

/* These functions are called before their respective parallel constructs
* execute (Kokkos::parallel_for, Kokkos::parallel_reduce,
* Kokkos::parallel_scan). The name argument is the name given by the user