Updating documentation for 2.6.4 release
khuck committed Feb 7, 2024
1 parent 146cacc commit ad9e12c
Showing 14 changed files with 418 additions and 220 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ project (APEX CXX C)
set (APEX_DESCRIPTION "Autonomic Performance Environment for eXascale" CACHE STRING "APEX project description")
set (APEX_VERSION_MAJOR 2 CACHE STRING "APEX Major Version")
set (APEX_VERSION_MINOR 6 CACHE STRING "APEX Minor Version")
set (APEX_VERSION_PATCH 3 CACHE STRING "APEX Patch Version")
set (APEX_VERSION_PATCH 4 CACHE STRING "APEX Patch Version")
set (APEX_HOMEPAGE_URL "http://github.com/UO-OACISS/apex" CACHE STRING "APEX homepage URL")

cmake_policy(VERSION 2.8.12)
2 changes: 1 addition & 1 deletion cmake/Modules/APEX_DefaultOptions.cmake
@@ -121,7 +121,7 @@ option (APEX_WITH_OTF2 "Enable Open Trace Format 2 (OTF2) support" FALSE)
option (APEX_WITH_PAPI "Enable PAPI support" FALSE)
option (APEX_WITH_PERFETTO "Enable native Perfetto trace support" FALSE)
option (APEX_WITH_PHIPROF "Enable APEX PhiProf support" FALSE)
option (APEX_WITH_PLUGINS "Enable APEX policy plugin support" FALSE)
option (APEX_WITH_PLUGINS "Enable APEX policy plugin support" TRUE)
option (APEX_WITH_STARPU "Enable APEX StarPU support" FALSE)
option (APEX_WITH_TCMALLOC "Enable TCMalloc heap management" FALSE)
option (APEX_USE_PEDANTIC "Enable pedantic compiler flags" FALSE)
2 changes: 1 addition & 1 deletion doc/Doxyfile.in
@@ -38,7 +38,7 @@ PROJECT_NAME = "Autonomic Performance Environment for eXascale (APEX)"
# could be handy for archiving the generated documentation or if some version
# control system is used.

PROJECT_NUMBER = 2.6.3
PROJECT_NUMBER = 2.6.4

# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a
100 changes: 67 additions & 33 deletions doc/webdocs/docs/environment.md
@@ -14,6 +14,7 @@ APEX_VARIABLE2=value
```

To generate a default APEX configuration file in the current working directory, run the `./install/bin/apex_make_default_config` program.
To get a list of all known environment variables, run the `./install/bin/apex_environment_help` program.

| Environment Variable | Default Value | Valid Values | Description |
| -------------------- | -- | -- | -------------------------------- |
@@ -73,37 +74,70 @@ Usage:
apex_exec <APEX options> executable <executable options>
where APEX options are zero or more of:
--apex:help show this usage message
--apex:debug run with APEX in debugger
--apex:verbose enable verbose list of APEX environment variables
--apex:screen enable screen text output (on by default)
--apex:quiet disable screen text output
--apex:csv enable csv text output
--apex:tau enable tau profile output
--apex:taskgraph enable taskgraph output
(graphviz required for post-processing)
--apex:otf2 enable OTF2 trace output
--apex:otf2path specify location of OTF2 archive
(default: ./OTF2_archive)
--apex:otf2name specify name of OTF2 file (default: APEX)
--apex:gtrace enable Google Trace Events output
--apex:scatter enable scatterplot output
(python required for post-processing)
--apex:openacc enable OpenACC support
--apex:kokkos enable Kokkos support
--apex:raja enable RAJA support
--apex:pthread enable pthread wrapper support
--apex:untied enable tasks to migrate cores/OS threads
during execution (not compatible with trace output)
--apex:cuda_counters enable CUDA/CUPTI counter support
--apex:cuda_driver enable CUDA driver API callbacks
--apex:cuda_details enable per-kernel statistics where available
--apex:cpuinfo enable sampling of /proc/cpuinfo (Linux only)
--apex:meminfo enable sampling of /proc/meminfo (Linux only)
--apex:net enable sampling of /proc/net/dev (Linux only)
--apex:status enable sampling of /proc/self/status (Linux only)
--apex:io enable sampling of /proc/self/io (Linux only)
--apex:period specify frequency of OS/HW sampling
--apex:ompt_simple only enable OpenMP Tools required events
--apex:ompt_details enable all OpenMP Tools events
--apex:help show this usage message
--apex:debug run with APEX in debugger
--apex:verbose enable verbose list of APEX environment variables
--apex:screen enable screen text output (on by default)
--apex:screen-detail enable detailed text output (off by default)
--apex:quiet disable screen text output
--apex:final-output-only only output performance data at exit (ignore intermediate dump calls)
--apex:csv enable csv text output
--apex:tau enable tau profile output
--apex:taskgraph enable taskgraph output
(graphviz required for post-processing)
--apex:tasktree enable tasktree output
(python3 with Pandas required for post-processing)
--apex:hatchet enable Hatchet tasktree output
(python3 with Hatchet required for post-processing)
--apex:concur Periodically sample thread activity (default: off)
--apex:concur-max Max timers to track with concurrency activity (default: 5)
--apex:concur-period <value> Frequency of concurrency sampling, in microseconds
(default: 1000000)
--apex:throttle throttle short-lived timers to reduce overhead (default: off)
--apex:throttle-calls <value> minimum number of calls before throttling (default: 1000)
--apex:throttle-per <value> minimum timer duration in microseconds (default: 10)
--apex:otf2 enable OTF2 trace output (requires --apex:mpi with MPI configurations)
--apex:otf2path <value> specify location of OTF2 archive
(default: ./OTF2_archive)
--apex:otf2name <value> specify name of OTF2 file (default: APEX)
--apex:gtrace enable Google Trace Events output (deprecated)
--apex:pftrace enable Perfetto Trace output
--apex:scatter enable scatterplot output
(python required for post-processing)
--apex:openacc enable OpenACC support
--apex:kokkos enable Kokkos support
--apex:kokkos-tuning enable Kokkos runtime autotuning support
--apex:kokkos-fence enable Kokkos fences for async kernels
--apex:raja enable RAJA support
--apex:pthread enable pthread wrapper support
--apex:gpu-memory enable GPU memory wrapper support
--apex:cpu-memory enable CPU memory wrapper support
--apex:untied enable tasks to migrate cores/OS threads
during execution (not compatible with trace output)
--apex:cuda enable CUDA/CUPTI measurement (default: off)
--apex:cuda-counters enable CUDA/CUPTI counter support (default: off)
--apex:cuda-driver enable CUDA driver API callbacks (default: off)
--apex:cuda-details enable per-kernel statistics where available (default: off)
--apex:hip enable HIP/ROCTracer measurement (default: off)
--apex:hip-metrics enable HIP/ROCProfiler metric support (default: off)
--apex:hip-counters enable HIP/ROCTracer counter support (default: off)
--apex:hip-driver enable HIP/ROCTracer KSA driver API callbacks (default: off)
--apex:hip-details enable per-kernel statistics where available (default: off)
--apex:monitor-gpu enable GPU monitoring services (CUDA NVML, ROCm SMI)
--apex:level0 enable OneAPI Level0 measurement (default: off)
--apex:cpuinfo enable sampling of /proc/cpuinfo (Linux only)
--apex:meminfo enable sampling of /proc/meminfo (Linux only)
--apex:net enable sampling of /proc/net/dev (Linux only)
--apex:status enable sampling of /proc/self/status (Linux only)
--apex:io enable sampling of /proc/self/io (Linux only)
--apex:period <value> specify frequency of OS/HW sampling
--apex:mpi enable MPI profiling (required for OTF2 support with MPI configurations)
--apex:ompt enable OpenMP profiling (requires runtime support)
--apex:ompt-simple only enable OpenMP Tools required events
--apex:ompt-details enable all OpenMP Tools events
--apex:source resolve function, file and line info for address lookups with binutils
(default: function only)
--apex:preload <lib> extra libraries to load with LD_PRELOAD _before_ APEX libraries
(LD_PRELOAD value is added _after_ APEX libraries)
--apex:postprocess run post-process scripts (graphviz, python) on output data after exit
```
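The usage text above can be turned into a concrete invocation. The following is a minimal sketch, not part of APEX, that assembles an `apex_exec` command line from flags listed in the usage message. The helper name `build_apex_command`, the executable `./my_app`, and its arguments are hypothetical, and the sketch assumes that options shown with `<value>` take their value as the next space-separated argument.

```python
import shlex


def build_apex_command(executable, exe_args=(), flags=(), valued=None):
    """Assemble an argv list: apex_exec, APEX options, then the program."""
    argv = ["apex_exec"]
    # Boolean switches such as --apex:csv take no argument.
    for flag in flags:
        argv.append(f"--apex:{flag}")
    # Options listed with <value> in the usage text are assumed to take
    # one space-separated argument.
    for name, value in (valued or {}).items():
        argv.extend([f"--apex:{name}", str(value)])
    # APEX options come first; everything after the executable belongs to it.
    argv.append(executable)
    argv.extend(exe_args)
    return argv


cmd = build_apex_command(
    "./my_app",                    # hypothetical executable
    exe_args=["--size", "1024"],   # hypothetical program arguments
    flags=["csv", "tasktree", "throttle"],
    valued={"throttle-calls": 500},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a system where `apex_exec` is installed and on the `PATH`.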
12 changes: 9 additions & 3 deletions doc/webdocs/docs/feature.md
@@ -49,14 +49,20 @@ APEX provides both *performance awareness* and *performance adaptation*.
* **Kokkos** - Using the Kokkos profiling interface, APEX can capture performance data related to Kokkos parallel abstractions.
* **RAJA** - Using the RAJA profiling interface, APEX can capture performance data related to RAJA parallel abstractions. Unlike Kokkos, RAJA doesn't give any details, so don't expect much.
* **CUDA** - Using the NVIDIA CUPTI and NVML libraries, APEX can capture runtime and driver API calls as well as memory transfers and kernels executed on a device, and monitor GPU utilization.
* **Distributed Execution over MPI** - While APEX doesn't measure MPI function calls, it is "MPI-aware", and can detect when used in a distributed run so that each process can write separate or aggregated performance data.
* **HIP** - Using the AMD Roctracer, Rocprofiler and ROCM-SMI libraries, APEX can capture runtime and driver API calls as well as memory transfers and kernels executed on a device, and monitor GPU utilization.
* **Intel SYCL** - Using the Intel Level0 libraries, APEX can capture runtime and driver API calls as well as memory transfers and kernels executed on a device, and monitor GPU utilization.
* **PhiProf** - APEX is integrated with support to intercept PhiProf profiling data. See <https://github.com/fmihpc/phiprof>.
* **StarPU** - APEX is integrated with support to profile StarPU. See <https://starpu.gitlabpages.inria.fr>.
* **Distributed Execution over MPI** - While APEX doesn't measure all MPI function calls, it is "MPI-aware", and can detect when used in a distributed run so that each process can write separate or aggregated performance data. APEX provides rudimentary support for measuring point-to-point and collectives.

## Parallel Models with Experimental Support / In Development / Wish List

* **Argobots** - APEX has been used to instrument services based on Argobots, but it is not integrated into the runtime.
* **TBB** - The APEX team is evaluating integrated TBB support.
* **Legion** - No plans at this time, but open to exploring it.
* **Legion** - No plans at this time.
* **Charm++** - No plans at this time.
* **Iris** - Plans are afoot. Stay tuned.
* **YAKL** - Plans are afoot. Stay tuned.

## Introspection

@@ -123,7 +129,7 @@ The **OTF2 listener** will construct a full event trace and write the events out

### Google Trace Event Listener

The **Trace Event listener** will construct a full event trace and write the events to one or more [Google Trace Event](https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit#) trace files. The files can be visualized with the Google Chrome web browser, by navigating to the `chrome://tracing` URL. Other tools can be used to visualize or analyze traces, like [Catapult](https://chromium.googlesource.com/catapult).
The **Trace Event listener** will construct a full event trace and write the events to one or more [Google Trace Event](https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit#) trace files. The files can be visualized with the Google Chrome web browser, by navigating to the <https://ui.perfetto.dev> URL. <!-- Other tools can be used to visualize or analyze traces, like [Catapult](https://chromium.googlesource.com/catapult). -->
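For readers unfamiliar with the format, the sketch below builds a tiny trace in the JSON object form of the Google Trace Event format, using two "complete" (`"ph": "X"`) events with timestamps in microseconds. It only illustrates the general file shape. The event names and the helper `complete_event` are hypothetical, and the files APEX actually writes may contain different fields.

```python
import json


def complete_event(name, start_us, dur_us, pid=0, tid=0):
    """Return one 'complete' event: a named span with a start and a duration."""
    return {
        "name": name,
        "ph": "X",        # phase 'X' marks a complete (begin plus duration) event
        "ts": start_us,   # start timestamp in microseconds
        "dur": dur_us,    # duration in microseconds
        "pid": pid,
        "tid": tid,
    }


trace = {
    "traceEvents": [
        complete_event("outer_task", 0, 100),   # hypothetical timer names
        complete_event("inner_task", 10, 50),
    ]
}
text = json.dumps(trace, indent=2)
print(text)
```

Saving `text` to a `.json` file gives a small trace that a Trace Event viewer should be able to open.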

## Policy Listener

7 changes: 5 additions & 2 deletions doc/webdocs/docs/index.md
@@ -1,5 +1,5 @@
![image](img/logo-cropped.png "APEX")
<!--![image](https://github.com/khuck/xpress-apex/raw/master/doc/logo-cropped.png "APEX") -->
<!--![image](https://github.com/UO-OACISS/apex/raw/master/doc/logo-cropped.png "APEX") -->

# APEX: Autonomic Performance Environment for eXascale

@@ -11,7 +11,10 @@ In short, APEX is an introspection and runtime adaptation library for asynchrono
APEX provides an API for measuring actions within a runtime. The API includes methods for timer start/stop, as well as sampled counter values. APEX is designed to be integrated into a runtime, library and/or application and provide performance introspection for the purpose of runtime adaptation. While APEX *can* provide rudimentary post-mortem performance analysis measurement, there are many other performance measurement tools that perform that task more robustly (such as TAU <http://tau.uoregon.edu>). That said, APEX includes an event listener that integrates with the TAU measurement system, so APEX events can be forwarded to TAU and collected in a TAU profile and/or trace to be used for post-mortem performance analysis.

## Runtime Adaptation
APEX provides a mechanism for dynamic runtime behavior, either for autotuning or adaptation to a changing environment. The infrastructure that provides the adaptation is the *Policy Engine*, which executes policies either periodically or triggered by events. The policies have access to the performance state as observed by the APEX introspection API. APEX is integrated with Active Harmony <http://www.dyninst.org/harmony> to provide dynamic search for autotuning.
APEX provides a mechanism for dynamic runtime behavior, either for autotuning or adaptation to a changing environment. The infrastructure that provides the adaptation is the *Policy Engine*, which executes policies either periodically or triggered by events. The policies have access to the performance state as observed by the APEX introspection API. APEX has several built-in search strategies, including exhaustive, random, simulated annealing, and hill climbing. APEX is also integrated with Active Harmony <http://www.dyninst.org/harmony> to provide dynamic search using the Nelder-Mead algorithm.
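To give a feel for one of the search strategies named above, here is a generic hill-climbing sketch over a single integer tuning parameter. It is not APEX's implementation and does not use the APEX API. The function names, the parameter (`chunk_size`), and the synthetic cost function standing in for a measured timer value are all assumptions made for illustration.

```python
def hill_climb(cost, start, lo, hi, max_steps=100):
    """Move to the better neighbor until no neighbor improves the cost."""
    current = start
    current_cost = cost(current)
    for _ in range(max_steps):
        # Neighbors are the adjacent integer settings inside [lo, hi].
        neighbors = [n for n in (current - 1, current + 1) if lo <= n <= hi]
        best = min(neighbors, key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:
            break  # local minimum: neither neighbor is an improvement
        current, current_cost = best, best_cost
    return current, current_cost


def measured_time(chunk_size):
    """Synthetic stand-in for an observed timer value (lower is better)."""
    return (chunk_size - 12) ** 2 + 3.0


best_setting, best_time = hill_climb(measured_time, start=2, lo=1, hi=32)
print(best_setting, best_time)
```

In a real policy the cost would come from observed performance rather than a formula, so repeated evaluations would be noisy and a setting might need to be sampled several times before comparing.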

## Citing APEX
Please use the following citation: <https://doi.org/10.1109/ESPM256814.2022.00008>

## References & APEX-related Publications
1. <a name="fn1"></a> Thomas Sterling, Daniel Kogler, Matthew Anderson, and Maciej Brodowicz. "SLOWER: A performance model for Exascale computing". *Supercomputing Frontiers and Innovations*, 1:42–57, September 2014. <http://superfri.org/superfri/article/view/10>