Releases: ROCm/omnitrace
v1.7.2
What's Changed
- Update containers workflow by @jrmadsen in #182
- Fix perfetto debug annotations of function parameters by @jrmadsen in #181
- Disable HSA API and activity by default by @jrmadsen in #183
- Dramatic improvement of post-processing critical trace data by @jrmadsen in #185
- OMNITRACE_ROCTRACER_DISCARD_INVALID=N by @jrmadsen in #186
- Fix LD_PRELOAD by @jrmadsen in #184
Full Changelog: v1.7.1...v1.7.2
v1.7.1: ROCm 5.3 and Ubuntu 22.04 Support
v1.7.0: omnitrace-sample
What's Changed
- Fix gotcha indexes for numa_gotcha by @jrmadsen in #161
- Bump version to 1.7.0 by @jrmadsen in #162
- Update timemory submodule by @jrmadsen in #160
- Fix building w/ hip, etc. but w/o rocprofiler by @jrmadsen in #159
- Crusher hackathon updates by @jrmadsen in #164
- Fix deadlocks during initialization by @jrmadsen in #167
- Fix spack builds when no ROCm .info/version files by @jrmadsen in #170
- Support for libbfd (binary file descriptor) by @jrmadsen in #168
- Resolve warnings/errors with extra warnings by @jrmadsen in #171
- omnitrace-sample by @jrmadsen in #169
- Raise default min number of instructions by @jrmadsen in #173
- Fix finalization segfaults by @jrmadsen in #174
Full Changelog: v1.6.0...v1.7.0
v1.6.0: NUMA, metadata, and improved perfetto arg data
v1.5.0: Sampling improvements, colorized logs
What's Changed
- Python noprofile by @jrmadsen in #138
- Python validation-external & builtin by @ratamima in #123
- Generic comm_data component by @jrmadsen in #132
- Static libstdcxx and python by @jrmadsen in #139
- Rework sampling and colorized logs by @jrmadsen in #140
- bump-version by @jrmadsen in #141
- Support sampling duration, sampling TIDs by @jrmadsen in #142
- Support tracing thread locks with perfetto by @jrmadsen in #143
Full Changelog: v1.4.0...v1.5.0
v1.4.0
What's Changed
- Fix RPATH handling by @jrmadsen in #122
- RPATH to rocprofiler_LIBRARY_DIR for ROCm < v5.2 by @jrmadsen in #126
- Advanced category for configuration options by @jrmadsen in #125
- Update config generation, join fix, sampler by @jrmadsen in #129
- offset thread ids where possible by @jrmadsen in #130
- Remove unused funcs + messages for excluding system lib by @jrmadsen in #133
- Fix some inconsistencies in debug messages w/in category_region by @jrmadsen in #135
- Verbose messages based on ROCP_ONLOAD_TRACE env by @jrmadsen in #131
- Update python install + build-tree setup by @jrmadsen in #128
- OMNITRACE_TRACE_THREAD_SPIN_LOCKS config by @jrmadsen in #134
- Enable TRACE_THREAD_RW_LOCKS and TRACE_THREAD_SPIN_LOCKS by default by @jrmadsen in #136
- Handle --advanced printing for config generation by @jrmadsen in #137
Full Changelog: v1.3.1...v1.4.0
v1.3.1: Fix ROCP_METRICS for ROCm 5.2
What's Changed
- ROCm environment fixes + workflow updates by @jrmadsen in #117
- Fix uploading release assets by @jrmadsen in #118
Full Changelog: v1.3.0...v1.3.1
v1.3.0: GPU HW Counters, RCCL, ROCTx, Python User Regions
Notable New Features
- Support for collecting GPU HW counters #84
  - `OMNITRACE_ROCM_EVENTS` configuration variable
- Support for ROCTx #87
  - `OMNITRACE_USE_ROCTX` configuration variable
- ROCm Collective Communication Library (RCCL) Support #93
  - `OMNITRACE_USE_RCCLP` configuration variable
- Python User API #57
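The configuration variables listed above can be set as environment variables before launching an instrumented application. The following is a minimal sketch rather than anything taken from the release notes: the `ON` spelling and the counter names `GPUBusy` and `Wavefronts` are assumptions for illustration, and the counters actually available depend on the GPU and the ROCm version.

```shell
# Illustrative values only: the counter names and the ON spelling are assumptions.
export OMNITRACE_ROCM_EVENTS="GPUBusy,Wavefronts"   # GPU HW counters to collect (#84)
export OMNITRACE_USE_ROCTX=ON                       # enable ROCTx support (#87)
export OMNITRACE_USE_RCCLP=ON                       # enable RCCL support (#93)

# Print what was set so the configuration can be checked before a run.
env | grep '^OMNITRACE_' | sort
```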
What's Changed
- Fix empty OMNITRACE_CONFIG_FILE and suppressing config and parsing by @jrmadsen in #81
- pthread_rwlock deadlock fix by @jrmadsen in #82
- Improved sampling performance by @jrmadsen in #74
- Combine ubuntu-focal-external.yml and ubuntu-focal.yml by @jrmadsen in #83
- GPU HW Counters via rocprofiler by @jrmadsen in #84
- Fix statistics type and use feature name indexes by @jrmadsen in #85
- Unified setup_environ b/t libomni and libomni-dl by @jrmadsen in #86
- Support ACTIVITY_DOMAIN_ROCTX by @jrmadsen in #87
- Fixes missing call to mpi_gotcha::update() by @jrmadsen in #88
- Support for disabling perfetto categories by @jrmadsen in #72
- Remove get_perfetto_output_filename().clear() by @jrmadsen in #89
- fix omnitrace print-* with libraries by @jrmadsen in #94
- Sampling Tweaks: disable sampling itimer by @jrmadsen in #95
- Replaces OMNITRACE_CONDITIONAL_BASIC_PRINT with OMNITRACE_VERBOSE by @jrmadsen in #97
- omnitrace builds timemory with TIMEMORY_USE_ROOFLINE=0 by @jrmadsen in #96
- Updated features docs [skip ci] by @jrmadsen in #98
- Fix warnings + Werror by @jrmadsen in #101
- Sampling use SIGRTMIN + N signals by @jrmadsen in #104
- Increase build timeouts by @jrmadsen in #107
- Updated documentation for hardware counters by @jrmadsen in #108
- Pthread category region by @jrmadsen in #102
- Release 1.3.0 preparations by @jrmadsen in #109
- Added new tests validating gotcha wrappers by @jrmadsen in #105
- Fix reliability when KOKKOS_PROFILE_LIBRARY is set in env by @jrmadsen in #103
- exit gotcha + remove DelayedInit state + rocm-smi + cleanup by @jrmadsen in #110
- Docker + build-release.sh + PAPI.cmake by @jrmadsen in #111
- Fix PAPI cpack packaging by @jrmadsen in #112
- Minor fixes by @jrmadsen in #113
- ubuntu cpack was building for rocm 5.2 twice by @jrmadsen in #114
- RCCL support by @jrmadsen in #93
- Fix dockerfile.opensuse by @jrmadsen in #115
- User regions in Python by @jrmadsen in #57
Full Changelog: v1.2.0...v1.3.0
v1.2.0: Auto-generate configs, function args in Perfetto
Notable Changes
General
- Rework submodule installation by @jrmadsen in #70
  - This ensures that any/all vendored 3rd-party libraries are installed to an `omnitrace` subfolder in the lib directory, i.e. `<prefix>/lib/omnitrace/`
Bug Fixes
- Fixes excluded-instr output, fini functions, tweaks MPI by @jrmadsen in #51
- Fixes OMNITRACE_SUPPRESS_CONFIG handling by @jrmadsen in #53
- Fix attaching to running process, i.e. omnitrace -p by @jrmadsen in #60
Enhancements
- omnitrace-avail generate config by @jrmadsen in #69
- tracing NS + category region component + MPI args by @jrmadsen in #52
- HIP API args in perfetto + new perfetto categories by @jrmadsen in #76
Deprecations
- Rename OMNITRACE_ROCM_SMI_DEVICES to OMNITRACE_SAMPLING_GPUS by @jrmadsen in #58
- Rename OMNITRACE_USE_THREAD_SAMPLING to OMNITRACE_USE_PROCESS_SAMPLING by @jrmadsen in #68
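Existing job scripts that still export the old names can be bridged with a small shim. This is a sketch under stated assumptions: `migrate_env` is a hypothetical helper, not part of omnitrace, and the release notes do not say whether the old names are still honored, so the shim simply copies each old value to the new name when the new name is unset.

```shell
# Hypothetical helper: copy a value from a deprecated variable name to its
# replacement, unless the replacement is already set. POSIX sh compatible.
migrate_env() {
    old_name="$1"
    new_name="$2"
    # Look up both variables by name.
    eval "old_value=\${$old_name:-}"
    eval "new_value=\${$new_name:-}"
    if [ -n "$old_value" ] && [ -z "$new_value" ]; then
        export "$new_name=$old_value"
        echo "note: $old_name is deprecated; exported $new_name instead" >&2
    fi
}

# The two renames listed under Deprecations (#58 and #68).
migrate_env OMNITRACE_ROCM_SMI_DEVICES OMNITRACE_SAMPLING_GPUS
migrate_env OMNITRACE_USE_THREAD_SAMPLING OMNITRACE_USE_PROCESS_SAMPLING
```

An explicitly set new name always wins, so the shim is safe to leave in place after scripts have been updated.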
What's Changed
- Fixes the configuration file example by @jrmadsen in #45
- CI for OpenSUSE by @jrmadsen in #12
- Fixes excluded-instr output, fini functions, tweaks MPI by @jrmadsen in #50
- Fixes excluded-instr output, fini functions, tweaks MPI by @jrmadsen in #51
- Define new function attributes by @jrmadsen in #55
- Inclusive range for OMNITRACE_SAMPLING_CPUS by @jrmadsen in #54
- Fixes OMNITRACE_SUPPRESS_CONFIG handling by @jrmadsen in #53
- Remove reliance on MPI_Comm_rank by @jrmadsen in #56
- Fix find_path in omnitrace-dl by @jrmadsen in #59
- Improved the determination of MPI rank by @jrmadsen in #61
- Fix attaching to running process, i.e. omnitrace -p by @jrmadsen in #60
- Rename OMNITRACE_ROCM_SMI_DEVICES to OMNITRACE_SAMPLING_GPUS by @jrmadsen in #58
- Update PTL submodule by @jrmadsen in #63
- libomnitrace uses common headers by @jrmadsen in #62
- Update timemory submodule by @jrmadsen in #64
- Update dyninst submodule by @jrmadsen in #65
- adding perfetto-validation-script by @ratamima in #66
- Rename OMNITRACE_USE_THREAD_SAMPLING to OMNITRACE_USE_PROCESS_SAMPLING by @jrmadsen in #68
- tracing NS + category region component + MPI args by @jrmadsen in #52
- Fix PID resolution + OMNITRACE_VERSION + fix various configs by @jrmadsen in #71
- omnitrace-avail generate config by @jrmadsen in #69
- Rework submodule installation by @jrmadsen in #70
- Fix docs using -D instead of -G by @jrmadsen in #73
- Adds test which validates errors for missing configs by @jrmadsen in #75
- Use concurrency in GitHub Actions + remove cancelling by @jrmadsen in #77
- HIP API args in perfetto + new perfetto categories by @jrmadsen in #76
- Handle OMNITRACE_ENABLED + minor updates by @jrmadsen in #78
New Contributors
Full Changelog: v1.1.1...v1.2.0
Instructions for installing binary releases
See the documentation here to determine whether your OS supports installation via the pre-built installation scripts. It is possible to use these scripts on similar Linux flavors, e.g. the Ubuntu 20.04 (focal fossa) installer is compatible with Debian 11 (bullseye/sid).
- Download the binary for your OS with the desired ROCm compatibility (if any)
  - The supported Python versions for all installers in this release are 3.6, 3.7, 3.8, 3.9, and 3.10
- Install the dependencies as needed (if not already installed)
  - All binary installers require installing OpenMP, e.g., `apt-get install libgomp1` on Ubuntu
  - Packages with ROCm in the name require installing ROCm: instructions can be found here
  - PAPI and OMPT support have no runtime dependencies and thus require no additional installations.
  - All installations have partial MPI support
  - If you do not have Python installed on your system and do not intend to use the Python capabilities, installing Python is not necessary.
- Create the installation directory for omnitrace, e.g. `mkdir /opt/omnitrace`
- Run the installer script (see example below)
  - Recommendation: use the `--exclude-subdir` option
- Set up the environment via `setup-env.sh` or environment-modules
  a. Source the `setup-env.sh` script in `<prefix>/share/omnitrace`, e.g. `source /opt/omnitrace/share/omnitrace/setup-env.sh`
  b. `module use <prefix>/share/modulefiles` and `module load omnitrace/1.2.0`
- Verify that `which omnitrace` and `which omnitrace-avail` return `<prefix>/bin/omnitrace` and `<prefix>/bin/omnitrace-avail`
Example for omnitrace-1.2.0-ubuntu-20.04-ROCm-50000-PAPI-OMPT-Python3.sh
$ mkdir /opt/omnitrace
$ ./omnitrace-1.2.0-ubuntu-20.04-ROCm-50000-PAPI-OMPT-Python3.sh --prefix=/opt/omnitrace --skip-license --exclude-subdir
omnitrace Installer Version: 1.2.0, Copyright (c) Advanced Micro Devices, Inc.
This is a self-extracting archive.
The archive will be extracted to: /opt/omnitrace
Using target directory: /opt/omnitrace
Extracting, please wait...
Unpacking finished successfully
$ source /opt/omnitrace/share/omnitrace/setup-env.sh
$ which omnitrace
/opt/omnitrace/bin/omnitrace
$ which omnitrace-avail
/opt/omnitrace/bin/omnitrace-avail
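The final verification step can be scripted so that it fails loudly when `PATH` still points at a different installation. This is a minimal sketch: `check_tool` is a hypothetical helper, and the default `PREFIX` of `/opt/omnitrace` is taken from the example above.

```shell
# Hypothetical helper: report whether a command resolves to <prefix>/bin/<name>.
check_tool() {
    prefix="$1"
    name="$2"
    # command -v prints the path the shell would execute, or nothing if not found.
    resolved="$(command -v "$name" 2>/dev/null || true)"
    if [ "$resolved" = "$prefix/bin/$name" ]; then
        echo "ok: $name -> $resolved"
    else
        echo "mismatch: $name -> ${resolved:-not found} (expected $prefix/bin/$name)"
        return 1
    fi
}

PREFIX="${PREFIX:-/opt/omnitrace}"
for tool in omnitrace omnitrace-avail; do
    check_tool "$PREFIX" "$tool" || true
done
```

Run it after sourcing `setup-env.sh` or loading the module; a `mismatch` line usually means the environment setup step was skipped in the current shell.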
Enabling CPU Hardware Counters
In order to enable collecting CPU hardware counters, the value of `/proc/sys/kernel/perf_event_paranoid` may need to be changed. The default value is 2. To update `/proc/sys/kernel/perf_event_paranoid`, run:
echo <VALUE> | sudo tee /proc/sys/kernel/perf_event_paranoid
| Value | CPU Hardware Counter Capabilities |
|---|---|
| -1 | Allow use of (almost) all events by all users. Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK |
| >=0 | Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN. Disallow raw tracepoint access by users without CAP_SYS_ADMIN |
| >=1 | Disallow CPU event access by users without CAP_SYS_ADMIN |
| >=2 | Disallow kernel profiling by users without CAP_SYS_ADMIN |
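Before changing the setting, it can help to see what the current value restricts. The sketch below encodes the table above as a cumulative check; `describe_paranoid` is a hypothetical helper written for illustration, and the script only reads the value, it does not modify it.

```shell
# Hypothetical helper: list what requires CAP_SYS_ADMIN at a given level,
# following the cumulative ">=" rows of the table above.
describe_paranoid() {
    level="$1"
    restricted=""
    if [ "$level" -ge 0 ]; then
        restricted="ftrace function tracepoints, raw tracepoints"
    fi
    if [ "$level" -ge 1 ]; then
        restricted="$restricted, CPU events"
    fi
    if [ "$level" -ge 2 ]; then
        restricted="$restricted, kernel profiling"
    fi
    echo "${restricted:-nothing (almost all events allowed)}"
}

paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    current="$(cat "$paranoid_file")"
    echo "perf_event_paranoid=$current"
    echo "requires CAP_SYS_ADMIN: $(describe_paranoid "$current")"
else
    echo "$paranoid_file is not readable on this system"
fi
```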