Releases: UO-OACISS/apex

Patch release v2.3.2

13 Apr 18:07

Patch release for bug fixes.

Commits in this release:

  • view commit • Updating documentation
  • view commit • Merge branch 'develop' of github.com:khuck/xpress-apex into develop
  • view commit • Checking for nvcc 10 and gcc 8 and setting flags accordingly
  • view commit • Adding periodic plugin example, enabling static global constructors and destructors
  • view commit • Adding pthread wrapper and screen_output to policy plugin example
  • view commit • Update README.md
  • view commit • Re-enabling ability to get vector of available profiles, updated periodic example
  • view commit • Don't pin threads by default, it's kind of broken on summit
  • view commit • Fixing HPX build due to static global constructor
  • view commit • Fixing bug #134. Changing from pthread_setaffinity_np() to sched_get/setaffinity()
  • view commit • Fixing issue #135. When tracking CPU/GPU activity, the memory allocation counters should be associated with the thread making the call when writing to OTF2 traces. This change allows for an optional argument to the apex::sample_value call that indicates whether the counter is associated with the specific thread or the process as a whole (the default).
  • view commit • Fixing #137. Now explicitly tracking all memory allocations and frees on both the host and the device.
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Re-enable pinning by default
  • view commit • Fixing #136. Now have the ability to capture task tree, not just graph. No more cycles!
  • view commit • Adding dependency_tree class
  • view commit • Fixing build errors for -std=c++11 compliance
  • view commit • Initial memory wrapper, bugs everywhere
  • view commit • Adding additional MPI rank detection support
  • view commit • Fixing build issue with HPX due to modified sample_value function
  • view commit • Fixing cuda 10.1 build errors.
  • view commit • Fixing gperftool config by finding correct include location
  • view commit • Fixing gperftool config by finding correct include location
  • view commit • Removing some high-overhead and useless counters
  • view commit • Working memory wrapper for malloc/free, removing pointers from name demangling due to instability
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Adding support for calloc and realloc
  • view commit • Fixing comment
  • view commit • Adding memory wrapper code for HPX configurations
  • view commit • Updating copyright to 2021
  • view commit • Fixing measurement output when dump is called multiple times.
  • view commit • Fixing tasktree processing for non-timers, adding to apex_exec script
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Fixing elapsed time in graphs and shortening timer names by not including full file name and path by default
  • view commit • Fixing concurrency handler static global variable
  • view commit • Fix HPX barriers in OTF2 output
  • view commit • Merge pull request #143 from severinstrobl/otf2_hpx_barriers
  • view commit • Enabling LLVM 11 to build cuda examples
  • view commit • Forgot to set profiler to "stopped" when adding async activity.
  • view commit • Removing APEX counters (llvm won't link them?)
  • view commit • Cleaning up timers. We had been using a custom clock in order to use rdtsc on Intel platforms, but that's kind of pointless. It becomes a nightmare when trying to convert for OTF2 traces, and CUDA (and other GPUs) only provide timestamps in nanoseconds. Therefore, all timing is assumed to be done in nanoseconds now.
  • view commit • Flush CUPTI before dumping.
  • view commit • Need to move forward declaration.
  • view commit • Only override the rank if we suspect it's wrong
  • view commit • Updating version number
  • view commit • Updating version number.
  • view commit • Merge branch 'develop'
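
The timer cleanup in the commit list above settles on nanoseconds as the single unit for all timing. A minimal sketch of what nanosecond timestamps can look like with the C++ standard library follows. It is not APEX's clock code: the helper names `now_ns` and `elapsed_ns` are hypothetical, and the choice of `std::chrono::steady_clock` is an assumption made for illustration.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: current time as integer nanoseconds.
// steady_clock is an assumption here; it is monotonic, so later
// readings are never smaller than earlier ones.
inline std::uint64_t now_ns() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(
            steady_clock::now().time_since_epoch()).count());
}

// Hypothetical helper: elapsed nanoseconds between two readings.
inline std::uint64_t elapsed_ns(std::uint64_t start, std::uint64_t stop) {
    return stop - start;
}
```

Keeping every timestamp as a plain nanosecond count means values from different sources (host timers, GPU activity records, trace writers) can be compared without a per-platform tick conversion.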

    Version 2.3.1 - Patch Release

    28 Jan 17:16

    This release patches CMake variables to make them consistent across the project, and includes bug fixes for OpenMP initial/implicit tasks and support for CUDA 11.

    Commits in this release:

    • view commit • Beginning the process of cleaning up the CMAKE config
    • view commit • Testing the CMake cleanup before merge with develop
    • view commit • Merge branch 'cmake_cleanup' into develop
    • view commit • Removing explicit check for cori or edison and replacing it with a check for Cray KNL
    • view commit • Fixing anonymous OpenMP regions, where appropriate. Implicit tasks, barriers and barrier wait events may or may not have a codeptr associated with them. For implicit tasks, use the codeptr of the parent. For barriers, don't do anything. An anonymous barrier usually means that the thread is idle between parallel regions.
    • view commit • Correcting support for CUDA 11 in NVML interface.
    • view commit • Changing USE_LM_SENSORS to APEX_WITH_LM_SENSORS
    • view commit • Fixing paths for lm sensors in HPX configs, and correcting cmake warnings.
    • view commit • Updating APEX version
    • view commit • Fixing demangle config for HPX

    2.3.0 Release

    08 Jan 17:15

    This release contains many bug fixes, and some new features. New features include:

  • Kokkos support
  • OpenACC profiling support
  • NVIDIA CUDA/CUPTI support
  • NVIDIA NVML support
  • RAJA support
  • Compiler-based instrumentation support
  • Additional /proc/self data
  • Disable RDTSC timer on `x86_64` architectures
  • Minimal MPI profiling support
  • HPX reduction for OTF2 event unification (when HPX networking enabled)
  • Ported to PGI, Intel compilers
  • Updated `apex_exec` script for parsing command line arguments
  • Event filtering
  • Documentation updates

    All of the commits for this release:

  • view commit • Adding new "pre-shutdown" event for listeners. The profiler_listener, otf2_listener and trace_event_listener all need to take a timestamp when the program is finished, but when CUPTI asynchronous processing has to happen, that can dilate the trace because the final timestamp doesn't get taken until long after the buffers are processed. Now, the timestamp is taken before the buffers are processed. All asynchronous background processing also needs to be disabled, so that there aren't new events in the trace after the last timestamp.
  • view commit • Adding kokkos support.
  • view commit • Porting to PGI on Summit
  • view commit • Adding kokkos support.
  • view commit • Fixing bug in memcpy activity. The stream ID wasn't getting captured, causing overlapping timers in the OTF2 trace.
  • view commit • Add MPI_Finalize wrapper. When configuring APEX with MPI support, wrap the MPI_Finalize function so that we can use MPI functions during OTF2 event unification instead of the filesystem.
  • view commit • Unify the final timestamp. At the end of execution, exchange final timestamps so that the OTF2 trace has an accurate final timestamp.
  • view commit • Don't finalize profiles if background stats not computed
  • view commit • Adding MPI to some CUDA examples to test the event unification support.
  • view commit • Debugging kokkos support on summit
  • view commit • Merge branch 'kokkos' of github.com:khuck/xpress-apex into kokkos
  • view commit • Merge branch 'develop' of github.com:khuck/xpress-apex into develop
  • view commit • Adding kokkos support.
  • view commit • Debugging kokkos support on summit
  • view commit • Allow HPX configs to disable RDTSC
  • view commit • Updating to renamed perfstubs API calls
  • view commit • Merge branch 'kokkos' of github.com:khuck/xpress-apex into kokkos
  • view commit • Fixing race conditions between processes when doing OTF2 event unification
  • view commit • Merge branch 'kokkos' into develop
  • view commit • Two changes: making Jupyter support a runtime option and updating some OMPT initialization. This is the beginning of the process of updating OMPT support to fully support OpenMP 5.0 including target directives.
  • view commit • Fixing task dependencies in OpenMP/OMPT
  • view commit • Check if OMPT was initialized before forcing shutdown
  • view commit • Merge branch 'master' into develop
  • view commit • Adding simple OpenMP test
  • view commit • Debugging with Intel 20 compiler. Still lots of shutdown problems.
  • view commit • Updates for OpenMP and OpenACC support. Target offloading with OpenACC to CUDA is now supported.
  • view commit • Adding NVML support
  • view commit • Working NVML support for utilization
  • view commit • Adding NVML find support for HPX configs
  • view commit • Fixing cmake error with nvml
  • view commit • Adding lots more NVML data. Clock, power, temp, PCIe throughput
  • view commit • Enabling /proc/self/status by default
  • view commit • Be smarter about which devices to monitor with NVML
  • view commit • Adding driver support for when changing devices, to make sure NVML is capturing the right device
  • view commit • Adding NVML nvlink statistics
  • view commit • Don't build openmp examples with compiler without openmp support
  • view commit • Merge branch 'master' into develop
  • view commit • Silly CMake bug
  • view commit • Removing contention in google event tracer, but still have to flush buffers occasionally.
  • view commit • Adding command line argument processing for apex_exec script
  • view commit • Fixing doxygen warning
  • view commit • Updating documentation to v2.2.0
  • view commit • Fixing label for host-allocated memory in Cuda
  • view commit • Always delete OTF2 archive if exists at startup
  • view commit • Updating copyright and documentation
  • view commit • Fixing support for std::unique_ptr with clang
  • view commit • Updating readthedocs documentation
  • view commit • removing debug...

    Version 2.2.0

    05 Aug 16:38

    This release contains many updates and fixes. Of note is new support for CUDA/CUPTI events, and the ability to detect MPI applications even when neither HPX nor APEX is configured with MPI support.

    Changes:

    • view commit • Change to personal fork of concurrentqueue for stability
    • view commit • Cleaning up clang pedantic errors
    • view commit • Tweaking build system to support Windows
    • view commit • Merge pull request #122 from STEllAR-GROUP/fixing_windows_support
    • view commit • Adding annotation for process_profiles task
    • view commit • Cleaning up the dot/graphviz output
    • view commit • Adding "untied timers" option. With this option enabled, a profiler can be started on one OS thread and stopped on another. APEX won't keep track of the profiler stack.
    • view commit • Fixing unit conversion when writing out TAU profiles
    • view commit • Add capture of /proc/self/status Threads value
    • view commit • Capture the number of OS context switches
    • view commit • Cleaning up thread swap test
    • view commit • Adding additional error messages to PAPI component support
    • view commit • Debugging PAPI error checking
    • view commit • Updating to support binutils 2.34 API changes, adding pthread.h include header where needed
    • view commit • Updating deprecated HPX headers
    • view commit • First step in adding CUDA support. Adding a CUDA example and adding CUDA/CUPTI headers through CMake.
    • view commit • Adding another cuda example
    • view commit • Working kernel measurement
    • view commit • Basic callback and activity support enabled
    • view commit • Done with initial implementation
    • view commit • Disable thread affinity for HPX configurations
    • view commit • Minor change to support running in MPI environment when MPI is not used by HPX or the APEX configuration. This happens when HPX is configured without a parcel port, and APEX thinks all ranks are 0. This change adds a check for MPI environment variables to validate the MPI rank that was passed in.
    • view commit • Adding MPI rank/size detection support for MPICH ...which also covers MVAPICH, Intel, Cray, etc. Also added some PBS/torque support, but unfortunately they don't provide an environment variable that specifies the total number of ranks. Maybe in the future we could have that be a special APEX environment variable that specifies the total number of ranks, if needed.
    • view commit • First step in adding CUDA support. Adding a CUDA example and adding CUDA/CUPTI headers through CMake.
    • view commit • Adding another cuda example
    • view commit • Working kernel measurement
    • view commit • Basic callback and activity support enabled
    • view commit • Done with initial implementation
    • view commit • Merge branch 'cuda_support' of github.com:khuck/xpress-apex into cuda_support
    • view commit • Adding CUDA task dependency support
    • view commit • Task dependency working! When GPU callbacks are made, we map the correlation ID to the task_wrapper associated with the parent. Then the GPU activity can be linked to the parent that launched it. Also added two more examples.
    • view commit • Working CUDA support with task graphs and correct annotations. This commit contains a nasty bug in task_identifier, where any identifier string gets modified "in place" when demangled. That can cause problems later if a map of said task_identifiers is modified. This will be merged to develop when the full support with tracing is merged.
    • view commit • Adding basic CUDA counters to the support for kernels and memory transfers.
    • view commit • Adding HPX config support for CUDA/CUPTI
    • view commit • Minor typo in HPX configuration
    • view commit • More changes for HPX support
    • view commit • Testing with cuda 10.1 and fixing config. Testing with older cuda revealed that some installations are different.
    • view commit • Fixing bugs in shutdown. During shutdown, the asynchronous buffers were processed but the static strings that some labels depended on went out of scope. So the strings got corrupted. This is fixed by using const char * strings instead of const std::string&. Also, the counters are way too much overhead, so they are now optional.
    • view commit • Adding Google Chrome trace event support
    • view commit • Working (rudimentary) Google Trace Event support. This support only handles timers, no counters (yet).
    • view commit • Merge branch 'chrome_trace_event' into develop
    • view commit • Fixing implementation of public profile processing function to work with gcc 8
    • view commit • Minor change to add cudart to the link
    • view commit • Merge branch 'cuda_support' of https://github.com/khuck/xpress-apex into cuda_support
    • view commit • Minor changes to CUDA support and Google trace. The Google trace support needs to be refactored, but otherwise this seems to be working.
    • view commit • Merge...
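
The MPI detection commits above describe checking launcher environment variables to validate a rank that was passed in. The sketch below shows one way such a check could be written. It is an illustration under stated assumptions, not APEX's implementation: `mpi_env_info`, `env_to_int` and `detect_mpi_env` are hypothetical names, and the variables probed (`OMPI_COMM_WORLD_RANK`/`OMPI_COMM_WORLD_SIZE` and `PMI_RANK`/`PMI_SIZE`) are simply ones commonly exported by Open MPI and MPICH-derived launchers. The exact set APEX checks is not stated in these notes.

```cpp
#include <cstdlib>

// Result of probing the environment; fields stay -1 when unknown.
struct mpi_env_info {
    int rank = -1;
    int size = -1;
};

// Hypothetical helper: parse a non-negative integer from an environment
// variable, returning -1 if it is unset, empty, or not a valid number.
inline int env_to_int(const char* name) {
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0') { return -1; }
    char* end = nullptr;
    long parsed = std::strtol(value, &end, 10);
    if (*end != '\0' || parsed < 0) { return -1; }
    return static_cast<int>(parsed);
}

// Probe launcher-specific variables in order and stop at the first
// launcher that reports a rank. The variable list is an assumption.
inline mpi_env_info detect_mpi_env() {
    static const char* const rank_vars[] = {"OMPI_COMM_WORLD_RANK", "PMI_RANK"};
    static const char* const size_vars[] = {"OMPI_COMM_WORLD_SIZE", "PMI_SIZE"};
    mpi_env_info info;
    for (int i = 0; i < 2; ++i) {
        int rank = env_to_int(rank_vars[i]);
        if (rank >= 0) {
            info.rank = rank;
            info.size = env_to_int(size_vars[i]);
            break;
        }
    }
    return info;
}
```

A caller that was handed rank 0 by a runtime without networking support could compare it against `detect_mpi_env().rank` and treat a mismatch as a sign that the passed-in rank is wrong.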

    Version 2.1.9

    22 Apr 14:35
    7402e50

    Bug fixes and updates to support changes in HPX.

    Bug fix and maintenance release, version 2.1.9

  • view commit • Adding spack and cmake to buildbot build process
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Initializing reset counter in profile constructor
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • should have tested before commit.
  • view commit • Fixing deadlock in policy shutdown, and segfault when profiles aren't processed before exit
  • view commit • Cleaning up the pedantic compiler flags and resolving pedantic compiler warnings.
  • view commit • Cleanup changes introduced a bug, this fixes it. The cleanup changes caused APEX to request HPX to schedule profile processing during shutdown, but unfortunately HPX has already stopped by then. Instead, force synchronous processing of remaining profile data from the on_dump() event.
  • view commit • Fixing parallel buildbot for HPX builds
  • view commit • Still can't build more than 4 wide on ktau
  • view commit • Changing HPX tasks from actions to regular hpx::async calls
  • view commit • should have used hpx::apply()
  • view commit • Use moodycamel queue from hpx::concurrency namespace
  • view commit • Merge pull request #121 from msimberg/moodycamel
  • view commit • Merge pull request #120 from khuck/master

    Version v2.1.8

    25 Mar 00:46

    Bug fixes and updates to support changes in HPX.

  • view commit • Fixing CSV output bug. Only node 0 was getting written.
  • view commit • Adding MPI OTF2 test to make sure event unification works correctly
  • view commit • Expanding the C++ demo to make it more useful
  • view commit • Fixing performance bug with a lock being held too long when processing profile objects from the queues.
  • view commit • Cleaning up apex::reset behavior
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Updating to latest perfstubs API
  • view commit • Fixing mismatched apex_init() declaration
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Adding counters to CSV output.
  • view commit • Do write out the APEX_MAIN timer at exit.
  • view commit • Fixing location of perfstubs git repo so that checkouts happen correctly
  • view commit • Fixing location of perfstubs headers
  • view commit • Fixing git checkout for good this time
  • view commit • Ignoring the perfstubs directory. git clean will wipe out the perfstubs directory without this change to .gitignore
  • view commit • Changing location of papi on test build system
  • view commit • Merge branch 'develop'
  • view commit • Updates to support fixes for HPX issue #4438
  • view commit • Merge pull request #119 from khuck/fixes_for_hpx_4441
  • view commit • Merge branch 'develop'

    v2.1.7 Release to sync up with HPX v1.4.0

    10 Dec 23:20

    Bug fixes and refactoring to support new HPX modularization effort. APEX is no longer called from anywhere in HPX, but APEX does still make HPX calls. The previous circular dependency has been refactored out. HPX now has an external_timer class that provides a plugin API that APEX registers at program load. When HPX runs, the external_timer class will make callbacks to the registered library (APEX).
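
The callback arrangement described above, where the runtime exposes a registration point and the tool fills it in at load time, can be sketched generically as follows. This is an illustration of the pattern only: the `runtime_side` namespace, `callback_table`, `register_callbacks`, `task_started` and `task_stopped` are hypothetical names and do not reflect the actual interface of HPX's external_timer class.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a runtime-side plugin API. Names and
// signatures are illustrative, not HPX's external_timer interface.
namespace runtime_side {

using timer_callback = std::function<void(const std::string&)>;

struct callback_table {
    timer_callback on_start;
    timer_callback on_stop;
};

// Single table owned by the runtime; empty until a tool registers.
inline callback_table& table() {
    static callback_table instance;
    return instance;
}

// Called by the measurement tool when it is loaded.
inline void register_callbacks(timer_callback start, timer_callback stop) {
    table().on_start = std::move(start);
    table().on_stop = std::move(stop);
}

// Called by the runtime around each task; no-ops if nothing registered.
inline void task_started(const std::string& name) {
    if (table().on_start) { table().on_start(name); }
}

inline void task_stopped(const std::string& name) {
    if (table().on_stop) { table().on_stop(name); }
}

}  // namespace runtime_side
```

With this shape the runtime never names the tool in its own code, so the build dependency runs in one direction only: the tool depends on the runtime's registration API, and the runtime works unchanged when no tool is present.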

    List of commits:

    v2.1.6

    13 Nov 21:22

    Refactoring to remove circular dependency between HPX and APEX. libhpx no longer calls APEX directly; calls are handled through a callback API.

  • view commit • Initial refactor to eliminate circular build dependency between APEX and HPX
  • view commit • Changing to explicit callback registrations for all events
  • view commit • Handle PAPI component read failures gracefully.
  • view commit • untangling circular dependency between APEX and HPX in cmake
  • view commit • Merge remote-tracking branch 'github/develop' into apex_callback_refactoring
  • view commit • fixing debug message
  • view commit • Restoring nested timers after yield/resume
  • view commit • Fixing scoped_thread and return from failed new_task
  • view commit • Merge branch 'apex_callback_refactoring' into develop
  • view commit • Splitting screen_output into verbose for environment variables

    Version 2.1.5, last release before HPX integration refactoring

    13 Nov 17:56

    Bug fixes and improvements to 2.1.4.
    This release contains bug fixes and changes for HPX support. This is also the last release before the HPX integration refactoring is merged.

    Change log from recent git commits:

  • view commit • Making concurrentqueue an external dependency
  • view commit • Removing concurrentqueue from code base. Using it as an external build dependency now
  • view commit • Merge branch 'concurrentqueue-update' into develop
  • view commit • updating for new release
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Adding PAPI RAPL power measurement
  • view commit • Adding PAPI RAPL power measurement
  • view commit • Adding NVML power measurement support from PAPI
  • view commit • Fixing compiler error from HPX with const_cast
  • view commit • updating 2.1.4 tag
  • view commit • Updating version
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Don't export the APEX library in HPX builds
  • view commit • Add hpx_assertion dependency
  • view commit • Fix build error on osx
  • view commit • Merge pull request #102 from msimberg/patch-2
  • view commit • Make sure the right branches are handled
  • view commit • Updating OMPT to OpenMP 5.0
  • view commit • Updating OMPT to OpenMP 5.0, testing with GCC 8
  • view commit • Updating .gitignore
  • view commit • updating buildbot ompt version
  • view commit • Add hardware module dependency + change header
  • view commit • Merge pull request #104 from aurianer/fix_hardware_header
  • view commit • Adding script to merge github pull request to origin
  • view commit • Build external apex, use of APEX_ROOT cmake var
  • view commit • Merge pull request #105 from aurianer/build_external_apex
  • view commit • Change include_directories to target_include_directories
  • view commit • Merge pull request #106 from aurianer/change_include_to_target_include
  • view commit • Initialize pd_reader correctly
  • view commit • Add HPX algorithms and hpx_format dependencies
  • view commit • Change include_directories to work with hpx target directives
  • view commit • Merge pull request #107 from aurianer/fix_modules_dependencies
  • view commit • Updating moved HPX header file location
  • view commit • Apparently NOEXPORT does nothing now, so add EXCLUDE_FROM_ALL
  • view commit • Updating HPX library dependencies
  • view commit • Updating list of hpx libraries.
  • view commit • Adding /proc/loadavg and changing to /proc/self/net
  • view commit • Adding lmsensors to HPX configuration
  • view commit • Adding lm_sensors support to HPX build
  • view commit • Adding all GPU component events from PAPI.
  • view commit • Adding lmsensors support to papi component.
  • view commit • Adding perfstubs implementation. Needs work.
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Fixed lmsensors papi component support.
  • view commit • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • view commit • Refactoring PAPI component work.
  • view commit • Updating perftool implementation
  • view commit • Fixing perftool_implementation
  • view commit • Fix more HPX modules dependencies and deprecated headers
  • view commit • Merge pull request #108 from aurianer/fix_modules_dependencies
  • view commit • Adding buildbot build for HPX to test APEX
  • view commit • Finishing test step for HPX buildbot script.
  • view commit • Delete the /dev/shm/hpx directory before build
  • view commit • Adding another HPX library dependency.
  • view commit • Updating test step for HPX builds
  • view commit • Updating to match changes in HPX.

    v2.1.4 Release

    30 May 21:36

    Minor change to update HPX build dependencies.