This repository has been archived by the owner on Jul 31, 2024. It is now read-only.
Releases: NERSC/timemory
macOS Python Fixes
Python tools submodule + various stability fixes
- Python gotcha fixes
- Fixed issues with mallocp segfaulting from Python
- Fixed storage merge() segfaulting
- New Python tools submodule (timemory.tools)
- tools.function_wrappers combines {start,stop}_{mpip,ompt,ncclp,mallocp} into one configurable handle and provides decorator + context-manager features
- New Python functions which are used within tools.function_wrappers
- timemory.start_function_wrappers
- timemory.stop_function_wrappers
- Fixed timemory-python-line-profiler script calling timemory.profiler
- API change in ring_buffer template
- read/write member functions return pointer to object read/written to instead of bytes
- API change in storage and tsettings
- Classes are declared as final to optimize any vtable calls
- Removed runtime_configurable restriction for do_enumerator_generate
- This enables user_bundles to be used again in Python
- Added operation::python_class_name
- Updated examples:
- ex_python_bindings (and libex_python_bindings)
- Fix to get_hash_identifier
- Removed concurrency comparison when generating a diff between two runs
- Fixed popen.cpp being guarded by TIMEMORY_WINDOWS, which was never defined
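The tools.function_wrappers entry above describes one configurable handle that works both as a decorator and as a context manager. The sketch below illustrates that general pattern in plain Python with `contextlib.ContextDecorator`. It does not import timemory: `WrapperHandle`, `_start`, `_stop`, and `ACTIVE` are hypothetical stand-ins chosen for illustration, not the library's actual API.

```python
from contextlib import ContextDecorator

# Hypothetical registry standing in for the library's internal state.
ACTIVE = set()


def _start(name):
    """Stand-in for a start_<tool> call: mark the tool as active."""
    ACTIVE.add(name)


def _stop(name):
    """Stand-in for a stop_<tool> call: mark the tool as inactive."""
    ACTIVE.discard(name)


class WrapperHandle(ContextDecorator):
    """Enable a configurable set of wrappers for the duration of a scope."""

    def __init__(self, **tools):
        # Keep only the tools that were switched on, e.g. mpi=True.
        self.tools = [name for name, enabled in tools.items() if enabled]

    def __enter__(self):
        for name in self.tools:
            _start(name)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Stop in reverse order; returning False lets exceptions propagate.
        for name in reversed(self.tools):
            _stop(name)
        return False


# Decorator form: the wrappers are active only while work() runs.
@WrapperHandle(mpi=True, malloc=True)
def work():
    return sorted(ACTIVE)


# Context-manager form: the wrappers are active only inside the block.
with WrapperHandle(ompt=True, nccl=False):
    inside = sorted(ACTIVE)
```

Deriving from `ContextDecorator` means the start/stop pairing is written once in `__enter__`/`__exit__` and reused by both forms, so the wrappers are stopped even if the wrapped code raises.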
pytimem fix + various build system improvements
- pytimem fix
- fix missing import of component_bundle and component_tuple
- added additional python tests
- Ability to build with static libraries: python bindings, mpip library, mallocp library, ompt library, ncclp library, KokkosP libraries
- Setting TIMEMORY_BUILD_PYTHON to OFF now results in searching for external pybind11 install
- Renamed some CMake files in cmake/Modules
- Updated caliper and gotcha submodules to support {CALIPER,GOTCHA}_INSTALL_{CONFIG,HEADER} options
- Added TIMEMORY_INSTALL_PYTHON option
- Fixed BUILD_STATIC_LIBS=ON + CMAKE_POSITION_INDEPENDENT_CODE=ON
- Fixed TIMEMORY_USE_CUDA=ON + TIMEMORY_REQUIRE_PACKAGES=ON to fail
- If TIMEMORY_REQUIRE_PACKAGES=OFF, search for packages first before adding submodule
- Extended setup.py to support more options and support non-development install (no headers or cmake config)
- Removed TIMEMORY_EMBED_PYTHON option
- Disable timemory-jump when no shared libraries are built since dlopen isn't possible
- Replaced allocator member functions construct, destroy, allocate, deallocate with calls to static functions of allocator traits
- added support for CMAKE_ARGS env variable in setup.py
- remove absolute rpath when SKBUILD/SPACK_BUILD (since these have staging directories)
- timemory-{c,cxx,fortran} alias libraries in build tree
- toggled python function profiler to not include line number by default
- This can cause strange results when generators are used
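The CMAKE_ARGS entry above says that setup.py accepts extra CMake arguments from the environment. A minimal sketch of how such a variable could be parsed is shown below. The function name `cmake_args_from_env` and the way defaults are combined with the extra arguments are assumptions made for illustration; the actual setup.py may handle this differently.

```python
import os
import shlex


def cmake_args_from_env(environ=None, defaults=()):
    """Return the default CMake arguments followed by those in CMAKE_ARGS."""
    environ = os.environ if environ is None else environ
    # shlex.split honors shell-style quoting, so a quoted value that
    # contains spaces stays a single argument.
    extra = shlex.split(environ.get("CMAKE_ARGS", ""))
    return list(defaults) + extra


# Example with an explicit mapping instead of the real environment.
example = cmake_args_from_env(
    {"CMAKE_ARGS": '-DTIMEMORY_USE_CUDA=ON -DCMAKE_CXX_FLAGS="-O2 -g"'},
    defaults=["-DTIMEMORY_BUILD_PYTHON=ON"],
)
```

Appending the environment arguments after the defaults lets a later `-D` definition override an earlier one, which is how CMake treats repeated cache entries on the command line.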
Compiler instrumentation + Fortran module + New tool libraries + NCCL support + NVML support + Python tracing + Hatchet + User Metadata + CUPTI PCSampling
- Numerous stability fixes
- Fortran module
- Compiler instrumentation
- NCCL support
- timemory-mallocp
- timemory-ncclp
- timemory-nvml
- Python line-by-line tracing
- I/O {read,write}_{char,bytes}
- Network stats components
- libunwind support
- CMake minimum upgraded to 3.15
- Type-traits for tree/flat/timeline
- Hierarchical serialization (hatchet support)
- Concepts
- Improved settings
- Python tracer (line-by-line)
- CTestNotes support
- Command-line options for settings
- Migrated cereal to internal (i.e. cereal:: -> tim::cereal::)
- Dramatically improved Windows support
- Improved kokkos support
- Command-line options
- Print help
- XML serialization support
- Shared caches for components
- Support for C++17 string_view
- Python bindings to storage classes
- Windows support for different CPU timers
- CUDA Cupti PCSampling support (CUDA v11+)
- User metadata
- Sampling support in opaque (i.e. within user-bundles)
- Static polymorphic base for bundlers
- Namespace re-organization
- CUDA compilation with Clang compiler
- Piecewise installation
- timem support md5sum hashing of command-line
- papi_threading setting
- is_invalid in base_state
- New operations: stack_push, stack_pop, insert, set_depth_change, set_is_flat, set_is_on_stack, set_is_invalid, set_iterator, get_is_flat, get_is_invalid, get_is_on_stack, get_depth, get_storage, get_iterator
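The timem entry above mentions md5sum hashing of the command line. As a rough illustration of the idea, the sketch below hashes an argument list with the standard `hashlib` module. How timem actually joins and encodes the arguments is not stated in these notes, so the use of `shlex.join` and UTF-8 here is an assumption, and `command_hash` is a hypothetical helper.

```python
import hashlib
import shlex


def command_hash(argv):
    """Return the MD5 hex digest of a command line given as a list."""
    # Assumption of this sketch: arguments are joined with shell quoting,
    # so ["a b"] and ["a", "b"] produce different strings.
    command = shlex.join(argv)
    return hashlib.md5(command.encode("utf-8")).hexdigest()


digest = command_hash(["./a.out", "--size", "100"])
```

A digest like this gives a short, stable identifier for a command, which can be useful for telling apart outputs from runs of the same executable with different arguments.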
New command-line tools, dynamic instrumentation, profiling libraries, python profiling, C++14 migration
- New command line tools
- timemory-run for (Linux) dynamic instrumentation support
- timemory-avail for component/settings/hw-counter availability
- timem-mpi for timem + MPI
- timemory-python-profiler for python profiling
- timemory-python-line-profiler for python line-by-line profiling
- New instrumentation libraries
- Kokkos profiling libraries
- MPI profiling libraries
- OpenMP profiling libraries
- New components
- CrayPAT components
- AllineaMap components
- Additional Caliper components
- papi_vector
- data_tracker for tracking values in application
- Aggregation of MPI/UPC++ per-process results
- New variadic bundlers: component_bundle, auto_bundle, lightweight_tuple
- Functional alternative to variadic bundlers
MT fix and integral_constant for priority
- Storage fix for MT
- Previously, when a thread had multiple entries at a depth of +1 from master bookmark, only the first subgraph from thread was merged into master (it did not appear to affect flat-profiles though)
- trait::start_priority<T> and trait::stop_priority<T> use integral_constant instead of true/false
- Updated copyright
Modularity Support
- Essentially re-written from scratch to support modularity
- This version supports the following "components" which can be assembled into a multiplexing measurement handle
COMPONENT |
---|
caliper |
cpu_clock |
cpu_roofline<Types...> |
cpu_util |
cuda_event |
cuda_profiler |
cupti_activity |
cupti_counters |
data_rss |
gotcha<size_t, Bundle, Diff> |
gperf_cpu_profiler |
gperf_heap_profiler |
gpu_roofline<Types...> |
likwid_nvmon |
likwid_perfmon |
monotonic_clock |
monotonic_raw_clock |
num_io_in |
num_io_out |
num_major_page_faults |
num_minor_page_faults |
num_msg_recv |
num_msg_sent |
num_signals |
num_swap |
nvtx_marker |
page_rss |
papi_array<size_t> |
papi_tuple<int...> |
peak_rss |
priority_context_switch |
process_cpu_clock |
process_cpu_util |
read_bytes |
stack_rss |
system_clock |
tau_marker |
thread_cpu_clock |
thread_cpu_util |
trip_count |
user_bundle<size_t, Tag> |
user_clock |
virtual_memory |
voluntary_context_switch |
vtune_event |
vtune_frame |
wall_clock |
written_bytes |
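The idea of assembling components into one multiplexing measurement handle can be sketched in a few lines of plain Python. This is a conceptual illustration only: it does not use timemory, and `WallClock`, `CpuClock`, and `Bundle` are hypothetical classes built on the standard `time` module, loosely named after the wall_clock and cpu_clock entries in the table.

```python
import time


class WallClock:
    """Measures elapsed real time between start() and stop()."""
    label = "wall_clock"

    def start(self):
        self._t0 = time.perf_counter()

    def stop(self):
        self.value = time.perf_counter() - self._t0


class CpuClock:
    """Measures process CPU time between start() and stop()."""
    label = "cpu_clock"

    def start(self):
        self._t0 = time.process_time()

    def stop(self):
        self.value = time.process_time() - self._t0


class Bundle:
    """One handle that forwards start/stop to every component it holds."""

    def __init__(self, *component_types):
        self.components = [ctype() for ctype in component_types]

    def start(self):
        for component in self.components:
            component.start()

    def stop(self):
        # Stop in reverse order so the first component started is stopped last.
        for component in reversed(self.components):
            component.stop()

    def get(self):
        return {c.label: c.value for c in self.components}


bundle = Bundle(WallClock, CpuClock)
bundle.start()
total = sum(i * i for i in range(100_000))  # region being measured
bundle.stop()
results = bundle.get()
```

The caller chooses which measurements to collect by choosing the component types, while the instrumented region only ever sees a single `start()`/`stop()` pair.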
Release v2.3.0
- Release v2.3.0
- This is a tag of the version of TiMemory before the string ABI modifications
Performance improvement + C interface + env control
- Significant performance improvement (~2x)
- new C interface for TiMemory
- requires variable assignment and freeing
- void* atimer = TIMEMORY_AUTO_TIMER("")
- FREE_TIMEMORY_AUTO_TIMER(atimer)
- command-line tools: timem (UNIX-only) and pytimem
- Environment control
- TIMEMORY_VERBOSE
- TIMEMORY_DISABLE_TIMER_MEMORY
- TIMEMORY_NUM_THREADS_ENV
- TIMEMORY_NUM_THREADS
- TIMEMORY_ENABLE
- TIMEMORY_TIMING_FORMAT
- TIMEMORY_TIMING_PRECISION
- TIMEMORY_TIMING_WIDTH
- TIMEMORY_TIMING_UNITS
- TIMEMORY_TIMING_SCIENTIFIC
- TIMEMORY_MEMORY_FORMAT
- TIMEMORY_MEMORY_PRECISION
- TIMEMORY_MEMORY_WIDTH
- TIMEMORY_MEMORY_UNITS
- TIMEMORY_MEMORY_SCIENTIFIC
- TIMEMORY_TIMING_MEMORY_FORMAT
- TIMEMORY_TIMING_MEMORY_PRECISION
- TIMEMORY_TIMING_MEMORY_WIDTH
- TIMEMORY_TIMING_MEMORY_UNITS
- TIMEMORY_TIMING_MEMORY_SCIENTIFIC
- Ability to push/pop default formatting
- improved thread-local singleton using C++ shared_ptrs
- automatic merge and deletion of manager instances at sub-thread exit
- Hard-code python exe into timemory python scripts
- Various fixes (plotting, argparse, etc.)
- Minor fix to avoid very rare FPE when serializing
- fix to TiMemoryConfig.cmake when installed via sudo
- self-cost available in manager + plotting safeguards
- Improved singleton deletion
- alternative colors for when len(_types) == 1 in plotting
- plotting label fix
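The environment variables listed above control output formatting. The sketch below shows one way such variables could drive the formatting of a timing value. The variable names come from the list, but their accepted values, the defaults used here (precision 3, width 8), and the helpers `env_flag` and `format_timing` are assumptions for illustration, not TiMemory's documented behavior.

```python
import os


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a boolean switch."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "on", "true", "yes")


def format_timing(seconds, environ=None):
    """Format a timing value using TIMEMORY_TIMING_* style settings."""
    environ = os.environ if environ is None else environ
    # Assumed defaults; the real library defaults may differ.
    precision = int(environ.get("TIMEMORY_TIMING_PRECISION", "3"))
    width = int(environ.get("TIMEMORY_TIMING_WIDTH", "8"))
    scientific = env_flag("TIMEMORY_TIMING_SCIENTIFIC", environ=environ)
    kind = "e" if scientific else "f"
    return f"{seconds:{width}.{precision}{kind}}"


fixed = format_timing(
    1.5, {"TIMEMORY_TIMING_PRECISION": "2", "TIMEMORY_TIMING_WIDTH": "6"}
)
sci = format_timing(
    1.5,
    {
        "TIMEMORY_TIMING_SCIENTIFIC": "ON",
        "TIMEMORY_TIMING_PRECISION": "1",
        "TIMEMORY_TIMING_WIDTH": "10",
    },
)
```

Passing the mapping explicitly, rather than always reading `os.environ`, keeps the helpers easy to test without modifying the real environment.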
TiMemoryConfig.cmake fixes
- Patches for TiMemoryConfig.cmake
- no longer using add_library alias
- fix for when TIMEMORY_USE_PYTHON_BINDINGS=OFF