Currently, APEX tracks the size of each cudaMalloc call, but doesn't track the total amount allocated. Instead, it relies on periodic sampling of NVML counters, which can create blind spots. To avoid these blind spots, we should optionally track the actual cudaMalloc and cudaFree locations and amounts. Each cudaMalloc call will increment an atomic counter of allocated bytes and insert an entry into a map, with the address as the key and the size as the value. Each cudaFree call will then use the address to look up the allocated size and decrement the atomic counter. This will be an optional feature, to avoid perturbation from contention for the map and the counter. Each malloc and free will also result in an event in the OTF2 trace.
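A minimal sketch of the bookkeeping described above, to make the proposal concrete. This is not APEX code: the class name `AllocationTracker` and its methods `on_malloc`, `on_free`, and `total_bytes` are hypothetical, and the sketch assumes the profiler already intercepts cudaMalloc and cudaFree and can pass the address and size to these hooks. Emitting the OTF2 events and the runtime switch that makes the feature optional are left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical helper (not part of APEX): tracks live device allocations.
class AllocationTracker {
public:
    // Call after a successful cudaMalloc(&ptr, size).
    void on_malloc(const void* ptr, std::size_t size) {
        if (ptr == nullptr) return;
        std::lock_guard<std::mutex> lock(mutex_);
        auto result = sizes_.emplace(ptr, size);
        if (!result.second) {
            // Address seen again without a matching free: replace the old entry.
            total_bytes_.fetch_sub(result.first->second, std::memory_order_relaxed);
            result.first->second = size;
        }
        total_bytes_.fetch_add(size, std::memory_order_relaxed);
    }

    // Call when cudaFree(ptr) is observed.
    // Returns the number of bytes released, or 0 if the address is unknown.
    std::size_t on_free(const void* ptr) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = sizes_.find(ptr);
        if (it == sizes_.end()) return 0;
        const std::size_t size = it->second;
        sizes_.erase(it);
        total_bytes_.fetch_sub(size, std::memory_order_relaxed);
        return size;
    }

    // Current total of tracked bytes; readable without taking the map lock.
    std::size_t total_bytes() const {
        return total_bytes_.load(std::memory_order_relaxed);
    }

private:
    std::mutex mutex_;                                    // guards sizes_
    std::unordered_map<const void*, std::size_t> sizes_;  // address -> size
    std::atomic<std::size_t> total_bytes_{0};             // running total
};

// Example with stand-in host addresses instead of real device pointers.
inline void example_usage() {
    AllocationTracker tracker;
    char a = 0, b = 0;  // their addresses play the role of device pointers
    tracker.on_malloc(&a, 1024);
    tracker.on_malloc(&b, 512);
    assert(tracker.total_bytes() == 1536);
    tracker.on_free(&a);
    assert(tracker.total_bytes() == 512);
}
```

The mutex around the map is the contention point the issue mentions; keeping the running total in a separate atomic lets a sampler read it without taking that lock.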