restore code-centric view.
jmellorcrummey committed Sep 29, 2018
1 parent e192eca commit e33feb4
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions doc/manual/HPCToolkit-users-manual.tex
@@ -215,6 +215,11 @@ \chapter{Introduction}
binary analysis exclusively.
On x86\_64 processors, HPCToolkit employs both strategies in an integrated fashion.

+\begin{figure}[t]
+\centering{\includegraphics[width=.8\textwidth]{fig/hpctoolkit-code-centric}}
+\caption{A code-centric view of an execution of the University of Chicago's FLASH code executing on 8192 cores of a Blue Gene/P. This bottom-up view shows that 16\% of the execution time was spent in IBM's DCMF messaging layer. By tracking these costs up the call chain, we can see that most of this time was spent on behalf of calls to {\tt pmpi\_allreduce} on line 419 of {\tt amr\_comm\_setup}.}
+\label{fig:code-centric}
+\end{figure}

\begin{figure}[t]
\centering{\includegraphics[width=.8\textwidth]{fig/hpctoolkit-thread-centric}}
@@ -234,7 +239,8 @@ \chapter{Introduction}
\HPCToolkit{} assembles performance measurements into a call path profile that associates the costs of each function call with its full calling context.
In addition, \HPCToolkit{} uses binary analysis to attribute program performance metrics with uniquely detailed precision -- full dynamic calling contexts augmented with information about call sites, inlined functions and templates, loops, and source lines.
Measurements can be analyzed in a variety of ways: top-down in a calling context tree, which associates costs with the full calling context in which they are incurred; bottom-up in a view that apportions costs associated with a function to each of the contexts in which the function is called; and in a flat view that aggregates all costs associated with a function independent of calling context.
-This multiplicity of code-centric perspectives is essential to understanding a program's performance for tuning under various circumstances. \HPCToolkit{} also supports a thread-centric perspective, which enables one to see how a performance metric for a calling context differs across threads, and a time-centric perspective, which enables a user to see how an execution unfolds over time. Figures~\ref{fig:code-centric}--\ref{fig:time-centric} show samples of the code-centric, thread-centric, and time-centric views.
+This multiplicity of code-centric perspectives is essential to understanding a program's performance for tuning under various circumstances.
+\HPCToolkit{} also supports a thread-centric perspective, which enables one to see how a performance metric for a calling context differs across threads, and a time-centric perspective, which enables a user to see how an execution unfolds over time. Figures~\ref{fig:code-centric}--\ref{fig:time-centric} show samples of HPCToolkit's code-centric, thread-centric, and time-centric views.

By working at the machine-code level, \HPCToolkit{} accurately measures and attributes costs in executions of multilingual programs, even if they are linked with libraries available only in binary form.
\HPCToolkit{} supports performance analysis of fully optimized code -- the only form of a program worth measuring; it even measures and attributes performance metrics to shared libraries that are dynamically loaded at run time.
@@ -268,7 +274,8 @@ \chapter{\HPCToolkit{} Overview}

\HPCToolkit{}'s work flow is organized around four principal capabilities, as shown in Figure~\ref{fig:hpctoolkit-overview:a}:
\begin{enumerate}
-\item \emph{measurement} of context-sensitive performance metrics while an application executes;
+\item \emph{measurement} of context-sensitive performance metrics using call-stack unwinding
+while an application executes;
\item \emph{binary analysis} to recover program structure from application binaries;
\item \emph{attribution} of performance metrics by correlating dynamic performance metrics with static program structure; and
\item \emph{presentation} of performance metrics and associated source code.

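Note on the manual text in this diff: it describes three code-centric views of call path measurements (top-down, bottom-up, and flat). The relationship between them can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not HPCToolkit's implementation: the sample data, the function names `top_down`, `bottom_up`, and `flat`, and the choice to use exclusive (leaf) cost in the bottom-up and flat views are all hypothetical.

```python
from collections import defaultdict

# Hypothetical samples: (call path from root to leaf, cost charged to the leaf).
samples = [
    (("main", "solve", "allreduce"), 5),
    (("main", "setup", "allreduce"), 3),
    (("main", "solve"), 2),
]

def top_down(samples):
    """Inclusive cost of every calling context (each prefix of a call path)."""
    tree = defaultdict(int)
    for path, cost in samples:
        for depth in range(1, len(path) + 1):
            tree[path[:depth]] += cost
    return dict(tree)

def bottom_up(samples):
    """Apportion each leaf function's cost to the chains of callers above it."""
    callers = defaultdict(int)
    for path, cost in samples:
        reversed_path = path[::-1]  # leaf first, then its callers
        for depth in range(1, len(reversed_path) + 1):
            callers[reversed_path[:depth]] += cost
    return dict(callers)

def flat(samples):
    """Cost per leaf function, independent of calling context."""
    totals = defaultdict(int)
    for path, cost in samples:
        totals[path[-1]] += cost
    return dict(totals)
```

In this toy data, the flat view reports a single total for `allreduce`, the bottom-up view splits that total between its callers `solve` and `setup`, and the top-down view charges it along the full path starting at `main`.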