This page outlines ways to generate and view performance data for your program using tools available at OSC.
Intel Tools
This section describes how to use performance tools from Intel. Make sure that you have an Intel module loaded to use these tools.
Intel VTune
Intel VTune is a tool to generate profile data for your application. Generating profile data with Intel VTune typically involves three steps:
1. Prepare the executable for profiling.
You need executables with debugging information to view source code line detail: re-compile your code with the -g option added among the other appropriate compiler options. For example:
mpicc wave.c -o wave -g -O3
2. Run your code to produce the profile data.
Profiles are normally generated in a batch job (a sample batch script is sketched at the end of this step). To generate a VTune profile for an MPI program:
mpiexec <mpi args> amplxe-cl <vtune args> <program> <program args>
where <mpi args> represents arguments to be passed to mpiexec, <program> is the executable to be run, <vtune args> represents arguments to be passed to the VTune executable amplxe-cl, and <program args> represents arguments passed to your program.
For example, if you normally run your program with mpiexec -n 12 wave_c, you would use
mpiexec -n 12 amplxe-cl -collect hotspots -result-dir r001hs wave_c
To profile a non-MPI program:
amplxe-cl <vtune args> <program> <program args>
As a result of this step, a subdirectory that contains the profile data files is created in your current directory. The subdirectory name is based on the -result-dir argument and the node ID, for example, r001hs.o0674.ten.osc.edu.
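For reference, here is a minimal sketch of a Slurm batch script for the MPI profiling run above. The module name and resource requests are assumptions; adjust them for your project and cluster:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12
#SBATCH --time=1:00:00
# Load an Intel module that provides amplxe-cl (module name is an assumption)
module load intel
# Collect a hotspots profile; the results land in r001hs.<node id> as described above
mpiexec -n 12 amplxe-cl -collect hotspots -result-dir r001hs wave_c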
3. Analyze your profile data.
You can open the profile data using the VTune GUI in interactive mode. For example:
amplxe-gui r001hs.o0674.ten.osc.edu
To view the results, use an OnDemand VDI (Virtual Desktop Interface) or enable X11 forwarding (see Setting up X Windows). Note that X11 forwarding can be distractingly slow for interactive applications.
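If you prefer to stay on the command line, amplxe-cl can also generate text reports from an existing result directory (see amplxe-cl -help report for the report types available in your VTune version). A sketch using the summary report:
amplxe-cl -report summary -result-dir r001hs.o0674.ten.osc.edu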
Intel ITAC
Intel Trace Analyzer and Collector (ITAC) is a tool to generate trace data for your application. Generating trace data with Intel ITAC typically involves three steps:
1. Prepare the executable for tracing.
You need to compile your executable with the -tcollect option added among the other appropriate compiler options to insert instrumentation probes that call the ITAC API. For example:
mpicc wave.c -o wave -tcollect -O3
2. Run your code to produce the trace data.
mpiexec -trace <mpi args> <program> <program args>
For example, if you normally run your program with mpiexec -n 12 wave_c, you would use
mpiexec -trace -n 12 wave_c
As a result of this step, .anc, .f, .msg, .dcl, .stf, and .proc files will be generated in your current directory.
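Like the profiling runs above, tracing runs are normally done in a batch job. A minimal Slurm sketch for the run above; the module names and resource requests are assumptions:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12
#SBATCH --time=1:00:00
# Load modules providing the Intel compilers and ITAC (names are assumptions)
module load intel
module load itac
# -trace enables trace collection; output files appear in the working directory
mpiexec -trace -n 12 wave_c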
3. Analyze the trace data files using Trace Analyzer
You will need to use traceanalyzer to view the trace data. To open Trace Analyzer:
traceanalyzer /path/to/<stf file>
where the base name of the .stf file will be the name of your executable.
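For the wave_c example above, the trace file would therefore be wave_c.stf, so you would run:
traceanalyzer ./wave_c.stf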
To view the trace data, use an OnDemand VDI (Virtual Desktop Interface) or enable X11 forwarding (see Setting up X Windows). Note that X11 forwarding can be distractingly slow for interactive applications.
Intel APS
Intel's Application Performance Snapshot (APS) is a tool that provides a summary of your application's performance. Profiling HPC software with Intel APS typically involves four steps:
1. Prepare the executable for profiling.
Regular executables can be profiled with Intel APS, but source code line detail will not be available. You need executables with debugging information to view source code line detail: re-compile your code with the -g option added among the other appropriate compiler options. For example:
mpicc wave.c -o wave -g -O3
2. Run your code to produce the profile data directory.
Profiles are normally generated in a batch job. To generate profile data for an MPI program:
mpiexec <mpi args> aps <program> <program args>
where <mpi args> represents arguments to be passed to mpiexec, <program> is the executable to be run, and <program args> represents arguments passed to your program.
For example, if you normally run your program with mpiexec -n 12 wave_c, you would use
mpiexec -n 12 aps wave_c
To profile a non-MPI program:
aps <program> <program args>
The profile data is saved in a subdirectory in your current directory. The directory name is based on the date and time, for example, aps_result_YYYYMMDD/.
3. Generate the profile file from the directory.
To generate the HTML profile file from the result subdirectory:
aps --report=./aps_result_YYYYMMDD
This creates the file aps_report_YYYYMMDD_HHMMSS.html.
4. Analyze the profile data file.
You can open the profile data file using a web browser on your local desktop computer. This option typically offers the best performance.
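To copy the report to your local machine, you can use scp from your desktop; a sketch in which the username, host, and remote path are placeholders to adjust:
scp username@pitzer.osc.edu:/path/to/aps_report_YYYYMMDD_HHMMSS.html .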
ARM Tools
This section describes how to use performance tools from ARM.
ARM MAP
Instructions for how to use MAP are available here.
ARM DDT
Instructions for how to use DDT are available here.
ARM Performance Reports
Instructions for how to use Performance Reports are available here.
Other Tools
This section describes how to use other performance tools.
HPC Toolkit
Rice University's HPC Toolkit is a collection of performance tools. Instructions for how to use it at OSC are available here.
TAU Commander
TAU Commander is a user interface for the University of Oregon's TAU Performance System. Instructions for how to use it at OSC are available here.