PTLsim maintains a huge number of statistical counters and data points during the simulation process, and can optionally save this data to a file by using the ``-stats filename'' configuration option. The data store is a binary file format used to efficiently capture large quantities of statistical information for later analysis. This file format supports storing multiple regular or triggered snapshots of all counters. Snapshots can be subtracted, averaged and extensively manipulated, as will be described later on.
PTLsim makes it trivial to add new performance counters to the statistics data tree. All counters are defined in stats.h as a tree of nested structures; the top-level PTLsimStats structure is mapped to the global variable stats, so counters can be directly updated from within the code by simple increments, e.g. stats.xxx.yyy.zzz.countername++. Every node in the tree can be either a struct, W64 (64-bit integer), double (floating point) or char (string) type; arrays of these types are also supported. In addition, various attributes, described below, can be attached to each node or counter to specify more complex semantics, including histograms, labeled arrays, summable nodes and so on.
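As a rough illustration of the nested-structure layout described above, the following sketch declares a small counter tree and updates it from model code. All names here (DemoStats, Frontend, Commit, demo_stats, record_fetch) are hypothetical and are not taken from the real stats.h; the sketch also uses its own global rather than PTLsim's stats variable.

```cpp
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit integer type

// Hypothetical counter tree: every struct is a node, every W64 or double
// member is a leaf counter or data point.
struct DemoStats {
  struct Frontend {
    W64 fetched;   // number of fetch events seen
    W64 stalls;    // number of those fetches that stalled
  } frontend;
  struct Commit {
    W64 uops;      // committed uops
    double ipc;    // floating-point data point
  } commit;
};

// Assumed global instance, zero-initialized (plays the role of "stats").
static DemoStats demo_stats = {};

// Model code updates counters by simple increments at the point of the event.
inline void record_fetch(bool stalled) {
  demo_stats.frontend.fetched++;
  if (stalled) demo_stats.frontend.stalls++;
}
```

Because each counter is an ordinary structure member, updating it costs no more than a memory increment, which is presumably why this layout suits a simulator's inner loops.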
PTLsim comes with a special script, dstbuild (``data store template builder''), that parses stats.h and constructs a binary representation (a ``template'') describing the structure; this template data is then compiled into PTLsim. Every time PTLsim creates a statistics file, it first writes this template, followed by the raw PTLsimStats records and an index of those records by name. In this way, the complete data store tree can be reconstructed at a later time even if the original stats.h or PTLsim version that created the file is unavailable. This scheme is analogous to the separation of XML schemas (the template) from the actual XML data (the stats records), but in our case the template and data are stored in binary format for efficient parsing.
We suggest using the data store mechanism to store all statistics generated by your additions to PTLsim, since this system has built-in support for snapshots, checkpointing and structured, easy-to-parse data (unlike simply writing values to a text file). It is further suggested that only raw values be saved, rather than doing computations in the simulator itself; leave the analysis to PTLstats after gathering the raw data. If some limited computations do need to be done before writing each statistics record, PTLsim will call the PTLsimMachine::update_stats() virtual method to give your model a chance to do so before the counters are written.
After each node or counter is declared, one of several special C++-style ``//'' comments can be used to specify attributes for that node:
The node is at the root of the statistics tree (typically this only applies to the PTLsimStats structure itself)
Summable node: all subnodes and counters under this node are assumed to total 100% of whatever quantity is being measured. This attribute tells PTLstats to print percentages next to the raw values in this subtree for easier viewing.
// histo: min, max, stride
Specifies that the array of counters forms a histogram, i.e. each slot in the array represents the number of occurrences of one event out of a mutually exclusive set of events. The min parameter specifies the meaning of the first slot (array element 0), while the max parameter specifies the meaning of the last slot (array element arraysize-1). The stride parameter specifies how many events are counted into every slot (typically this is 1).
For example, let's say you want to measure the frequency distribution of the number of consumers of each instruction's result, where the maximum number of possible consumers is 256. You could specify this as:
W64 consumers[64+1]; // histo: 0, 256, 4
This histogram has a logical range of 0 to 256, but is divided into 65 slots. Because the stride parameter is 4, any consumer counts from 0 to 3 increment slot 0, counts from 4 to 7 increment slot 1, and so on. When you update this counter array from inside the model, you should do so as follows:
stats.xxx.yyy.consumers[min(n / 4, 64)]++;
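The slot arithmetic for this example can be sketched as a small standalone helper. The constant and function names (HISTO_MAX, HISTO_STRIDE, HISTO_SLOTS, consumer_slot, count_consumers) are illustrative assumptions, not part of PTLsim; only the array declaration and the min(n / 4, 64) expression come from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit integer type

const int HISTO_MAX = 256;                             // meaning of the last slot
const int HISTO_STRIDE = 4;                            // events counted per slot
const int HISTO_SLOTS = HISTO_MAX / HISTO_STRIDE + 1;  // 64 + 1 = 65 slots

W64 consumers[HISTO_SLOTS];  // histo: 0, 256, 4

// Map a raw consumer count to its slot, clamping anything at or above
// the maximum into the final slot so the index never leaves the array.
inline int consumer_slot(int n) {
  assert(n >= 0);  // assumed precondition: counts are never negative
  return std::min(n / HISTO_STRIDE, HISTO_SLOTS - 1);
}

inline void count_consumers(int n) {
  consumers[consumer_slot(n)]++;
}
```

With these definitions, counts 0 through 3 map to slot 0, counts 4 through 7 map to slot 1, and a count of exactly 256 (or anything larger) lands in slot 64.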
// label: namearray
Specifies that the array of counters is a histogram of named, mutually exclusive events, rather than simply raw numbers (as with the histo attribute). The namearray must be the name of an array of arraysize strings, with one entry per event.
For example, let's say you want to measure the frequency distribution of the uop types PTLsim is executing. If there are OPCLASS_COUNT uop classes, you could declare the following:
W64 opclass[OPCLASS_COUNT]; // label: opclass_names
In some header file included by stats.h, you need to declare the actual array of slot labels:
static const char* opclass_names[OPCLASS_COUNT] = {"logic", "addsub", "addsubc", ...};
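A self-contained version of the labeled-array pattern might look like the sketch below. The enum and its three members are hypothetical placeholders (the real list of uop classes is longer and is elided in the text above), as are the helper functions count_uop and slot_label.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit integer type

// Hypothetical, shortened list of uop classes; OPCLASS_COUNT is the
// number of entries and therefore the size of both arrays below.
enum { OPCLASS_LOGIC, OPCLASS_ADDSUB, OPCLASS_ADDSUBC, OPCLASS_COUNT };

// One label per slot, in the same order as the enum.
static const char* opclass_names[OPCLASS_COUNT] = {"logic", "addsub", "addsubc"};

W64 opclass[OPCLASS_COUNT];  // label: opclass_names

// Count one executed uop of the given class.
inline void count_uop(int cls) {
  assert(cls >= 0 && cls < OPCLASS_COUNT);
  opclass[cls]++;
}

// Look up the label that would be shown next to a slot.
inline std::string slot_label(int cls) {
  assert(cls >= 0 && cls < OPCLASS_COUNT);
  return opclass_names[cls];
}
```

Sizing both arrays with the same OPCLASS_COUNT constant keeps the labels and counters in step when a class is added.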
PTLsim supports several options related to the statistics data store:
-stats filename: Specifies the filename to which statistics data is written. In reality, two files are created: filename contains the template and snapshot index, while filename.data contains the raw data.
Creates a snapshot every N simulation cycles, numbered consecutively starting from 0. Without this option, only one snapshot, named final, is created at the end of the simulation run.
Creates a snapshot named name at the current point in the simulation. This can be used to asynchronously take a look at a simulation in progress. This option is only available in PTLsim/X.
The PTLstats program is used to analyze the statistics data store files produced by PTLsim. PTLstats will first extract the template stored in all data store files, and will then parse the statistics records into a flexible tree format that can be manipulated by the user. The following is an example of one node in the statistics tree, as printed by PTLstats:
store {
issue (total 68161716) {
[ 29.7% ] replay (total 20218780) {
[ 0.0% ] sfr_addr_not_ready = 0;
[ 16.8% ] sfr_data_and_data_to_store_not_ready = 3405878;
[ 11.8% ] sfr_data_not_ready = 2379338;
[ 23.4% ] sfr_addr_and_data_to_store_not_ready = 4740838;
[ 24.5% ] sfr_addr_and_data_not_ready = 4951888;
[ 23.4% ] sfr_addr_and_data_and_data_to_store_not_ready = 4740838;
}
[ 0.0% ] exception = 30429;
[ 7.9% ] ordering = 5404592;
[ 62.4% ] complete = 42507854;
[ 0.0% ] unaligned = 61;
}
Here is an example of a labeled histogram, produced using the ``// label: xxx'' attribute described in Section 8.1.2:
ValRange: 3209623 90432573
Total: 107190122
Thresh: 10720
[ 6.2% ] 0 6686971 1 (byte)
[ 6.4% ] 1 6860955 2 (word)
[ 84.4% ] 2 90432573 4 (dword)
[ 3.0% ] 3 3209623 8 (qword)
};
The basic syntax of the PTLstats command is ``ptlstats -options filename''. If no options are specified, PTLstats prints out the entire statistics tree from its root, relative to the final snapshot.
To select a specific snapshot, use the following option:
It may be desirable to examine the difference in statistics between two snapshots, for instance to subtract out the counters at the starting point of a given run or after a warmup period. The -subtract option provides this facility, for example:
To select a specific subtree of interest, use the syntax of the following example:
Subtrees or individual statistics can also be summed and averaged across many files, using the -collectsum or -collectaverage commands in place of -collect.
The -maxdepth option is useful for limiting the depth (in nodes) PTLstats will descend into the specified subtree. This is appropriate when you want to summarize certain classes of statistics printed as percentages of the whole, yet don't want a breakdown of every sub-statistic.
The -percent-of-toplevel option changes the way percentages are displayed. By default, percentages are calculated by dividing the total value of each node by the total of its immediate parent node. When -percent-of-toplevel is enabled, the divisor becomes the total of the entire subtree, possibly going back several levels (i.e. back to the highest level node marked with the summable attribute), rather than each node's immediate parent.
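The difference between the two modes can be checked by hand with the store/issue example printed earlier. The sketch below is only an illustration of the arithmetic, not PTLstats code; percent_of is a hypothetical helper, and it assumes that issue is the highest-level summable node in that example.

```cpp
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit integer type

// Percentage of value relative to divisor; returns 0 for an empty divisor
// so that nodes with no events do not cause a division by zero.
inline double percent_of(W64 value, W64 divisor) {
  return divisor ? 100.0 * (double)value / (double)divisor : 0.0;
}

// Values copied from the example tree:
//   issue  (total 68161716)
//   replay (total 20218780)
//   sfr_data_not_ready = 2379338
//
// Default mode divides by the immediate parent (replay):
//   percent_of(2379338, 20218780)  is roughly 11.8, as printed in the example.
// With -percent-of-toplevel the divisor would instead be the issue total:
//   percent_of(2379338, 68161716)  is roughly 3.5.
```

In other words, the same counter looks large within its own subtree but small against the whole, which is the distinction the option is meant to expose.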
PTLstats provides a facility to easily generate R-row by C-column data tables from a set of R benchmarks run with C different sets of parameters. Tables can be output in a variety of formats, including plain text with tab or space delimiters (suitable for import into a spreadsheet), LaTeX (for direct insertion into research reports) or HTML. To generate a table, use the following syntax:
Notice that you must create your own scripts, or manually run each benchmark and trial with the desired PTLsim options, plus ``-stats ptlsim.stats.trialname''. PTLstats will only report these results in table form; it will not actually run any benchmarks.
The -tabletype option specifies the data format of the table: ``text'' (plain text with space delimiters, suitable for import into a spreadsheet), ``latex'' (LaTeX format, useful for directly inserting into research reports), or ``html'' (HTML format for web pages).
The ``-scale-relative-to-col N'' option forces PTLstats to display each cell as a percentage increase or decrease relative to the cell in the same row of a reference column N. This is useful when running a ``baseline'' case: the baseline is displayed as a raw value (usually the cycle count, /final/summary/cycles) in column 0, while all other experimental cases are displayed as percentages relative to this first column (N = 0). A positive percentage indicates fewer cycles than the baseline; a negative percentage indicates more.
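The exact formula PTLstats uses is not stated here, so the following is only a sketch of one formula consistent with the sign convention described above (positive when the experimental case takes fewer cycles than the baseline). The function name relative_percent and the cycle counts in the comments are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit integer type

// Assumed formula: percentage by which "value" improves on "baseline".
// Positive means fewer cycles than the baseline, negative means more.
inline double relative_percent(W64 baseline, W64 value) {
  assert(baseline != 0);  // a baseline of zero cycles is not meaningful
  return 100.0 * ((double)baseline - (double)value) / (double)baseline;
}

// Illustration with made-up cycle counts:
//   baseline column 0:  1000 cycles (shown as a raw value)
//   trial A:             900 cycles, relative_percent(1000, 900)  gives  10
//   trial B:            1100 cycles, relative_percent(1000, 1100) gives -10
```

If the real tool uses a different normalization, only the body of relative_percent would change; the row-by-row comparison against a single reference column stays the same.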
In addition to creating tables, PTLstats can directly create colorful graphs (in Scalable Vector Graphics (SVG) format) from a set of benchmarks (specified by the -rows option) and trials of each benchmark (specified by the -cols option). For instance, to plot the total number of cycles taken over a set of benchmarks, each run under different PTLsim configurations, use the following example:
The graph's layout can be extensively customized using the options -title, -width, -height.
Inkscape (http://www.inkscape.org) is an excellent vector graphics system for editing and formatting SVG files generated by PTLstats.
Certain array nodes in the statistics tree can be tagged as ``histogram'' nodes by using the histo: or label: attributes, as described in Section 8.1.2. For instance, the ooocore/frontend/consumer-count node in the out-of-order core is a histogram node. PTLstats can directly create graphs (in Scalable Vector Graphics (SVG) format) for these special nodes, using the -histogram option: