Next: Benchmarking Techniques Up: PTLsim User's Guide Previous: PTLsim Support Subsystems Contents

Statistics Collection and Analysis

PTLsim Statistics Data Store

Introduction

PTLsim maintains a huge number of statistical counters and data points during the simulation process, and can optionally save this data to a file by using the ``-stats filename'' configuration option. The data store is a binary file format used to efficiently capture large quantities of statistical information for later analysis. This file format supports storing multiple regular or triggered snapshots of all counters. Snapshots can be subtracted, averaged and extensively manipulated, as will be described later on.

PTLsim makes it trivial to add new performance counters to the statistics data tree. All counters are defined in stats.h as a tree of nested structures; the top-level PTLsimStats structure is mapped to the global variable stats, so counters can be directly updated from within the code by simple increments, e.g. stats.xxx.yyy.zzz.countername++. Every node in the tree can be either a struct, W64 (64-bit integer), double (floating point) or char (string) type; arrays of these types are also supported. In addition, various attributes, described below, can be attached to each node or counter to specify more complex semantics, including histograms, labeled arrays, summable nodes and so on.
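As a minimal sketch of this pattern, the fragment below declares a small counter tree and updates it with plain increments. The names DemoStats, demo_stats, dcache, load, hit, miss and size, as well as the stand-in W64 typedef, are illustrative assumptions and are not taken from the real stats.h.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for PTLsim's 64-bit counter type (assumption for this sketch).
typedef uint64_t W64;

// Hypothetical counter tree written in the style described for stats.h.
struct DemoStats { // rootnode:
  struct DCache {
    struct Load { // node: summable
      W64 hit;
      W64 miss;
    } load;
    W64 size[4]; // arrays of counters are also allowed
  } dcache;
};

// Hypothetical global, playing the role of the global variable stats.
static DemoStats demo_stats = {};

// Record one load: counters are updated by simple increments.
inline void record_load(bool hit, unsigned log2_size) {
  assert(log2_size < 4);
  if (hit) demo_stats.dcache.load.hit++;
  else demo_stats.dcache.load.miss++;
  demo_stats.dcache.size[log2_size]++;
}
```

Calling record_load(true, 2) from the model would increment demo_stats.dcache.load.hit and the third slot of demo_stats.dcache.size.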

PTLsim comes with a special script, dstbuild (``data store template builder''), that parses stats.h and constructs a binary representation (a ``template'') describing the structure; this template data is then compiled into PTLsim. Every time PTLsim creates a statistics file, it first writes this template, followed by the raw PTLsimStats records and an index of those records by name. In this way, the complete data store tree can be reconstructed at a later time even if the original stats.h or the PTLsim version that created the file is unavailable. This scheme is analogous to the separation of XML schemas (the template) from the actual XML data (the stats records), but in our case the template and data are stored in binary format for efficient parsing.

We suggest using the data store mechanism to store all statistics generated by your additions to PTLsim, since this system has built-in support for snapshots, checkpointing and structured, easy-to-parse data (unlike simply writing values to a text file). We further suggest saving only raw values rather than doing computations in the simulator itself; leave the analysis to PTLstats after gathering the raw data. If some limited computations do need to be done before each statistics record is written, PTLsim will call the PTLsimMachine::update_stats() virtual method to give your model a chance to do so before the counters are written.
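The following sketch illustrates the recommended split between raw counters and later analysis. It does not use the real PTLsim classes: MachineSketch is a hypothetical stand-in for the machine base class, the parameter passed to update_stats() is an assumption of this sketch, and DemoStats, DemoCore, take_record and ipc_from_record are invented for illustration. Only the existence of a virtual update_stats() hook comes from the text above.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit counter type

// Hypothetical raw counters; no ratios such as IPC are stored here.
struct DemoStats {
  W64 cycles;
  W64 commits;
  W64 rob_entries_in_use;  // filled in just before each record is written
};

// Stand-in base class (NOT the real PTLsimMachine).
struct MachineSketch {
  virtual ~MachineSketch() {}
  virtual void update_stats(DemoStats& stats) { (void)stats; }
};

// Hypothetical model that copies internal state into the record on demand.
struct DemoCore : MachineSketch {
  W64 rob_entries_in_use = 0;  // internal model state, not a counter
  void update_stats(DemoStats& stats) override {
    stats.rob_entries_in_use = rob_entries_in_use;
  }
};

// Hypothetical writer: give the model its chance, then copy the record.
inline DemoStats take_record(MachineSketch& machine, DemoStats& live) {
  machine.update_stats(live);
  return live;
}

// Derived values belong in the analysis step, computed from raw counters.
inline double ipc_from_record(const DemoStats& record) {
  assert(record.cycles > 0);  // IPC is undefined for zero cycles
  return double(record.commits) / double(record.cycles);
}
```

Keeping only cycles and commits in the record means a quantity such as IPC can be recomputed for any snapshot, or for the difference between two snapshots, without rerunning the simulation.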

Node Attributes

After each node or counter is declared, one of several special C++-style ``//'' comments can be used to specify attributes for that node:

struct Name { // rootnode:

The node is at the root of the statistics tree (typically this only applies to the PTLsimStats structure itself).

struct Name { // node: summable

All subnodes and counters under this node are assumed to total 100% of whatever quantity is being measured. This attribute tells PTLstats to print percentages next to the raw values in this subtree for easier viewing.

W64 name[arraysize]; // histo: min, max, stride

Specifies that the array of counters forms a histogram, i.e. each slot in the array represents the number of occurrences of one event out of a mutually exclusive set of events. The min parameter specifies the meaning of the first slot (array element 0), while the max parameter specifies the meaning of the last slot (array element arraysize-1). The stride parameter specifies how many events are counted into every slot (typically this is 1).

For example, let's say you want to measure the frequency distribution of the number of consumers of each instruction's result, where the maximum number of possible consumers is 256. You could specify this as:

W64 consumers[64+1]; // histo: 0, 256, 4

This histogram has a logical range of 0 to 256, but is divided into 65 slots. Because the stride parameter is 4, any consumer counts from 0 to 3 increment slot 0, counts from 4 to 7 increment slot 1, and so on. When you update this counter array from inside the model, you should do so as follows:

stats.xxx.yyy.consumers[min(n / 4, 64)]++;

W64 name[arraysize]; // label: namearray

Specifies that the array of counters is a histogram of named, mutually exclusive events, rather than simply raw numbers (as with the histo attribute). The namearray must be the name of an array of arraysize strings, with one entry per event.

For example, let's say you want to measure the frequency distribution of the uop types PTLsim is executing. If there are OPCLASS_COUNT uop classes, you could declare the following:

W64 opclass[OPCLASS_COUNT]; // label: opclass_names

In some header file included by stats.h, you need to declare the actual array of slot labels:

static const char* opclass_names[OPCLASS_COUNT] = {"logic", "addsub", "addsubc", ...};
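The two array attributes can be sketched together as follows. The slot computation mirrors the consumers example (minimum 0, stride 4, 65 slots), with clamping added so that out-of-range values land in the first or last slot. The helper names histo_slot, record_consumers and record_opclass, the three-entry DEMO_ enumeration and the demo_opclass_names array are assumptions made for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit counter type

// Mirrors: W64 consumers[64+1]; // histo: 0, 256, 4
static const int HISTO_MIN = 0;
static const int HISTO_STRIDE = 4;
static const int HISTO_SLOTS = 64 + 1;
static W64 consumers[HISTO_SLOTS] = {};

// Map a raw value to its slot, clamping out-of-range values to the ends.
inline int histo_slot(int value) {
  int slot = (value - HISTO_MIN) / HISTO_STRIDE;
  return std::min(std::max(slot, 0), HISTO_SLOTS - 1);
}

inline void record_consumers(int n) { consumers[histo_slot(n)]++; }

// Hypothetical three-entry class list standing in for the real uop classes.
enum { DEMO_LOGIC, DEMO_ADDSUB, DEMO_ADDSUBC, DEMO_OPCLASS_COUNT };

// One label per slot, in slot order.
static const char* demo_opclass_names[DEMO_OPCLASS_COUNT] = {
  "logic", "addsub", "addsubc"
};

static W64 opclass[DEMO_OPCLASS_COUNT] = {}; // label: demo_opclass_names

inline void record_opclass(int c) {
  assert(c >= 0 && c < DEMO_OPCLASS_COUNT);
  opclass[c]++;
}
```

With these definitions, record_consumers(7) increments slot 1 (values 4 to 7), and any value of 256 or more falls into the last slot, 64.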

Configuration Options

PTLsim supports several options related to the statistics data store:

-stats filename

Specify the filename to which statistics data is written. In reality, two files are created: filename contains the template and snapshot index, while filename.data contains the raw data.

-snapshot-cycles N

Creates a snapshot every N simulation cycles, numbered consecutively starting from 0. Without this option, only one snapshot, named final, is created at the end of the simulation run.

-snapshot-now name

Creates a snapshot named name at the current point in the simulation. This can be used to asynchronously take a look at a simulation in progress. This option is only available in PTLsim/X.

PTLstats: Statistics Analysis and Graphing Tools

The PTLstats program is used to analyze the statistics data store files produced by PTLsim. PTLstats will first extract the template stored in all data store files, and will then parse the statistics records into a flexible tree format that can be manipulated by the user. The following is an example of one node in the statistics tree, as printed by PTLstats:

dcache {
  store {
    issue (total 68161716) {
      [ 29.7% ] replay (total 20218780) {
        [ 0.0% ] sfr_addr_not_ready = 0;
        [ 16.8% ] sfr_data_and_data_to_store_not_ready = 3405878;
        [ 11.8% ] sfr_data_not_ready = 2379338;
        [ 23.4% ] sfr_addr_and_data_to_store_not_ready = 4740838;
        [ 24.5% ] sfr_addr_and_data_not_ready = 4951888;
        [ 23.4% ] sfr_addr_and_data_and_data_to_store_not_ready = 4740838;
      }
      [ 0.0% ] exception = 30429;
      [ 7.9% ] ordering = 5404592;
      [ 62.4% ] complete = 42507854;
      [ 0.0% ] unaligned = 61;
    }

Notice how PTLstats will automatically sum up all entries in certain branches of the tree to provide the user with a breakdown by percentages of the total for that subtree in addition to the raw values. This is achieved using the ``// node: summable'' attribute as described in Section 8.1.2.
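The percentage computation for a summable node can be sketched as below. This is an illustration of the arithmetic only, not PTLstats source code; the function names are hypothetical, and rounding to one decimal place is an assumption chosen because it reproduces the figures in the listing above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit counter type

// Total of all children of a summable node.
inline W64 summable_total(const std::vector<W64>& children) {
  W64 total = 0;
  for (std::size_t i = 0; i < children.size(); i++) total += children[i];
  return total;
}

// Share of one child in percent, rounded to one decimal place.
inline double percent_of_total(W64 value, W64 total) {
  assert(value <= total);  // a child cannot exceed the sum of all children
  if (total == 0) return 0.0;
  return std::round(1000.0 * double(value) / double(total)) / 10.0;
}
```

For the replay node above, the six children sum to 20218780, and the child with value 3405878 accounts for about 16.8% of that total.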

Here is an example of a labeled histogram, produced using the ``// label: xxx'' attribute described in Section 8.1.2:

size[4] = {
  ValRange: 3209623 90432573
  Total: 107190122
  Thresh: 10720
  [ 6.2% ] 0 6686971 1 (byte)
  [ 6.4% ] 1 6860955 2 (word)
  [ 84.4% ] 2 90432573 4 (dword)
  [ 3.0% ] 3 3209623 8 (qword)
};

Snapshot Selection

The basic syntax of the PTLstats command is ``ptlstats -options filename''. If no options are specified, PTLstats prints out the entire statistics tree from its root, relative to the final snapshot.

To select a specific snapshot, use the following option:

: ptlstats -snapshot name-or-number ...

Snapshots may be specified by name or number.

It may be desirable to examine the difference in statistics between two snapshots, for instance to subtract out the counters at the starting point of a given run or after a warmup period. The -subtract option provides this facility, for example:

: ptlstats -snapshot final -subtract startpoint ...
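Conceptually, subtracting two snapshots amounts to a counter-by-counter difference. The sketch below models a snapshot as a flat vector of counters, which is a simplifying assumption (the real data store is a tree), and the names Snapshot and subtract_snapshots are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit counter type

// A snapshot modeled as a flat list of counters (simplifying assumption).
typedef std::vector<W64> Snapshot;

// Return later minus earlier for every counter. Counters only ever
// increase during a run, so each difference is non-negative.
inline Snapshot subtract_snapshots(const Snapshot& later,
                                   const Snapshot& earlier) {
  assert(later.size() == earlier.size());
  Snapshot delta(later.size());
  for (std::size_t i = 0; i < later.size(); i++) {
    assert(later[i] >= earlier[i]);
    delta[i] = later[i] - earlier[i];
  }
  return delta;
}
```

Subtracting a snapshot taken after warmup from the final snapshot in this way leaves only the events that occurred during the measured region.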

Working with Statistics Trees: Collection, Averaging and Summing

To select a specific subtree of interest, use the syntax of the following example:

: ptlstats -snapshot final -collect /ooocore/dcache/load example1.stats example2.stats ...

This will print out the subtree /ooocore/dcache/load in the snapshot named final (the default snapshot) for each of the named statistics files example1.stats, example2.stats and so on. Multiple files are generally used to examine a specific subnode across several benchmarks.

Subtrees or individual statistics can also be summed or averaged across many files, using the -collectsum or -collectaverage options in place of -collect.
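For a single counter, summing and averaging across files reduces to the arithmetic sketched below. The function names are hypothetical, each vector element stands for the value of the same counter read from one statistics file, and the use of an arithmetic mean is an assumption about what averaging means here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit counter type

// Sum of one counter over all files.
inline W64 collect_sum(const std::vector<W64>& per_file_values) {
  W64 sum = 0;
  for (std::size_t i = 0; i < per_file_values.size(); i++)
    sum += per_file_values[i];
  return sum;
}

// Arithmetic mean of one counter over all files (assumed definition).
inline double collect_average(const std::vector<W64>& per_file_values) {
  assert(!per_file_values.empty());
  return double(collect_sum(per_file_values)) /
         double(per_file_values.size());
}
```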

Traversal and Printing Options

The -maxdepth option is useful for limiting the depth (in nodes) PTLstats will descend into the specified subtree. This is appropriate when you want to summarize certain classes of statistics printed as percentages of the whole, yet don't want a breakdown of every sub-statistic.

The -percent-of-toplevel option changes the way percentages are displayed. By default, percentages are calculated by dividing the total value of each node by the total of its immediate parent node. When -percent-of-toplevel is enabled, the divisor becomes the total of the entire subtree, possibly going back several levels (i.e. back to the highest level node marked with the summable attribute), rather than each node's immediate parent.
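The difference between the two divisors can be sketched as follows, using figures from the dcache store listing shown earlier. The NodeTotals structure and the node_percent function are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t W64;  // stand-in for PTLsim's 64-bit counter type

// Totals needed to express one node as a percentage (hypothetical layout).
struct NodeTotals {
  W64 value;           // total of the node itself
  W64 parent_total;    // total of its immediate parent
  W64 toplevel_total;  // total of the highest enclosing summable node
};

// Default mode divides by the parent; the alternative mode divides by
// the top-level summable node instead.
inline double node_percent(const NodeTotals& n, bool percent_of_toplevel) {
  W64 divisor = percent_of_toplevel ? n.toplevel_total : n.parent_total;
  assert(n.value <= divisor);
  if (divisor == 0) return 0.0;
  return 100.0 * double(n.value) / double(divisor);
}
```

For sfr_data_not_ready (2379338), the default divisor is the replay total (20218780), giving roughly 11.8%, while dividing by the issue total (68161716) gives roughly 3.5%.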

Table Generation

PTLstats provides a facility to easily generate R-row by C-column data tables from a set of R benchmarks run with C different sets of parameters. Tables can be output in a variety of formats, including plain text with tab or space delimiters (suitable for import into a spreadsheet), LaTeX (for direct insertion into research reports) or HTML. To generate a table, use the following syntax:

: ptlstats -table /final/summary/cycles -rows gzip,gcc,perlbmk,mesa -cols small,large,huge -table-pattern "%row/ptlsim.stats.%col"

In this example, the benchmarks (``gzip'', ``gcc'', ``perlbmk'', ``mesa'') will form the rows of the table, while three trials done for each benchmark (``small'', ``large'', ``huge'') will be listed in the columns. The row and column names will be combined using the pattern ``%row/ptlsim.stats.%col'' to generate statistics data store filenames like ``gzip/ptlsim.stats.small''. PTLstats will then load the data store for each benchmark and trial combination to create the table.
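The filename expansion described above can be sketched as a simple substitution of the %row and %col placeholders. This is an illustration, not the PTLstats implementation; replace_all, expand_pattern and table_filenames are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Replace every occurrence of key in text with value.
inline std::string replace_all(std::string text, const std::string& key,
                               const std::string& value) {
  assert(!key.empty());
  std::size_t pos = 0;
  while ((pos = text.find(key, pos)) != std::string::npos) {
    text.replace(pos, key.size(), value);
    pos += value.size();  // continue after the inserted text
  }
  return text;
}

// Expand one table cell's filename from the pattern.
inline std::string expand_pattern(const std::string& pattern,
                                  const std::string& row,
                                  const std::string& col) {
  return replace_all(replace_all(pattern, "%row", row), "%col", col);
}

// Expand the pattern for every (row, col) pair, in row-major order.
inline std::vector<std::string> table_filenames(
    const std::string& pattern, const std::vector<std::string>& rows,
    const std::vector<std::string>& cols) {
  std::vector<std::string> names;
  for (std::size_t r = 0; r < rows.size(); r++)
    for (std::size_t c = 0; c < cols.size(); c++)
      names.push_back(expand_pattern(pattern, rows[r], cols[c]));
  return names;
}
```

With the four rows and three columns of the example, this yields twelve filenames, starting with gzip/ptlsim.stats.small.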

Notice that you must create your own scripts, or manually run each benchmark and trial with the desired PTLsim options, plus ``-stats ptlsim.stats.trialname''. PTLstats will only report these results in table form; it will not actually run any benchmarks.

The -tabletype option specifies the data format of the table: ``text'' (plain text with space delimiters, suitable for import into a spreadsheet), ``latex'' (LaTeX format, useful for directly inserting into research reports), or ``html'' (HTML format for web pages).

The ``-scale-relative-to-col N'' option makes PTLstats report each cell as a percentage increase or decrease relative to the cell in the same row of a reference column N. This is useful when a ``baseline'' case is displayed as a raw value (usually the cycle count, /final/summary/cycles) in column 0, while all other experimental cases are displayed relative to this first column (N = 0), either as a percentage increase (a positive percentage, meaning fewer cycles) or as a percentage decrease (a negative value).

Bargraph Generation

In addition to creating tables, PTLstats can directly create colorful graphs (in Scalable Vector Graphics (SVG) format) from a set of benchmarks (specified by the -rows option) and trials of each benchmark (specified by the -cols option). For instance, to plot the total number of cycles taken over a set of benchmarks, each run under different PTLsim configurations, use the following example:

: ptlstats -bargraph /final/summary/cycles -rows gzip,gcc,perlbmk,mesa -cols small,large,huge -table-pattern "%row/ptlsim.stats.%col"

In this case, groups of three bars (for the trials ``small'', ``large'', ``huge'') appear for each benchmark.

The graph's layout can be customized using the -title, -width and -height options.

Inkscape (http://www.inkscape.org) is an excellent vector graphics system for editing and formatting SVG files generated by PTLstats.

Histogram Generation

Certain array nodes in the statistics tree can be tagged as ``histogram'' nodes by using the histo: or label: attributes, as described in Section 8.1.2. For instance, the ooocore/frontend/consumer-count node in the out-of-order core is a histogram node. PTLstats can directly create graphs (in Scalable Vector Graphics (SVG) format) for these special nodes, using the -histogram option:

: ptlstats -histogram /ooocore/frontend/consumer-count > example.svg

The histogram's layout can be customized using the -title, -width and -height options. In addition, the -percentile option is useful for controlling the displayed data range by excluding data under the Nth percentile. The -logscale and -logk options can be used to apply a log scale (instead of a linear scale) to the histogram bars. The syntax of these options can be obtained by running ptlstats without arguments.

Matt T Yourst 2007-09-26