Reports > StorNext Metrics
The StorNext Metrics reports provide performance data logging, visual reporting, and graphing features for StorNext systems.
This topic contains the following sections:
Section | Description |
---|---|
Introduction to the StorNext Metrics Reports | Provides an overview of the features of StorNext Metrics reports. |
StorNext Metrics Navigation | Describes how to access and work with the Web-based user interface of StorNext Metrics reports. |
Reports and Graphs | Describes how to view and interpret the available performance reports. |
Work With Time Ranges | Describes how to move the time range backward and forward in time, and make the time range longer or shorter. |
Work With Graphs | Describes how to work with the various types of graphs. |
Interpret Performance Data | Describes how to interpret a particular type of performance data. |
The StorNext Metrics reports are a visual reporting tool. This tool combines comprehensive performance data logging with powerful visual reporting and analysis tools to help you identify potential problems and optimize system operation.
With the StorNext Metrics reports, you can view an array of performance statistics for a StorNext system and see how those statistics change over time. This lets you identify trends or determine when a problem began.
By showing you how various operations affect performance, the StorNext Metrics reports also help you optimize the network ecosystem and business procedures, such as data management policies and file system disk usage.
The StorNext Metrics reports continually work in the background to log performance data. To view logged data, use the StorNext Metrics graphical reports. Reports are available on demand through a Web-based interface. You can check up-to-the-minute system status or view data for any time period since data logging began.
The StorNext Metrics reports let you view and work with a wealth of performance and system statistics, such as Ethernet I/O, Fibre Channel I/O, CPU load, File System (FS), and Memory. Each report includes two or more graphs. Use the report tools to zoom in on a graph to see just the time period you want to see, or zoom out to see data for a longer time period.
No matter what time period you select, all of the graphs in the report stay in sync. In addition, the StorNext Metrics reports maintain the current time period when you select a new report. This lets you compare performance data between graphs in the same report or between different reports. For example, you can see how CPU load is affected during space reclamation activities.
StorNext Metrics reports maintain a maximum of 30 days of metrics data in the database. The longest span of metrics data you can view at once is the last 3 days (click the Last 3 Days preset time range radio button). To view metrics data from 4 to 30 days in the past, select a specific date using the Select Time Period calendar widget.
StorNext Metrics reports’ historical record lets you compare current performance to past performance. It also lets you see the effect of any recent changes to system and network configuration or business processes.
StorNext Metrics reports record performance data in the logging database. The database resides on the StorNext system where StorNext Metrics reports are running.
To access the StorNext Metrics reports, follow the procedure below:
- From the StorNext GUI, on the Reports menu, click StorNext Metrics. The StorNext Metrics reports page displays. The user interface is also referred to as the report window.
The report window displays the performance graphs for the currently selected report. When you first access the StorNext Metrics reports, the Ethernet I/O report displays.
StorNext Metrics reports maintain the currently selected time range when you select a new report. For example, if you are currently viewing the most recent day of memory usage for the Memory report, StorNext Metrics reports display data for that same time range when you select a new report. This makes it easy to compare different performance statistics for the same time range.
Note: For more information about time ranges, see Work With Time Ranges.
This section provides information to help you interpret the reports available in StorNext Metrics.
Each report available in StorNext Metrics is made up of two or more graphs. Some graphs appear in more than one report.
The table below lists the graphs included in each report. Graphs are designated as (L) for layered graphs or (S) for stacked graphs. This distinction does not apply to graphs that report only a single variable. For information about interpreting each report, see Report Descriptions.
Report | Graphs |
---|---|
Ethernet I/O Report | Ethernet Activity (S), ethn Activity |
Fibre Channel I/O Report | Fibre Channel Activity (S), FCn Activity |
CPU load Report | CPU Load Average, CPU stats in % (S) |
Memory Report | Memory usage (always base 1024) (S), Swap usage (always base 1024) (S) |
File System Reports | Space, Inodes, Connections, CPU Load, Memory Usage |
This section describes the graphs included in the following reports available in StorNext Metrics.
The Ethernet I/O report displays detailed information about the amount of data passing through the Ethernet ports in the system. The report contains the following graphs:
The Ethernet Activity (Aggregate Activity) graph displays the amount of data passing through all of the Ethernet ports in the system.
Use the Ethernet Activity graph to monitor writes to and reads from the system using the Ethernet ports. The graph displays each port in a different color.
Symmetry between the four ports indicates that the Ethernet ports are bonded (not segmented) and that traffic is balanced across the ports.
- Write activity (above the zero line) indicates web management activity, heartbeat traffic, or metadata traffic.
- Read activity (below the zero line) indicates web management activity, heartbeat traffic, or metadata traffic.
The ethn Activity (Comparative Activity) graph displays the amount of data passing through Ethernet port n. A graph appears for each Ethernet port in the system, for example, eth0, eth1, eth2, and eth3.
Use the ethn Activity graph to monitor writes to and reads from the system using Ethernet port n.
- Write activity (above the zero line) indicates web management activity, heartbeat traffic, or metadata traffic.
- Read activity (below the zero line) indicates web management activity, heartbeat traffic, or metadata traffic.
View the Ethernet I/O report when you need to monitor writes to and reads from the system using the Ethernet ports.
To access the Ethernet I/O report, select Reports > StorNext Metrics > Ethernet I/O.
The Fibre Channel I/O report displays detailed information about the amount of data passing through the Fibre Channel ports in the system. The report contains the following graphs:
The Fibre Channel Activity (Aggregate Activity) graph displays the amount of data passing through all of the Fibre Channel ports in the system.
Use the Fibre Channel Activity graph to monitor writes to and reads from the system using the Fibre Channel ports.
- The graph displays each port in a different color.
- Fibre Channel write activity (above the zero line) occurs during backups.
- A regular backup schedule results in repeating patterns.
- Symmetrical read and write activity (that is, mirrored patterns above and below the zero line) indicates Tertiary Storage Manager (TSM) tape reclamation.
The FCn Activity graph displays the amount of data passing through Fibre Channel port n. A graph appears for each Fibre Channel port in the system, for example, port 4, port 5, and so on.
Use the FCn Activity graph to monitor writes to and reads from the system using Fibre Channel port n.
- Fibre Channel write activity (above the zero line) occurs during backups.
- A regular backup schedule results in repeating patterns.
The CPU Load report displays information about the usage of CPU resources in the system. The report contains the following graphs:
To access the CPU load report, select Reports > StorNext Metrics > CPU load.
The CPU Load Average graph displays the one-minute load average for the system.
Use the CPU Load Average graph to determine if the system has adequate CPU resources.
- The load average represents the average number of processes, in a one-minute time period, that were running on a CPU or that were waiting to run on a CPU.
- A load average higher than the number of CPU cores in the system indicates that the system is CPU limited.
For example, StorNext Metadata Appliances have four CPUs. In this case, a load average of greater than four means that some processes had to wait for an available CPU before running. In contrast, a load average of less than four means no processes had to wait for a CPU.
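The comparison described above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not part of StorNext: the function name `is_cpu_limited` and the sample load values are hypothetical.

```python
def is_cpu_limited(load_average: float, cpu_cores: int) -> bool:
    """Return True when the load average exceeds the number of CPU cores."""
    return load_average > cpu_cores


# Hypothetical one-minute load averages for a system with four CPUs.
CPU_CORES = 4
for sample in (2.5, 5.2):
    if is_cpu_limited(sample, CPU_CORES):
        print(f"load average {sample}: some processes waited for a CPU")
    else:
        print(f"load average {sample}: no processes waited for a CPU")
```

With four CPUs, the sample value 2.5 falls below the core count and 5.2 falls above it, which matches the two cases described in the text.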
The CPU stats in % graph displays the relative CPU usage for seven categories of processes (see the table below).
Process Category | Description |
---|---|
iowait | The CPU is waiting for an I/O device to respond (for example, the system is waiting on a disk). |
irq | The CPU is handling an interrupt request related to I/O (for example, network, Fibre Channel, disk, keyboard, or serial port activity). |
softirq | The CPU is handling a high level I/O task (for example, timer interrupts or packets in the TCP/IP stack). |
system | The CPU is handling a kernel process (for example, file system operations related to the StorNext or blocklet file systems). |
nice | The CPU is handling processes that have lower priority (for example, background processes). |
user | The CPU is handling processes that are not owned by the kernel. |
idle | The CPU is not handling one of the other process categories. |
Use the CPU stats in % graph to see how CPU resources are allocated among different categories of processes. The amount of CPU activity consumed by each category of process is expressed as a percentage. The percentages (including the value for idle, which is not shown in the graph) total to 100%.
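Because the percentages total 100% and idle is not drawn, you can work out the idle share from the six categories that are shown. The following Python sketch illustrates the arithmetic; the function name `idle_percentage` and the sample values are hypothetical and are not measured data.

```python
from typing import Mapping


def idle_percentage(shown: Mapping[str, float]) -> float:
    """Derive the idle share as the remainder after the graphed categories."""
    busy = sum(shown.values())
    if not 0.0 <= busy <= 100.0:
        raise ValueError(f"category percentages must total 0-100, got {busy}")
    return 100.0 - busy


# Hypothetical readings for the six categories that appear in the graph.
sample = {
    "iowait": 12.0,
    "irq": 1.0,
    "softirq": 2.0,
    "system": 20.0,
    "nice": 5.0,
    "user": 30.0,
}
print(f"idle: {idle_percentage(sample):.1f}%")  # idle: 30.0%
```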
If a system has a high CPU load average (see CPU Load Average), then consider the following guidelines:
- A high percentage of system and user activity indicates the system is CPU limited. Add more CPUs to improve system performance.
- A high percentage of iowait activity indicates the system is I/O limited. Add more disks or arrays to improve system performance.
The Memory report displays information about memory usage. You can view this report to make sure the cache settings are configured to maximize system performance. The report contains the following graphs:
The Memory usage graph monitors the amount of physical memory (RAM) used during system operation.
Note: For newly configured systems, you will see mostly free memory (light green). As the system increases the number of file systems and clients, monitor this graph often to make sure that there is as little free memory available as possible. This means the cache settings are configured to maximize performance.
In the Memory usage graph:
- Values are in GB. The graph always displays values in base 1024.
- For standard memory (4 KB) pages, the graph displays the amount of memory that is free and used. The graph also displays the amount of memory used for caching. Memory used for caching can be easily freed up, so it can usually be treated as available even though it is not free.
- For huge memory (2 MB) pages, the graph displays the amount of memory that is free and used.
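To see what "base 1024" means in practice, the following Python sketch converts a byte count to GB in base 1024 and, for comparison, in base 1000. The function names and the 64 GB sample size are hypothetical and chosen only for illustration.

```python
BYTES_PER_GB_BASE_1024 = 1024 ** 3  # 1,073,741,824 bytes
BYTES_PER_GB_BASE_1000 = 1000 ** 3  # 1,000,000,000 bytes


def bytes_to_gb_base_1024(num_bytes: int) -> float:
    """Convert bytes to GB the way the Memory graphs do (base 1024)."""
    return num_bytes / BYTES_PER_GB_BASE_1024


def bytes_to_gb_base_1000(num_bytes: int) -> float:
    """Convert bytes to GB in base 1000, shown only for comparison."""
    return num_bytes / BYTES_PER_GB_BASE_1000


# Hypothetical system with 64 GB (base 1024) of physical memory.
ram_bytes = 64 * 1024 ** 3
print(bytes_to_gb_base_1024(ram_bytes))            # 64.0
print(round(bytes_to_gb_base_1000(ram_bytes), 2))  # 68.72
```

The same amount of memory reads about 7% higher in base 1000, so keep the base in mind when you compare graph values with figures from other tools.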
The Swap usage graph monitors the amount of virtual memory used during system operation.
In the Swap usage graph:
- Values are in GB. The graph always displays values in base 1024.
- The graph displays the amount of the disk swap file that is free and used.
Note: In the examples, some of the data in the Memory graphs may not be accurate.
To access the Memory report, select Reports > StorNext Metrics > Memory.
To access the File System reports, select Reports > StorNext Metrics > Filesystem. The File System reports provide file system statistics. Each File System report displays a single file system server process running on the StorNext MDC Node. The File System reports display detailed statistics about the configured file systems in the StorNext MDC. Each File System report contains the following graphs:
The File System Space graph displays the number of bytes of data space currently in use or available for use by file system clients. This graph shows you exactly when file system clients started to quickly fill space in the file system.
This graph is a layered graph, meaning the graph superimposes data for two or more variables on top of one another. A different color is assigned to each variable, so you can see how the values for each variable differ over time.
- Bytes Free: Displays the number of bytes of data space available for use by StorNext file system clients. As the level of free data space gets low, consider expanding the data space for the given file system.
- Bytes Used: Displays the number of bytes of data space currently in use by StorNext file system clients.
The File System inodes graph displays the number of inodes that are currently available or are in use by the selected file system.
A large number of inodes in use in the file system indicates that either the file system contains a lot of files or that it contains a lot of fragmented files.
- Inodes free: Displays the number of free inodes available. These inodes are available for use in allocating files in the selected StorNext file system. This value should not reach zero unless all of the StorNext metadata space has been consumed. Inodes are allocated dynamically.
- Inodes used: Displays the number of inodes allocated. These inodes are in use by the StorNext file system. This graph shows the growth in the number of files used in the StorNext file system.
The File System connections graph displays the number of StorNext DLC and SAN clients that are connected to the selected file system.
Use this graph to view the history of your client connections. The number of connections will remain steady if the client computers are left running all the time.
- Proxy Connections: Displays the number of StorNext DLC clients connected to this StorNext file system.
- Connections: Displays the number of StorNext SAN clients connected to this StorNext file system.
The File System CPU Load graph displays the percentage of CPU resources consumed by the File System process for the selected file system.
The File System Memory Usage graph displays the total amount of physical and virtual memory in use by the selected file system.
If the file system’s performance degrades, then check this graph for an increase of physical memory (size) in use. This could indicate that runaway processes are doing lots of busy work (thrashing) or that caching is set too high.
A rapid increase in physical memory usage (for example, a file system that was using around 200 MB of memory suddenly shoots up to 2 GB) may indicate a client or caching issue. If this occurs, check the cvlogs to see what happened at the time of the increase.
- Size: Indicates the total amount of physical memory in use by the file system. High levels of memory consumption by the StorNext FS process could lead to degraded performance through virtual memory swapping once contention for memory resources occurs.
- Vsize: Indicates the total amount of virtual memory in use by the file system. The virtual memory size displays the total process footprint for the file system.
A time range is like a window through which you view performance data. Each report displays performance data for the time range you choose. All graphs in a report display data for the same time range. By default, StorNext Metrics reports display data for the most recent hour of logging. You can move the time range backward and forward in time, and you can make the time range longer or shorter. When you change the time range, the report automatically adjusts the resolution of performance data. For example, the resolution is finer (more granular) for shorter time ranges and is coarser (less granular) for longer time ranges.
Note: No matter how long the time range is, the report scales all graphs in the report so that the time range uses the entire width of each graph.
To view performance data for a different time range, use one of the following methods:
To move the time range forward or backward in time, use the selection handles below a graph. The timeline displays the current time range used for the report.
- To move the time range backward or forward in time, drag the timeline to the left or right.
- To make the time range longer or shorter, drag the left and right selection handles.
To quickly display performance data for a different time range, use the time range presets on the timeline.
To view logged data using a preset time range, perform one of the following:
- Click Day, and then select the day from the calendar widget, or enter a date.
- Click Last 3 Days.
- Click Last 24 Hours.
- Click Last Hour.
When you apply a preset, StorNext Metrics re-sizes the time range while maintaining the center of the time range.
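The resizing behavior described above can be illustrated with a short Python sketch. It assumes the new range is simply centered on the midpoint of the old one; the function name `resize_around_center` and the sample dates are hypothetical, and any clamping the product may apply (for example, to avoid showing future times) is not modeled.

```python
from datetime import datetime, timedelta
from typing import Tuple


def resize_around_center(
    start: datetime, end: datetime, new_length: timedelta
) -> Tuple[datetime, datetime]:
    """Return a range of new_length that shares its center with (start, end)."""
    center = start + (end - start) / 2
    half = new_length / 2
    return center - half, center + half


# Hypothetical example: a one-hour range resized to 24 hours.
start = datetime(2024, 1, 10, 12, 0)
end = datetime(2024, 1, 10, 13, 0)
new_start, new_end = resize_around_center(start, end, timedelta(hours=24))
print(new_start, "to", new_end)
```

In this example the center stays at 12:30 on January 10, and the resized range runs from 00:30 on January 10 to 00:30 on January 11.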
Move the time range forward or backward in time using the navigation buttons on the left or right of the button bar. StorNext Metrics reports shift the time range while maintaining the length of the time range.
Button | Description |
---|---|
< or > | Moves the time range back or forward an amount equal to one quarter of the current time range. |
Note: The graph displays the starting date and time and the ending date and time of the current time range, as well as the total length of the time range.
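As an illustration of how the navigation buttons shift the time range, the following Python sketch moves a range by one quarter of its length while keeping the length unchanged. The function name `shift_time_range`, the `direction` argument, and the sample times are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def shift_time_range(
    start: datetime, end: datetime, direction: str
) -> Tuple[datetime, datetime]:
    """Shift (start, end) by one quarter of its length; the length is kept."""
    if direction not in ("back", "forward"):
        raise ValueError("direction must be 'back' or 'forward'")
    step = (end - start) / 4
    offset = step if direction == "forward" else -step
    return start + offset, end + offset


# Hypothetical four-hour range, so each click moves the range by one hour.
start = datetime(2024, 1, 10, 12, 0)
end = datetime(2024, 1, 10, 16, 0)
back_start, back_end = shift_time_range(start, end, "back")
print(back_start, "to", back_end)
```

Clicking < on this four-hour range gives 11:00 to 15:00; four clicks in the same direction move the range by its full length.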
In StorNext Metrics, each report is made up of two or more graphs. Each graph displays a particular type of performance data for the current time range. For example, the Memory report includes the following graphs: Memory usage and Swap usage. The horizontal axis of each graph represents time and displays the current time range. The vertical axis varies depending on the graph. It is often a capacity or data amount, but can also be a calculated value such as a ratio, average, or percentage. See the following sections for more information about graphs:
A white gap in a graph indicates an absence of logging data for a period of time. This can occur for the following reasons:
- A system reboot occurred.
- No StorNext Metrics report logging took place because the system was busy.
- StorNext Metrics reports logging was turned off.
StorNext Metrics reports often display data for multiple variables on the same graph. This lets you see the interaction between different variables.
StorNext Metrics reports use two different methods for placing multiple variables on the same graph:
StorNext Metrics reports use layered graphs to compare related variables. A layered graph superimposes data for two or more variables on top of one another. StorNext Metrics reports assign a different color to each variable, so you can see how the values for each variable differ over time.
For example, in the file system Space graph, StorNext Metrics reports display a separate value line for the variables Bytes used and Bytes free.
Note: StorNext Metrics reports always display the smaller variable in front of the larger variable. Because of this, shifts in the color pattern in a graph can occur if the variable that was smaller becomes larger at some point in time.
StorNext Metrics reports use stacked graphs to display aggregate performance. A stacked graph adds together values for two or more variables to arrive at a total value. StorNext Metrics reports assign a different color to each variable, so you can see the contribution that each variable makes to the total.
For example, in the Ethernet Activity graph, values for each Ethernet port are added together to reach a total value for each point in the time range.
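The difference between the two methods can be sketched in Python. This is a simplified illustration rather than the product's rendering logic: the function names and the sample values are hypothetical. A stacked graph sums the variables at each point in time, while a layered graph keeps each variable separate and draws the smaller one in front.

```python
from typing import Dict, List


def stack_series(series: Dict[str, List[float]]) -> List[float]:
    """Stacked graph: add the variables together at each point in time."""
    if len({len(values) for values in series.values()}) > 1:
        raise ValueError("all series must cover the same points in time")
    return [sum(point) for point in zip(*series.values())]


def layer_order_front_to_back(values: Dict[str, float]) -> List[str]:
    """Layered graph: the smaller variable is drawn in front of the larger."""
    return sorted(values, key=lambda name: values[name])


# Hypothetical per-port activity at three points in time.
ports = {"eth0": [10.0, 20.0, 5.0], "eth1": [12.0, 18.0, 6.0]}
print(stack_series(ports))  # [22.0, 38.0, 11.0]

# Hypothetical values for two layered variables at one point in time.
print(layer_order_front_to_back({"Bytes free": 7.0, "Bytes used": 3.0}))
```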
StorNext Metrics reports use graphs with a zero line to show when the StorNext MDC is being written to or being read from.
- Positive values (above the line) represent data being written to the StorNext MDC.
- Negative values (below the line) represent data being read from the StorNext system.
By using a zero line, StorNext Metrics reports show data reads and writes on the same graph, for example, on the Ethernet Activity graph.
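As a rough illustration of the zero-line convention, the following Python sketch turns separate write and read measurements into the signed values such a graph would plot. The function name `to_zero_line_values` and the sample numbers are hypothetical.

```python
from typing import List, Tuple


def to_zero_line_values(
    written: List[float], read: List[float]
) -> List[Tuple[float, float]]:
    """Pair each write amount (plotted above zero) with the read amount negated."""
    if len(written) != len(read):
        raise ValueError("written and read must cover the same points in time")
    return [(w, -r) for w, r in zip(written, read)]


# Hypothetical write and read amounts at three points in time.
points = to_zero_line_values([5.0, 0.0, 3.0], [1.0, 4.0, 2.0])
print(points)  # [(5.0, -1.0), (0.0, -4.0), (3.0, -2.0)]
```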
The power of StorNext Metrics reports is that they let you compare different types of performance data for the same time range. This lets you see patterns and trends and helps you identify relationships between events. Keep in mind the following general concepts as you work with graphs:
When you view a report, try to correlate information in one graph with information in the other graphs. Remember that all graphs in a report display the same time range and always remain in sync. That means an event that happens in the center of one graph can be correlated with an event that happens in the center of another graph in the same (or in a different) report. In other words, if you can draw a straight vertical line between events in two graphs, then the events happened at the same time.
As you work in StorNext Metrics reports, look for interactions between events in different graphs. While correlation is not the same as causation, if you consistently see that events in one graph happen at the same time as events in another graph, there is a strong possibility that the two types of events are related.
StorNext Metrics reports use aggregation to convert the resolution of the database to the resolution of the graph. This means that, in many cases, each pixel in the graph is an aggregate of multiple data points in the database. Depending on how many data points are aggregated to create each pixel in the graph, the resulting value, and therefore the amplitude shown in the graph, can change.
The underlying data does not change. The difference in amplitude is due to the different number of data points StorNext Metrics reports aggregate when calculating the value for each pixel in the graph. Be aware of this effect as you work with graphs and time ranges in StorNext Metrics reports.
Note: StorNext Metrics reports use the finest resolution of data available in the database. Finer-grained data is available for more recent time ranges as opposed to time ranges further in the past. This affects the number of data points StorNext Metrics reports aggregate when displaying a graph, and in turn can affect amplitude.
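The effect of aggregation on amplitude can be demonstrated with a small Python sketch. The aggregation function that StorNext Metrics uses is not stated here, so this example assumes a simple average per pixel; the function name `aggregate` and the sample data are hypothetical.

```python
from typing import List


def aggregate(points: List[float], points_per_pixel: int) -> List[float]:
    """Average consecutive groups of data points into one value per pixel."""
    if points_per_pixel < 1:
        raise ValueError("points_per_pixel must be at least 1")
    buckets = [
        points[i:i + points_per_pixel]
        for i in range(0, len(points), points_per_pixel)
    ]
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical logged data containing a single short spike.
raw = [0.0, 0.0, 0.0, 100.0, 0.0, 0.0, 0.0, 0.0]
for points_per_pixel in (1, 2, 4):
    peak = max(aggregate(raw, points_per_pixel))
    print(f"{points_per_pixel} point(s) per pixel: peak = {peak}")
```

Under this averaging assumption, the same spike appears with a peak of 100.0, 50.0, and 25.0 as more data points are combined into each pixel, even though the logged data is identical in all three cases.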