MDC Node Tools and Diagnostics

SNFS Logs

SNFS logging can be found in three different areas: the cvlogs, the nssdbg.out file, and the system logs.

The cvlogs

The cvlogs contain messages from the fsm processes. They can be found at the following locations.

 

Linux/Unix: /usr/cvfs/data/<fsname>/log/cvlog*

Windows: C:\Program Files\StorNext File System\data\<fsname>\log\cvlog*

 

Look in the cvlogs for error messages or disk latency messages. High disk latency can affect the overall performance of SNFS.
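As a quick triage step, the cvlogs can be scanned for error and latency messages with grep. The sketch below builds a throwaway sample log (the message text is made up; actual cvlog wording varies by SNFS release) and scans it the same way you would scan /usr/cvfs/data/&lt;fsname&gt;/log/cvlog* on a live MDC.

```shell
# Build a disposable sample cvlog; on a real MDC, point grep at
# /usr/cvfs/data/<fsname>/log/cvlog* instead (message text here is illustrative).
tmp=$(mktemp -d)
cat > "$tmp/cvlog" <<'EOF'
[0422 10:01:02] INFO  fsm started
[0422 10:05:33] WARN  disk latency 850ms on stripe group sg1
[0422 10:06:10] ERROR I/O error on LUN 3
EOF

# Pull out anything that looks like an error or a latency complaint.
grep -iE 'error|latency' "$tmp"/cvlog*

rm -rf "$tmp"
```

On a live system, piping the result through `tail` keeps the output focused on the most recent events.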

The nssdbg.log

The nssdbg.log file contains messages from the fsmpm process. This file can be found in the following locations.

Linux/Unix: /usr/cvfs/debug/nssdbg.out

Windows: C:\Program Files\StorNext File System\debug\nssdbg.out

 

 

Key entries to look for in nssdbg.out include:

  1. Disk discovery lists - when the fsmpm scans disks, it prints a listing of the disks it finds. If fsm processes are not starting or clients are not mounting, verify that all disks are seen by the fsmpm and that all labels are as expected.
  2. fsm start and registration messages
  3. Activation voting activity
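A filtered grep can pull those three categories out of nssdbg.out quickly. The patterns below are illustrative guesses at the message wording, which differs between SNFS releases; the temporary sample file stands in for /usr/cvfs/debug/nssdbg.out.

```shell
# Stand-in for /usr/cvfs/debug/nssdbg.out (message wording is illustrative).
tmp=$(mktemp -d)
cat > "$tmp/nssdbg.out" <<'EOF'
[0422 09:58:01] resolved scsi device /dev/sdb label snfs_meta1
[0422 09:58:02] resolved scsi device /dev/sdc label snfs_data1
[0422 09:58:05] FSM snfs1 registered at port 5429
[0422 09:59:12] vote for activation of snfs1
EOF

# Disk discovery: confirm every expected disk and label is listed.
grep -i 'label' "$tmp/nssdbg.out"
# FSM start/registration messages and activation voting activity.
grep -iE 'registered|activation|vote' "$tmp/nssdbg.out"

rm -rf "$tmp"
```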

 

QuStats

 

The qustats measure overall metadata statistics, physical I/O, VOP statistics, and client-specific VOP statistics.

The overall metadata statistics include journal, thread (not currently available), and cache information. All of these can be affected by changing a file system's configuration parameters, for example increasing the journal size, increasing the thread pool (not currently available), or increasing cache sizes.
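Before tuning any of these, it helps to see the current values. The sketch below greps a sample file-system configuration for journal and cache settings; the attribute names (`journalSize`, `bufferCacheSize`) and the /usr/cvfs/config path mentioned in the comment are assumptions based on typical StorNext .cfgx layouts, so verify them against the documentation for your release.

```shell
# Hypothetical .cfgx fragment; attribute names are assumptions. On a real
# MDC you would grep the file under /usr/cvfs/config/ for your file system.
tmp=$(mktemp)
printf '<config journalSize="67108864" bufferCacheSize="268435456">\n' > "$tmp"

# Show the journal and cache settings currently in effect.
grep -oE '(journalSize|bufferCacheSize)="[0-9]+"' "$tmp"

rm -f "$tmp"
```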

The physical I/O statistics show the number and speed of disk I/Os. Poor numbers can indicate hardware problems or insufficient capacity.

The VOP statistics show what clients are doing, which can show where workflow changes may improve performance.

The client-specific VOP statistics can show which clients are generating the VOP requests.
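A small awk pass can summarize this kind of data, for example totalling VOP counts per client to find the busiest one. The two-column "client count" layout below is a hypothetical stand-in for the real qustat table format, which is wider and release-dependent.

```shell
# Hypothetical qustat-style table: client name, VOP count per sample.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
client-a 120
client-b 45
client-a 80
EOF

# Sum VOP counts per client and list the busiest clients first.
awk '{ vops[$1] += $2 } END { for (c in vops) print c, vops[c] }' "$tmp" \
  | sort -k2 -rn

rm -f "$tmp"
```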

 

Key Stat Tables

There are tools that can be used to parse qustat data. Details can be found in the Qustat information wiki.

Storage Manager Logs

<please add details here>

System logs

 

Linux: /var/log/messages

Windows: Look at the Event Viewer

 

 

Watch for both system-related errors and StorNext-related errors. For example:

  1. Storage hardware errors such as I/O errors and SCSI sense messages indicate that the HBA, SAN, and storage should be inspected to ensure that everything is stable. Just because there are no obvious errors in the array logs or switch logs doesn't mean everything is OK.
  2. Networking errors such as network disconnects or timeouts may indicate connectivity problems. Network settings should be checked. If bonded NICs are used, verify that all the necessary connectivity is present and that the switch is capable of, and configured for, the bonding. NICs, switches, and cables should be verified.
  3. Watch for indications of low memory or system disk events.
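One grep per problem class keeps a first pass over /var/log/messages manageable. The log entries in the sketch below are made up for illustration, and the patterns are starting points rather than an exhaustive set.

```shell
# Stand-in for /var/log/messages (entries are made up for illustration).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Apr 22 10:01:01 mdc kernel: sd 0:0:0:1: [sdb] Sense Key : Medium Error
Apr 22 10:02:14 mdc kernel: bond0: link status down for interface eth1
Apr 22 10:03:40 mdc kernel: Out of memory: Kill process 4242
EOF

# One pattern per problem class: storage, network, memory.
grep -iE 'sense key|i/o error' "$tmp"
grep -iE 'link status|timeout|disconnect' "$tmp"
grep -iE 'out of memory|oom' "$tmp"

rm -f "$tmp"
```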

 

 

Notes

Do we have a tool to record the metadata requests and operations from StorNext clients on the MDC server side? We know this will affect performance a little, but it can help the end user audit their system operations. SNFS is a shared file system; some customers complained that they found files lost but did not know which SAN clients performed the delete operations.

Note by Harvey Zeng on 05/22/2013 05:50 PM

I wrote a simple tool that looks for journal waits; it does nothing more than tell you whether they happened. Here's an output sample. Maybe this is the amount of info desired? Hints on whether to go look in depth? It does assume the impact of journal waits is understood.

# journal_waits.pl --qdir=/usr/cvfs/qustats/FSM/lmccvfsck/lmc-vg.mdh.quantum.com/

 

Opened /usr/cvfs/qustats/FSM/lmccvfsck/lmc-vg.mdh.quantum.com/

 

 81 files inspected with 0 containing journal waits
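The core of such a checker can be re-sketched in a few lines of shell. This assumes a hypothetical dump layout in which a file recording a journal wait contains the phrase "jrnl wait"; the real journal_waits.pl and the actual qustat file format may differ.

```shell
# Count qustat dump files that mention journal waits (the 'jrnl wait'
# marker and the sample files are hypothetical).
qdir=$(mktemp -d)
printf 'stat: jrnl wait 3\n'  > "$qdir/sample1"
printf 'stat: cache hit 99\n' > "$qdir/sample2"

total=0 waits=0
for f in "$qdir"/*; do
  [ -f "$f" ] || continue
  total=$((total + 1))
  grep -qi 'jrnl wait' "$f" && waits=$((waits + 1))
done
echo "$total files inspected with $waits containing journal waits"

rm -rf "$qdir"
```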

 


 

 

Note by Laurie Costello on 05/22/2013 05:35 PM

Can we provide a tool to filter these logs to find performance issues or other problems? Such a tool would help analyze these logs.

Note by Harvey Zeng on 04/24/2013 10:00 AM


This page was generated by the BrainKeeper Enterprise Wiki, © 2018