HA Resets
After a suspected HA Reset, the first place to look is the /usr/cvfs/debug/smithlog
file, which contains one-line time-stamped descriptions of probable causes for the reset.
There are three methods for producing an HA Reset:
- Expiration of an HA Monitor timer.
- Exit of the active HaShared FSM while the shared file system is mounted on the active MDC.
- Invocation of the command 'snhamgr force smith' by a script or manually by an administrator.
The smithlog file is written by the fsmpm process, so there would not be an entry in the file when an fsmpm exit results in an HA Reset.
Caution: Using the 'force smith' command to administratively fail over a system in a production environment is not recommended. The preferred method to gracefully fail over the primary system to its secondary node is to stop CVFS and restart it after the secondary node has become primary. For example, on the node that is primary, run:
# service cvfs stop
Wait for the secondary to become primary, then run:
# service cvfs start

The first method of an HA Reset is explained by the following description of the FSM monitoring algorithm (patent pending). The terms usurp and usurpation refer to the process of taking control of a file system, either with or without contention. It involves the branding of the arbitration block on the metadata disk to take control, and then the timed rebranding of the block to maintain control. The HA Monitor algorithm places an upper bound on the timing of the ARB branding protocol to prevent two FSMs from simultaneously attempting to control the metadata, even for an instant.
- When an activating HaUnmanaged or HaShared FSM usurps the ARB, create a five-second timer that resets the computer if it expires.
- Wait five seconds plus a small delta before completing usurpation.
- Immediately after every ARB brand update (0.5-second period), reset the timer.
- Delete the timer when the FSM exits.
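The timer behavior in the list above can be illustrated with a small simulation. This is a toy model under stated assumptions, not StorNext code: the function name `simulate`, the event labels, and the use of simulated timestamps are all hypothetical. It replays a list of ARB brand update times, re-arms the deadline on each update, and reports when a smithlog warning (less than one second before expiry) or a reset would occur. The extra wait before completing usurpation and the deletion of the timer on FSM exit are not modeled.

```python
def simulate(brand_times, interval=5.0, warn_margin=1.0):
    """Replay ARB brand update times (seconds) against a reset timer.

    brand_times[0] is the moment of usurpation, which creates the timer.
    Returns a list of (event, time) tuples, where event is "warn" or "reset".
    Hypothetical model for illustration only.
    """
    events = []
    deadline = brand_times[0] + interval  # timer created at usurpation
    for t in brand_times[1:]:
        if t >= deadline:
            # The update arrived too late: the warning threshold was
            # crossed first, then the timer expired and reset the computer.
            events.append(("warn", deadline - warn_margin))
            events.append(("reset", deadline))
            return events
        if t > deadline - warn_margin:
            # The timer came within warn_margin of expiring before this update.
            events.append(("warn", deadline - warn_margin))
        deadline = t + interval  # every brand update resets the timer
    return events


# Regular 0.5-second updates never approach the deadline.
healthy = simulate([0.0, 0.5, 1.0, 1.5])
# A 4.2-second stall triggers a warning; a 5.5-second stall triggers a reset.
stalled = simulate([0.0, 0.5, 1.0, 5.2, 5.7])
failed = simulate([0.0, 0.5, 6.0])
```

In this model, raising `interval` corresponds to setting a larger ha_smith_interval value: the same stall that causes a reset at five seconds passes without one at a longer interval.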
When there is a SAN, LUN, or FSM process failure that delays updates of the ARB, the HA Monitor timer can run out. When it is less than one second from expiring, a one-line message describing this is written to the /usr/cvfs/debug/smithlog file.
If SAN or LUN delays are suspected of occurring with regular frequency, the following test can be run. This will significantly impact performance.
- Increase the timer value (up to 999 seconds) by creating the /usr/cvfs/config/ha_smith_interval file on each MDC with only this line: 'ha_smith_interval=<integer>'. This will allow the delays to run their course without incurring a reset. The value must match on both MDCs.
- Turn on debugging traces with 'cvdbset :ha'
- Display debugging traces with 'cvdb -g -C -D 500'
- Look for lines like this example: 'HAmonCheck PID #### FS "testfs" status delay = 1'
- When the value grows to more than 1, abnormal delays are occurring. When a standby FSM is running and the LAN is working, the negotiated timer resets should limit the growth of this value to four. When the value reaches two times the ha_smith_interval (default of 5 x 2 = 10), an HA Reset occurs.
- Turn off tracing with 'cvdbset - all'
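Scanning the trace output for the delay values described above could be done with a short script such as the following. It is a minimal sketch that assumes the trace lines contain text matching the example shown in the list; real cvdb output may carry additional prefixes, which is why the pattern is searched for anywhere in the line. The function name `classify` and the level labels are hypothetical.

```python
import re

# Matches the example trace line; the PID field is assumed to be a
# non-whitespace token, since the source shows it only as ####.
PATTERN = re.compile(r'HAmonCheck PID (\S+) FS "([^"]+)" status delay = (\d+)')


def classify(lines, ha_smith_interval=5):
    """Return (file system, delay, level) for each HAmonCheck trace line.

    Levels follow the rules in the text: a delay above 1 is abnormal, and
    a delay of twice ha_smith_interval is the point where an HA Reset occurs.
    """
    results = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not an HAmonCheck line
        fs_name = match.group(2)
        delay = int(match.group(3))
        if delay >= 2 * ha_smith_interval:
            level = "reset-threshold"
        elif delay > 1:
            level = "abnormal"
        else:
            level = "normal"
        results.append((fs_name, delay, level))
    return results


sample = [
    'HAmonCheck PID 1234 FS "testfs" status delay = 1',
    'unrelated trace line',
    'HAmonCheck PID 1234 FS "testfs" status delay = 3',
]
report = classify(sample)
```

If the ha_smith_interval file has been created for the test, pass the same value as `ha_smith_interval` so the reset threshold matches the configured timer.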

The second method of HA Reset can occur on shutdown of CVFS if there is an unkillable process or delayed process exit under the HaShared file system mount point. This will keep the file system from being unmounted. The smithlog entry indicates when this has happened, but does not identify the process.

The third method of HA Reset is the most common. It occurs when the snactivated script for the HaShared FSM experiences an error during startup. The current implementation invokes the 'snhamgr force smith' command to allow the peer MDC an opportunity to start up StorNext if it can. A similar strategy was used in previous releases. In this release, the failure to start will cause the /usr/cvfs/install/.ha_idle_failed_startup touch file to be created, and this will prevent startup of CVFS on this MDC until the file is erased with the 'snhamgr clear' command.

The snhamgr rules for mode pairings are easier to understand by following a BAAB strategy for transitioning into and out of config or single mode. In this strategy, B stands for the redundant node, and A stands for the node to be placed into config or single mode. Enter the desired cluster state by transitioning B's mode first, then A's. Reverse this when exiting the cluster state by transitioning A's mode, then B's.
For the configuration-session example, place B in locked mode, then place A in config mode to start a configuration session. At the end of the session, place A in default mode, then place B in default mode.
For the single-server cluster example, shut down Linux and power off B, then designate it peerdown with the 'snhamgr peerdown' command on A, then place A in single mode. At the end of the session, place A in default mode, then designate B as up with the 'snhamgr peerup' command on A, then power on B.
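The two examples above can be written down as ordered step lists to make the BAAB pattern explicit. This is only an illustrative data sketch: the step descriptions paraphrase the text, no snhamgr commands are executed, and the names `CONFIG_SESSION`, `SINGLE_SERVER`, and `follows_baab` are hypothetical.

```python
# Each step is (node, action). B is the redundant node; A is the node
# being placed into config or single mode.
CONFIG_SESSION = [
    ("B", "place in locked mode"),
    ("A", "place in config mode"),
    ("A", "place in default mode"),
    ("B", "place in default mode"),
]

SINGLE_SERVER = [
    ("B", "shut down Linux and power off, then designate peerdown from A"),
    ("A", "place in single mode"),
    ("A", "place in default mode"),
    ("B", "designate peerup from A, then power on"),
]


def follows_baab(steps):
    """Check that the nodes are transitioned in B, A, A, B order."""
    return [node for node, _action in steps] == ["B", "A", "A", "B"]


for name, plan in (("config session", CONFIG_SESSION),
                   ("single server", SINGLE_SERVER)):
    print(name, "follows BAAB:", follows_baab(plan))
```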