How to Gracefully Fail Over a StorNext HA Node
Take care when failing over from the current MDC in an HA pair, particularly on a StorNext appliance. Using SMITH to force the failover is discouraged, because it risks corrupting the filesystem or the SNSM database.
Note that failing individual parts of the configuration, for example by simply stopping the HA filesystem, will result in a SMITH and should be avoided.
Preparation
First, check that the alternate node is functioning and ready to take over by running snhamgr status on the node you intend to fail over from:
# snhamgr status
LocalMode=default
LocalStatus=primary
RemoteMode=default
RemoteStatus=running
Also ensure that any resources the alternate node requires are available and operational (for example, check disks with cvlabel, and tape libraries and drives with fs_scsi).
You may also wish to open shell windows on both nodes and tail the system logs to monitor the failover in real time.
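The snhamgr readiness check at the start of this section can also be scripted. The following is a minimal sketch, not a Quantum-supplied tool: the helper names ha_field and ha_ready_for_failover are hypothetical, and the only assumption about snhamgr is that its status output uses the KEY=value lines shown above.

```shell
#!/bin/sh
# Sketch only: ha_field and ha_ready_for_failover are hypothetical helpers,
# written against the KEY=value output format shown in this article.

# Print the value of one KEY=value field read from stdin.
ha_field() {
    sed -n "s/^$1=//p" | head -n 1
}

# Succeed only if the status text shows this node as primary
# and the peer node as running.
ha_ready_for_failover() {
    status_text=$1
    local_status=$(printf '%s\n' "$status_text" | ha_field LocalStatus)
    remote_status=$(printf '%s\n' "$status_text" | ha_field RemoteStatus)
    [ "$local_status" = "primary" ] && [ "$remote_status" = "running" ]
}

# Usage on the current primary (as root):
#   if ha_ready_for_failover "$(snhamgr status)"; then
#       echo "alternate node is ready"
#   else
#       echo "do not fail over yet" >&2
#   fi
```

The functions take the status text as input rather than calling snhamgr themselves, so the check can be tried against saved output before it is used on a live node.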
Failover
To initiate the failover, simply stop the StorNext filesystem service:
# service cvfs stop
Once the stop completes, snhamgr should show the alternate node as the primary and the current node as stopped:
# snhamgr status
LocalMode=default
LocalStatus=stopped
RemoteMode=default
RemoteStatus=primary
Be sure to restart the service once the failover has completed, so that the local node is again available to take over if the new primary fails.
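The post-failover check and restart can be sketched in the same way. This is an illustration under stated assumptions: failover_complete is a hypothetical helper name, and it relies only on the LocalStatus and RemoteStatus lines shown in the snhamgr output above.

```shell
#!/bin/sh
# Sketch only: failover_complete is a hypothetical helper.

# Succeed only if the status text shows this node as stopped
# and the peer node as the new primary.
failover_complete() {
    printf '%s\n' "$1" | grep -qx 'LocalStatus=stopped' &&
    printf '%s\n' "$1" | grep -qx 'RemoteStatus=primary'
}

# Usage on the node that was just stopped (as root):
#   if failover_complete "$(snhamgr status)"; then
#       service cvfs start
#   else
#       echo "failover not confirmed; check snhamgr status and the logs" >&2
#   fi
```

Restarting only after the peer reports primary keeps the restart from racing with a failover that is still in progress.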
Example of Filesystem Service Successfully Stopping and Starting
# service cvfs stop
Initiating stop of StorNext SNAPI component
SNAPI software stopped.
Initiating stop of StorNext TSM component
FS0285 Tertiary Manager terminate requested.
FS0279 Tertiary Manager software successfully terminated.
FS0000 01 0001348744 /usr/adic/TSM/exec/fsconfig completed: Command Successful.
Initiating stop of StorNext MSM component
Media Manager Version 5.0.1 for Linux (Kernel:2618 OS:RHEL5) -- Copyright (C) 1992-2014 Quantum Corp.
Initiating the Media Manager shutdown
Setup environment variables ok
Shutting down the Media Manager system processes ... Done
System processes shut down ok
Shutting down the Media Manager servers ... Done
Servers shut down ok
Shutting down the Media Manager process server ... Done
Process server shut down ok
The Media Manager shutdown completed
Initiating stop of StorNext PSE component
Initiating stop of StorNext SRVCLOG component
Stopping...
Stopping sla with pid: 2211
Stopping ala with pid: 2221
Initiating stop of StorNext mysql component
Stopping mysqld
mysqld stopped
Initiating stop of StorNext DSM component
Stopping blockpool succeeded [ OK ]
Terminating snpolicyd, this may take up to 300 seconds
Stopping snpolicyd succeeded [ OK ]
Unmounting SNFS filesystems [ OK ]
Stopping SNFS Daemons
Disabling vips
Running '/sbin/ifconfig bond0:ha down'
Stopping SNFS PortMapper
Waiting for FSMs to finish..
SNFS Stop [ OK ]
#
# service cvfs start
Initiating start of StorNext DSM component
Checking maintenance license...
- The maintenance license status is: Good [ OK ]
Initializing StorNext Filesystem (SNFS)
Loading SNFS modules
net.core.rmem_max = 1048576
Multipath enabled, waiting up to 500 seconds for multipath device creation
. [ OK ]
Starting /usr/cvfs/bin/fsmpm .........
net.core.rmem_max = 131071
Starting /usr/cvfs/bin/cvfsd ... [ OK ]
Mounting the shared file system: HA_shared
Waiting for primary
Waiting for CVFS mounts to complete [ OK ]
SNFS Initialized [ OK ]
#