File System Types
HA is turned on by default for all StorNext distributions, but it has no effect unless FSMs request to be monitored. File system monitoring is controlled by a file-system configuration item named HaFsType. Each file system is one of three types: HaUnmanaged, HaManaged, or HaShared. The HaFsType value is read by FSMs to direct them to set up appropriate HAmon behaviors, and it is read by the FSMPM to control how it starts FSMs.
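The three types can be summarized as a small lookup table. The following Python sketch is illustrative only: the class names, field names, and the behavior_for helper are hypothetical and are not part of StorNext. The behavior values are taken from the type descriptions in this section.

```python
from dataclasses import dataclass
from enum import Enum


class HaFsType(Enum):
    """The three values the HaFsType configuration item can take."""
    HA_UNMANAGED = "HaUnmanaged"
    HA_MANAGED = "HaManaged"
    HA_SHARED = "HaShared"


@dataclass(frozen=True)
class HaBehavior:
    # Hypothetical field names summarizing the behaviors described in this section.
    starts_hamon_timer: bool  # FSM starts an HAmon timer when it brands the ARB
    primary_only: bool        # FSMPM starts the FSM only on the Primary server
    reset_on_close: bool      # closing the cvfsioctl device triggers an HA reset


BEHAVIOR = {
    HaFsType.HA_UNMANAGED: HaBehavior(True, False, False),
    HaFsType.HA_MANAGED: HaBehavior(False, True, False),
    HaFsType.HA_SHARED: HaBehavior(True, False, True),
}


def behavior_for(value: str) -> HaBehavior:
    """Map a configured HaFsType string to its behavior summary."""
    return BEHAVIOR[HaFsType(value)]
```

In this sketch, a string that is not one of the three type names raises ValueError, which mirrors the rule that every file system is exactly one of the three types.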
HaUnmanaged
Each unmanaged-file-system FSM starts an instance of the HAmon timer for itself when it first brands the ARB. Before it changes any metadata, an activating FSM waits for the timer interval plus a small amount of time to elapse. The interval for a usurping FSM begins with the last time the FSM reads new data in the ARB from a previously active FSM.
Unmanaged FSMs can be active on either server in the HA Cluster. They can be usurped and fail over without a system reset if they exit before the timer expires. The timer interval for an active FSM restarts with each update of the ARB.
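The timing rules above can be modeled in a few lines. This is a minimal sketch, not StorNext code: the HamonTimer class, the function names, and the parameters interval and margin are hypothetical, and this section does not state the actual timer interval or the size of the "small amount of time".

```python
class HamonTimer:
    """Toy model of the HAmon timer held by an active unmanaged FSM."""

    def __init__(self, interval: float, now: float) -> None:
        self.interval = interval          # assumed value; not given in the text
        self.deadline = now + interval

    def restart(self, now: float) -> None:
        """Each ARB update by the active FSM restarts the interval."""
        self.deadline = now + self.interval

    def expired(self, now: float) -> bool:
        return now >= self.deadline


def earliest_metadata_change(last_new_arb_read: float,
                             interval: float,
                             margin: float) -> float:
    """A usurping FSM waits the interval plus a small margin, measured from
    the last time it read new ARB data written by the previously active FSM."""
    return last_new_arb_read + interval + margin


def exit_avoids_reset(timer: HamonTimer, exit_time: float) -> bool:
    """An FSM that exits before its timer expires fails over without a reset."""
    return not timer.expired(exit_time)
```

For example, with an illustrative interval of 5 seconds, an ARB update at time 3 moves the deadline to 8, so an exit at time 7 avoids a reset while an exit at time 8 does not.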
HaManaged
Managed-file-system FSMs do not start HAmon timers, and they do not wait the HAmon interval when usurping. The FSMPMs only start Managed FSMs on the Primary server, so there is no risk of a split-brain scenario. If a Managed FSM exits without having been stopped by the FSMPM, it is automatically restarted after a ten-second delay and activated. The cvadmin tool's FSMlist command displays the blocked FSMs on non-Primary servers. There can be zero or more HaManaged file systems configured.
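The restart rule for Managed FSMs can be sketched as a single decision. The function name and return convention below are hypothetical. Only the ten-second delay and the "not stopped by the FSMPM" condition come from the text.

```python
from typing import Optional

RESTART_DELAY_SECONDS = 10  # the ten-second delay stated above


def managed_fsm_restart_delay(stopped_by_fsmpm: bool) -> Optional[int]:
    """Return the delay before a Managed FSM is restarted and activated,
    or None when the FSMPM stopped it deliberately and no restart applies."""
    if stopped_by_fsmpm:
        return None
    return RESTART_DELAY_SECONDS
```

This sketch covers only the exit case on the server where the FSM was running. It makes no claim about what happens if Primary status changes during the delay, which this section does not describe.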
HaShared
The shared file system is an unmanaged StorNext file system that plays a controlling role in protecting shared resources. It has the same HA behavior as other unmanaged FSMs, but it also sets a flag that triggers an HA reset when the cvfsioctl device is closed. The device is closed whenever the FSM process exits, for any reason. However, if the shared file system has been unmounted from the active server before the FSM exits, the reset-on-close flag is turned off. This allows an ordinary shutdown of CVFS and Linux without resetting the server.
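The reset-on-close behavior amounts to a flag that is set by the shared FSM and cleared by an unmount. The following toy model illustrates the two outcomes. The class and method names are hypothetical and do not correspond to any StorNext interface.

```python
class SharedFsmDevice:
    """Toy model of the reset-on-close flag for the HaShared FSM."""

    def __init__(self) -> None:
        self.reset_on_close = True  # set by the HaShared FSM
        self.mounted = True

    def unmount(self) -> None:
        """Unmounting the shared file system turns the flag off."""
        self.mounted = False
        self.reset_on_close = False

    def close(self) -> bool:
        """Model the FSM process exiting, which closes the device.
        Returns True if the close would trigger an HA reset."""
        return self.reset_on_close
```

An FSM that exits while the shared file system is still mounted yields True (reset), while an orderly shutdown that unmounts first yields False.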
When the HaShared FSM finishes activation, it sets the Primary status in its FSMPM process.
Protected shared data resides on the shared file system. Because only one FSM can be active for a file system at a time, Primary status limits the starting of management processes to a single server, which protects the data against a split-brain scenario.
The starting of HaManaged FSMs is also tied to Primary status, which guarantees collocation of the managed file-system FSMs and the management processes. The GUI's data is also shared, and the GUI must be able to manipulate configuration and operational data, which requires that it be collocated with the management processes.
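The role of Primary status as a gate can be sketched with two servers. Everything in this example is hypothetical (the Server class, its methods, and the component labels). It only illustrates that Primary status, set when the HaShared FSM finishes activation, decides where managed FSMs and management processes may start.

```python
class Server:
    """Toy model of one server in the HA cluster."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.primary = False
        self.running: list[str] = []

    def on_shared_fsm_activated(self) -> None:
        """The HaShared FSM sets Primary status once activation finishes."""
        self.primary = True

    def start_primary_only(self, component: str) -> bool:
        """Start a managed FSM or management process only on the Primary."""
        if not self.primary:
            return False  # blocked on a non-Primary server
        self.running.append(component)
        return True


node_a, node_b = Server("a"), Server("b")
node_a.on_shared_fsm_activated()
started_on_a = node_a.start_primary_only("managed FSM")
started_on_b = node_b.start_primary_only("managed FSM")
```

Because both kinds of component pass through the same gate, they end up on the same server, which is the collocation property described above.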