SNFS with Solaris Cluster Clients
SNFS with Solaris Clusters
When SNFS is installed on an Oracle Solaris Cluster (also known as Sun Cluster) client, disable Global Fencing on the cluster. Otherwise, all SNFS clients and the MDCs may lose access to the SNFS shared disks when the Solaris Cluster shuts down, reboots, or partitions. Global Fencing is enabled by default in a typical cluster installation.
Global Fencing is a mechanism that protects data integrity on shared disks within the cluster. In some circumstances it causes cluster nodes to block access to disk LUNs using SCSI reservations. Unfortunately, those reservations also deny access to the SNFS disk LUNs for other clients and the MDCs, resulting in hangs and failovers.
To check the setting, run the “cluster show” command on the Solaris Cluster:
# cluster show
=== Cluster ===
Cluster Name: <name>
clusterid: <ID>
...
global_fencing: global
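The check above can be scripted. The following is a minimal sketch, not a supported tool: the function names `get_global_fencing` and `fencing_status` are hypothetical, and it assumes that a value beginning with `nofencing` means fencing is turned off, which should be confirmed against the Oracle documentation for your release.

```shell
#!/bin/sh
# Extract the global_fencing value from `cluster show` output on stdin.
get_global_fencing() {
    awk '$1 == "global_fencing:" { print $2; exit }'
}

# Classify the value read from stdin.
# Assumption: values starting with "nofencing" mean fencing is off;
# any other value is treated as fencing still being active.
fencing_status() {
    value=$(get_global_fencing)
    case "$value" in
        "")         echo "unknown" ;;
        nofencing*) echo "disabled ($value)" ;;
        *)          echo "enabled ($value)" ;;
    esac
}

# Usage on a cluster node:
#   cluster show | fencing_status
```

A result of "enabled" suggests the SNFS LUNs are still exposed to SCSI reservations from the cluster nodes.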
See the Oracle Solaris Cluster documentation for more information, specifically:
Global Fencing section of the Sun Cluster Software Installation Guide for Solaris OS: https://docs.oracle.com/cd/E19787-01/820-4677/ggxlu/index.html
Sun Cluster Concepts Guide: https://docs.oracle.com/cd/E19050-01/sun.cluster31/816-3383/caccajda/index.html
If the option is not disabled, the SNFS MDCs will typically log messages reporting I/O errors, reservation conflicts, and lost disks. For example:
fsmpm[14664]: compare_names: WARNING! Unable to find raw device /dev/sdr from current disk scan in newly created list.
kernel: end_request: I/O error, dev sdbs, sector 0
kernel: sd 5:0:0:13: reservation conflict
kernel: sd 5:0:0:13: SCSI error: return code = 0x00000018
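To see which LUNs are affected, the reservation conflicts can be tallied per SCSI address. This is a rough sketch that assumes kernel messages in the format shown above. The function name `count_reservation_conflicts` is hypothetical, and the log path in the usage comment is an assumption that varies by MDC platform.

```shell
#!/bin/sh
# Count "reservation conflict" messages per SCSI address.
# Reads syslog-style text on stdin; prints "<address> <count>" lines.
count_reservation_conflicts() {
    awk '/reservation conflict/ {
        # The SCSI address (for example 5:0:0:13:) follows the "sd" field.
        for (i = 1; i < NF; i++) {
            if ($i == "sd") {
                addr = $(i + 1)
                sub(/:$/, "", addr)   # drop the trailing colon
                n[addr]++
                break
            }
        }
    }
    END { for (a in n) print a, n[a] }' | sort
}

# Usage (log path is an assumption):
#   count_reservation_conflicts < /var/log/messages
```

Addresses with many conflicts are candidates to compare against the disks that the Solaris Cluster nodes report fencing.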
In the messages files on the Solaris Cluster nodes, you may also see the following entries coinciding with the problem:
fencing node <node name> from shared devices
reservation message(fence_node) - Fencing node <Node no.> from disk <disk device>