Deleting an LSI/3ware "Ghost" RAID Unit After a Controller Malfunction
SR Information: SR1609118.
Product / Software Version: Issue found on DXi6500 with 2.2.1.2 software. However, it can affect other DXi platforms that use the LSI/3ware controllers, such as the 9690 and 9750.
Problem Description: Five 1-TB drives show degraded state across multiple controllers after monthly verification(s) were completed, and hwmond was having problems talking to the 3ware controllers.
Reference (possible): PTRs 30615, 32638, 32818, and 3480
Correct the RAID array “ghost/invalid” units and upgrade the FW version from 2.2.1.2 to 2.2.1.3, as detailed below.
Working with RAID sets can be a very sensitive task. If you need assistance for some reason, or you have any questions, please contact Service Engineering before running commands like the ones in this article. Take extra precautions when deleting units.
This article gives procedures for analyzing and solving a problem created on a DXi system when an LSI/3ware "ghost" RAID unit was created after a controller malfunction. The main sections are as follows:
1.1 Identifying the Problem
1.2 DXi Reboot and Sample Error Listings
1.3 Communication Errors in the Logs
1.4 3ware Controller Errors and Drive Errors
1.5 RAS Alerts Issued After Bootup
2.0 Identifying and Fixing the Problem
2.1 Identifying the Correct (Valid Units) RAIDs for Each Controller
2.2 Identifying the Correct Number of Volumes (Valid Units) per LSI/3ware Controller Card
2.3 Identifying and Fixing Incorrect or Foreign Volumes, Testing Drives, and Adding Back the Good Drives
2.4 Checking to Ensure That All Is OK
3.0 Requesting Additional Assistance
A “near” DCB problem was encountered because the 3ware controller was severely busy, and hwmond failed to communicate with the 3ware controllers.
Our priorities are the following:
First, make sure the DXi is not in a reboot loop:
The DXi will now reboot several times. The descriptive lines below explain the detailed log listings that follow them.
Before the DXi reboots, you will see several tw_cli page allocation errors:
Sep 10 14:40:24 si-bkupdedup05 kernel: tw_cli: page allocation failure. order:0, mode:0x10d0
Sep 10 14:40:24 si-bkupdedup05 kernel:
Sep 10 14:40:24 si-bkupdedup05 kernel: Call Trace:
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff8000f504>] __alloc_pages+0x2b5/0x2ce
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff800728fb>] dma_alloc_pages+0xa3/0x106
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff8002207f>] dma_alloc_coherent+0x79/0x1c3
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff880b7fa4>] :3w_9xxx:twa_chrdev_ioctl+0xc6/0x674
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff8015b635>] list_add+0xc/0xe
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff800496a1>] chrdev_open+0x0/0x183
Sep 10 14:40:24 si-bkupdedup05 kernel: [<ffffffff80042262>] do_ioctl+0x55/0x6b
Sep 10 14:40:25 si-bkupdedup05 kernel: [<ffffffff80030306>] vfs_ioctl+0x457/0x4b9
Sep 10 14:40:25 si-bkupdedup05 kernel: [<ffffffff800b85fd>] audit_syscall_entry+0x180/0x1b3
Sep 10 14:40:25 si-bkupdedup05 kernel: [<ffffffff8004c97d>] sys_ioctl+0x59/0x78
Sep 10 14:40:25 si-bkupdedup05 kernel: [<ffffffff8005e28d>] tracesys+0xd5/0xe0
Due to the communication errors, you will start seeing errors in the logs:
mountd[30827]: export request from 127.0.0.1 fails.
Sep 10 15:01:03 si-bkupdedup05 kernel: Kernel logging (proc) stopped.
Sep 10 15:01:03 si-bkupdedup05 kernel: Kernel log daemon terminating.
Sep 10 15:01:04 si-bkupdedup05 exiting on signal 15
The last 3ware verification is now complete:
Sep 10 19:01:03 si-bkupdedup05 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Sep 10 19:27:05 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x002B): Verify completed:unit=1.
Sep 10 19:30:05 si-bkupdedup05 kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x002B): Verify completed:unit=3.
Sep 10 19:40:31 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x002B): Verify completed:unit=1.
The DXi gets a request to shut down:
Sep 10 20:01:03 si-bkupdedup05 kernel: Kernel logging (proc) stopped.
Sep 10 20:01:03 si-bkupdedup05 kernel: Kernel log daemon terminating.
Sep 10 20:01:04 si-bkupdedup05 exiting on signal 15
When the DXi reboots, many errors are seen on several 3ware controllers and their drives (C1, C2, and C3):
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=0.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=1.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=2.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=3.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=4.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=5.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=6.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=7.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=8.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=9.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=10.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=11.
Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0062): Enclosure removed:encl=0.
Sep 11 11:33:50 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=2, vport=30.
Sep 11 11:33:50 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=2, vport=25.
Sep 11 11:33:50 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, vport=9.
Sep 11 11:33:55 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=0.
Sep 11 11:33:55 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=1.
Sep 11 11:33:55 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=2.
Sep 11 11:33:55 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=3.
Sep 11 11:33:55 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=4.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=5.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=6.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=7.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=8.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=9.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=10.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=11.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: WARNING (0x04:0x0062): Enclosure removed:encl=0.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: ERROR (0x04:0x0002): Degraded unit:unit=1, vport=19.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: ERROR (0x04:0x0002): Degraded unit:unit=1, vport=18.
Sep 11 11:33:56 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, vport=9.
Sep 11 11:34:02 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=0.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=1.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=2.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=3.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=4.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=5.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=6.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=7.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=8.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=9.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=10.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=11.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: WARNING (0x04:0x0062): Enclosure removed:encl=0.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: ERROR (0x04:0x0002): Degraded unit:unit=1, vport=19.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: ERROR (0x04:0x0002): Degraded unit:unit=1, vport=18.
Sep 11 11:34:03 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, vport=9.
Sep 11 11:34:10 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=0.
Sep 11 11:34:10 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=2.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=0.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=1.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=2.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=3.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=4.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=5.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=6.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=7.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=8.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=9.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=10.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=1, slot=11.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0062): Enclosure removed:encl=1.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=3, vport=31.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=3, vport=29.
Sep 11 11:34:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=1, vport=12.
Sep 11 11:34:16 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=0.
Sep 11 11:34:16 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=1.
Sep 11 11:34:23 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=0.
Sep 11 11:34:23 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=1.
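A listing of this size is easier to read once it is condensed per controller. The following is a minimal Python sketch, not part of the original SR work. It assumes the kernel messages keep the exact "3w-9xxx: scsiN: AEN: SEVERITY (code): text:arguments" layout shown above; the function name summarize_aen and the sample list are illustrative only.

```python
import re
from collections import Counter, defaultdict

# Assumed layout, copied from the listings in this article, for example:
# "... kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=2, vport=30."
AEN_RE = re.compile(
    r"3w-9xxx: (?P<host>scsi\d+): AEN: (?P<sev>[A-Z]+) "
    r"\(0x[0-9A-Fa-f]+:0x[0-9A-Fa-f]+\): (?P<text>[^:]+):(?P<args>.*)$"
)

def summarize_aen(lines):
    """Count AEN events per SCSI host, keyed by (severity, message text)."""
    summary = defaultdict(Counter)
    for line in lines:
        m = AEN_RE.search(line.strip())
        if m:
            summary[m["host"]][(m["sev"], m["text"])] += 1
    return summary

# A few lines taken from the listing above.
sample = [
    "Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=0.",
    "Sep 11 11:33:49 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0019): Drive removed:encl=0, slot=1.",
    "Sep 11 11:33:50 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=2, vport=30.",
    "Sep 11 11:34:10 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=2.",
    "Sep 11 11:34:16 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: ERROR (0x04:0x001E): Unit inoperable:unit=0.",
]

for host, events in sorted(summarize_aen(sample).items()):
    for (sev, text), count in sorted(events.items()):
        print(f"{host}: {sev:<7} {text} x{count}")
```

In practice the lines would come from the system log file rather than from a hard-coded list; any line that does not match the assumed layout is silently skipped.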
As the DXi continues to boot, additional drives are detected by the 3ware controller(s). This can be an indication that the drives were not ready when the controller scanned, a cable connectivity problem, a possible bad/slow drive, or a drive with errors that may need to be replaced. You can look at the 3ware Controller event logs to determine if a drive needs to be replaced.
Sep 11 11:37:17 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=11.
Sep 11 11:37:19 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001F): Unit operational:unit=3.
Then the DXi rescans and finds the original units, plus additional units. The additional units are made of drives that carry a signature but not enough information for the controller to determine whether they belong to an existing RAID/unit or to a foreign unit, so the controller identifies them as foreign and assigns them the next unit number (Ux). These extra units will not have enough drives to form a RAID6 or a 1x2 mirror, so they will be identified as “inoperable” because they are incomplete.
Sep 11 11:35:25 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x0063): Enclosure added:encl=0.
Sep 11 11:35:26 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=0.
Sep 11 11:35:27 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=1.
Sep 11 11:35:27 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001F): Unit operational:unit=0.
Sep 11 11:35:33 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=2.
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=3.
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=4.
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=5.
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=7.
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=8.
Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=9.
Sep 11 11:35:37 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=10.
Sep 11 11:35:37 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=11.
Sep 11 11:35:37 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001F): Unit operational:unit=2.
Sep 11 11:35:41 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=6.
Sep 11 11:35:43 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001F): Unit operational:unit=2.
Sep 11 11:35:43 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:35:53 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:36:00 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x0063): Enclosure added:encl=0.
Sep 11 11:36:02 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=0.
Sep 11 11:36:02 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=1.
Sep 11 11:36:02 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001F): Unit operational:unit=0.
Sep 11 11:36:03 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=2.
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=3.
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=5.
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=6.
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=7.
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=9.
Sep 11 11:36:10 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=10.
Sep 11 11:36:11 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=11.
Sep 11 11:36:11 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001F): Unit operational:unit=1.
Sep 11 11:36:13 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:36:16 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=8.
Sep 11 11:36:17 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001F): Unit operational:unit=1.
Sep 11 11:36:18 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=4.
Sep 11 11:36:20 si-bkupdedup05 kernel: 3w-9xxx: scsi3: AEN: INFO (0x04:0x001F): Unit operational:unit=1.
Sep 11 11:36:23 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:36:30 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x0063): Enclosure added:encl=0.
Sep 11 11:36:32 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=0.
Sep 11 11:36:32 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=1.
Sep 11 11:36:32 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001F): Unit operational:unit=0.
Sep 11 11:36:33 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=2.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=3.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=4.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=5.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=6.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=7.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=9.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=10.
Sep 11 11:36:40 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=11.
Sep 11 11:36:41 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001F): Unit operational:unit=1.
Sep 11 11:36:43 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:36:49 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001A): Drive inserted:encl=0, slot=8.
Sep 11 11:36:50 si-bkupdedup05 kernel: 3w-9xxx: scsi2: AEN: INFO (0x04:0x001F): Unit operational:unit=1.
Sep 11 11:36:53 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:37:02 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x0063): Enclosure added:encl=1.
Sep 11 11:37:03 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=0.
Sep 11 11:37:03 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=1.
Sep 11 11:37:03 si-bkupdedup05 cvlabel: using /usr/cvfs/config/raid-strings for raid type information
Sep 11 11:37:04 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001F): Unit operational:unit=1.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=2.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=3.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=4.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=5.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=6.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=7.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=8.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=9.
Sep 11 11:37:12 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001A): Drive inserted:encl=1, slot=10.
Sep 11 11:37:13 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x001F): Unit operational:unit=3.
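One pattern worth checking in a rescan listing like this is a drive that is inserted only after its unit has already been reported operational once. In this SR's logs, the drives that arrive late in this way (scsi1 encl=0 slot=6, scsi1 encl=1 slot=11, scsi2 encl=0 slot=8, and scsi3 encl=0 slots 8 and 4) appear to be the same slots that hwmond reports after bootup. The Python sketch below is an illustration only, not a tool used on the SR. It assumes the log layout shown above, and it assumes that drives inserted between two "Unit operational" messages for the same unit belong to that unit; late_drives is a hypothetical name.

```python
import re

# Only the two INFO events needed for this check are matched.
EVENT_RE = re.compile(
    r"3w-9xxx: (?P<host>scsi\d+): AEN: INFO \(0x04:0x[0-9A-Fa-f]+\): "
    r"(?P<text>Drive inserted|Unit operational):(?P<args>.*)$"
)

def late_drives(lines):
    """Return (host, encl, slot, unit) for drives inserted after their unit
    had already been reported operational once."""
    seen_units = set()   # (host, unit) pairs already reported operational
    pending = {}         # host -> [(encl, slot), ...] since the last "Unit operational"
    late = []
    for line in lines:
        m = EVENT_RE.search(line.strip())
        if not m:
            continue
        host = m["host"]
        # "encl=0, slot=6." -> {"encl": "0", "slot": "6"}
        args = dict(kv.split("=") for kv in m["args"].rstrip(".").split(", "))
        if m["text"] == "Drive inserted":
            pending.setdefault(host, []).append((int(args["encl"]), int(args["slot"])))
        else:
            key = (host, int(args["unit"]))
            if key in seen_units:
                # Repeat announcement: the drives seen since the last one arrived late.
                late += [(host, encl, slot, key[1]) for encl, slot in pending.get(host, [])]
            seen_units.add(key)
            pending[host] = []
    return late

# Abbreviated scsi1 / encl=0 sequence from the listing above.
prefix = "Sep 11 11:35:36 si-bkupdedup05 kernel: 3w-9xxx: scsi1: AEN: INFO "
sample = [
    prefix + "(0x04:0x001A): Drive inserted:encl=0, slot=0.",
    prefix + "(0x04:0x001A): Drive inserted:encl=0, slot=1.",
    prefix + "(0x04:0x001F): Unit operational:unit=0.",
    prefix + "(0x04:0x001A): Drive inserted:encl=0, slot=2.",
    prefix + "(0x04:0x001A): Drive inserted:encl=0, slot=5.",
    prefix + "(0x04:0x001F): Unit operational:unit=2.",
    prefix + "(0x04:0x001A): Drive inserted:encl=0, slot=6.",
    prefix + "(0x04:0x001F): Unit operational:unit=2.",
]

print(late_drives(sample))
```

A late drive is only a hint. The 3ware controller event logs remain the place to decide whether a drive actually needs to be replaced.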
When this happened, the DXi was shut down to get some manual assistance. We had the QFE reseat all 3ware HBAs, to ensure a good connection.
Sep 11 11:39:12 si-bkupdedup05 shutdown[11273]: shutting down for system halt
Sep 11 11:39:12 si-bkupdedup05 init: Switching to runlevel: 0
Sep 11 11:39:13 si-bkupdedup05 xinetd[10044]: Exiting...
After bootup, the following RAS alerts are issued, indicating multiple RAID/unit failures and drive failures:
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00021>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 21 VINST: C1E0 VPINST: C1E0 EVENT: 7 TEXT: The RAID chassis C1E0 has failed.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00023>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 23 VINST: C1E0SLT6 VPINST: C1E0 EVENT: 118 TEXT: [Hitachi HUA722010CLA330] Needs replacement or has been replaced and is being rebuilt.
Some drives are seen as foreign units, so the raidsets that the drives belonged to show up as “degraded” or “inoperable” ...
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 70 VINST: C1U1V0 VPINST: C1E0 EVENT: 115 TEXT: DEGRADED
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 70 VINST: C1U3V0 VPINST: C1E0 EVENT: 115 TEXT: DEGRADED
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00021>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 21 VINST: C1E1 VPINST: C1E1 EVENT: 7 TEXT: The RAID chassis C1E1 has failed.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00023>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 23 VINST: C1E1SLT11 VPINST: C1E1 EVENT: 118 TEXT: [WDC WD1002FBYS-02A6B0] Needs replacement or has been replaced and is being rebuilt.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 70 VINST: C1U1V0 VPINST: C1E1 EVENT: 115 TEXT: DEGRADED
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 70 VINST: C1U3V0 VPINST: C1E1 EVENT: 115 TEXT: DEGRADED
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00021>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 21 VINST: C2E0 VPINST: C2E0 EVENT: 7 TEXT: The RAID chassis C2E0 has failed.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00023>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 23 VINST: C2E0SLT8 VPINST: C2E0 EVENT: 118 TEXT: [Hitachi HUA722010CLA330] Needs replacement or has been replaced and is being rebuilt.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 70 VINST: C2U1V0 VPINST: C2E0 EVENT: 115 TEXT: DEGRADED
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00021>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 21 VINST: C3E0 VPINST: C3E0 EVENT: 7 TEXT: The RAID chassis C3E0 has failed.
...and some foreign drives are identified:
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00023>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 23 VINST: C3E0SLT4 VPINST: C3E0 EVENT: 118 TEXT: [Hitachi HUA722010CLA330] Needs replacement or has been replaced and is being rebuilt.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00023>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 23 VINST: C3E0SLT8 VPINST: C3E0 EVENT: 118 TEXT: [WDC WD1002FBYS-02A6B0] Needs replacement or has been replaced and is being rebuilt.
Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 70 VINST: C3U1V0 VPINST: C3E0 EVENT: 115 TEXT: DEGRADED
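To see at a glance which enclosure each alert belongs to, the hwmond lines can be grouped by their VPINST field. This is a minimal Python sketch under the assumption that every SRVCLOG line carries the VINST, VPINST, EVENT, and TEXT fields in the order shown above; alerts_by_enclosure is a hypothetical helper name, and the meaning of the event numbers is not interpreted here.

```python
import re
from collections import defaultdict

# Field order assumed from the alert lines in this article.
SRVCLOG_RE = re.compile(
    r"hwmond: \S+:SRVCLOG .*?VINST: (?P<vinst>\S+) VPINST: (?P<vpinst>\S+) "
    r"EVENT: (?P<event>\d+) TEXT: (?P<text>.*)$"
)

def alerts_by_enclosure(lines):
    """Group (VINST, EVENT, TEXT) tuples by the enclosure named in VPINST."""
    grouped = defaultdict(list)
    for line in lines:
        m = SRVCLOG_RE.search(line.strip())
        if m:
            grouped[m["vpinst"]].append((m["vinst"], int(m["event"]), m["text"]))
    return dict(grouped)

# Three alert lines taken from the listing above.
sample = [
    "Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00021>:SRVCLOG RCOMP: 1 RINST: UNKNOWN "
    "VCOMP: 21 VINST: C1E0 VPINST: C1E0 EVENT: 7 TEXT: The RAID chassis C1E0 has failed.",
    "Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00023>:SRVCLOG RCOMP: 1 RINST: UNKNOWN "
    "VCOMP: 23 VINST: C1E0SLT6 VPINST: C1E0 EVENT: 118 TEXT: [Hitachi HUA722010CLA330] "
    "Needs replacement or has been replaced and is being rebuilt.",
    "Sep 11 12:15:35 si-bkupdedup05 hwmond: E0000(1)<00070>:SRVCLOG RCOMP: 1 RINST: UNKNOWN "
    "VCOMP: 70 VINST: C2U1V0 VPINST: C2E0 EVENT: 115 TEXT: DEGRADED",
]

for enclosure, alerts in sorted(alerts_by_enclosure(sample).items()):
    print(enclosure)
    for vinst, event, text in alerts:
        print(f"  {vinst} (event {event}): {text}")
```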
This section shows how you can identify the hardware that is causing the problem, and apply a fix.
On any DXi6xxx with 3ware controllers, the node will have the four controllers shown below, by default. You can see this by giving the command
/opt/DXi/3ware/tw_cli /c0 show
Results:
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-----------------------------------------------------------------------------
u0 RAID-1 OK - - - 931.312 RiW OFF
u1 RAID-1 OK - - - 55.8691 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 OK - - 256K 7450.5 RiW OFF
NOTE: All drives are on C0E0
Controllers 1, 2 and 3 may have multiple enclosures, as in the examples below.
C1 has 4 units, 2 for each array/EM.
You can see this for C1 by giving the following command:
/opt/DXi/3ware/tw_cli /c1 show
Results:
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 OK - - 256K 7450.5 RiW OFF
NOTE: This controller has two enclosures, E0 and E1, so any foreign incomplete/inoperable units would follow with U4, U5 etc.
NOTE: All port listings have been cut to make this document shorter.
VPort Status Unit Size Type Phy Encl-Slot Model
p8 OK u0 59.62 GB SATA - /c1/e0/slt0 SSDSA2SH064G1GC INT
p10 OK u2 59.62 GB SATA - /c1/e1/slt0 SSDSA2SH064G1GC INT
p12 OK u1 931.51 GB SATA - /c1/e0/slt2 Hitachi HUA722010CL
p13 OK u3 931.51 GB SATA - /c1/e1/slt2 Hitachi HUA722010CL
You can see this for C2 by giving the following command:
/opt/DXi/3ware/tw_cli /c2 show
Results: One Array/EM, only two units
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
You can see this for C3 by giving the following command:
/opt/DXi/3ware/tw_cli /c3 show
Results: One Array/EM, only two units
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
Please note the following:
Each Array/EM holds two units:
● 1 (one) 1x2 mirror (60G, 100G or 200G SSD drives) (SSD)
● 1 (one) 1x10 RAID6 (1TB, 2TB or 3TB drives) (DATA)
A controller with 1 Array/EM will have U0 and U1. A controller with 2 Arrays/EMs will have U0 through U3:
● Array/EM #1 U0 and U1
● Array/EM #2 U2 and U3
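The rule above (two units per Array/EM on controllers C1 to C3) can be turned into a quick check of the unit table. The following Python sketch is only an illustration. It assumes the unit rows of the tw_cli "show" output look like the tables in this article, and it does not apply to C0, which has a different layout. The name find_unexpected_units is hypothetical, and the sample text mirrors a C1 table that contains an extra u4.

```python
import re

# A unit row starts with "uN", then the RAID type, then the status.
UNIT_RE = re.compile(r"^(u\d+)\s+(RAID-\d+)\s+(\S+)", re.IGNORECASE)

def find_unexpected_units(show_output, enclosures):
    """Return (unit, type, status) rows whose unit number exceeds what the
    enclosure count allows (two units per Array/EM, numbered from u0)."""
    if enclosures < 1:
        raise ValueError("enclosures must be at least 1")
    expected = {f"u{i}" for i in range(2 * enclosures)}
    unexpected = []
    for line in show_output.splitlines():
        m = UNIT_RE.match(line.strip())
        if m and m[1].lower() not in expected:
            unexpected.append((m[1].lower(), m[2], m[3]))
    return unexpected

sample_c1 = """\
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 DEGRADED - - 256K 7450.5 RiW OFF
u4 RAID-6 INOPERABLE - - 256K 7450.5 Ri OFF
"""

# C1 has two enclosures, so only u0 through u3 are expected.
print(find_unexpected_units(sample_c1, enclosures=2))
```

A unit flagged by this check is only a candidate. Confirm it against the port listing and the controller logs before removing anything.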
At this point, we must identify and fix the incorrect or foreign volumes (invalid units) per LSI/3ware controller card, determine if the drives are still good, and if so, add them back to the corresponding raidset(s).
In this SR example, the following volumes per 3ware controller card were identified as foreign:
Controller: C1
Problem: P30 and P31 became U4
Action: Need to delete unit U4 and make drives part of U3
Command given: (before fix):
/opt/DXi/3ware/tw_cli /c1 show
Results:
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 DEGRADED - - 256K 7450.5 RiW OFF #two drives missing P30 and P31
u4 RAID-6 INOPERABLE - - 256K 7450.5 Ri OFF #should not exist
~
P30 and P31 should have been part of U3, but they were identified as a “foreign unit” and labeled as U4:
p30 OK u4 931.51 GB SATA - /c1/e0/slt6 Hitachi HUA722010CL
p31 OK u4 931.51 GB SATA - /c1/e1/slt11 WDC WD1002FBYS-02A6
From the output above, we know the following:
● C1 has two EMs on it, so there should only be U0, U1, U2 and U3.
● U4 has ONLY two drives in a RAID 6 configuration, so this unit is NOT part of the original units or raidsets.
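Once a suspect unit is known, the ports assigned to it can be pulled from the port section of the same "show" output. This Python sketch is an illustration under the assumption that port rows follow the "VPort Status Unit Size Type Phy Encl-Slot Model" layout used in this article; ports_in_unit is a hypothetical name.

```python
import re

# Port row: pN, status, unit (or "-" when unassigned), ..., /cX/eY/sltZ, model.
PORT_RE = re.compile(
    r"^(?P<port>p\d+)\s+(?P<status>\S+)\s+(?P<unit>u\d+|-)\s+.*?"
    r"(?P<slot>/c\d+/e\d+/slt\d+)\s+(?P<model>.+)$"
)

def ports_in_unit(show_output, unit):
    """Return (port, encl-slot, model) for every port assigned to the given unit."""
    matches = []
    for line in show_output.splitlines():
        m = PORT_RE.match(line.strip())
        if m and m["unit"] == unit:
            matches.append((m["port"], m["slot"], m["model"]))
    return matches

# Port rows taken from the C1 listings in this article.
sample_ports = """\
p12 OK u1 931.51 GB SATA - /c1/e0/slt2 Hitachi HUA722010CL
p13 OK u3 931.51 GB SATA - /c1/e1/slt2 Hitachi HUA722010CL
p30 OK u4 931.51 GB SATA - /c1/e0/slt6 Hitachi HUA722010CL
p31 OK u4 931.51 GB SATA - /c1/e1/slt11 WDC WD1002FBYS-02A6
"""

for port, slot, model in ports_in_unit(sample_ports, "u4"):
    print(port, slot, model)
```

Because the port listings in this article are shortened, the sketch only lists members of a unit; it does not try to judge whether a unit has enough drives.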
To troubleshoot this issue, first look at the 3ware logs and make sure that there were no errors or issues with the two drives on ports p30 and p31. From the RCA done on this SR, we determined that the problem was due to FW version v22 on the 3ware controllers, which was corrected in FW 2.2.1.3. So, a FW upgrade was requested and performed.
Knowing this, and since no errors were found in the logs, we then did the following:
1. Remove the drive in question – this will remove the drive from the controller and will NOT keep the DCB information, meaning that it will become a “new” drive.
/opt/DXi/3ware/tw_cli /c1/p30 remove
/opt/DXi/3ware/tw_cli /c1 show
/opt/DXi/3ware/tw_cli /c1 rescan
The drive will then show up as follows:
p30 OK - 931.51 GB SATA - /c1/e0/slt6 Hitachi HUA722010CL
2. Add the drive back into the RAID that it belongs to:
/opt/DXi/3ware/tw_cli /c1/u3 start rebuild disk=30
3. Rescan to ensure that no other drives become foreign or fail:
/opt/DXi/3ware/tw_cli /c1 rescan
/opt/DXi/3ware/tw_cli /c1 show
4. When you see the drive “rebuilding” into U3, you can do the same thing with the drive in p31:
/opt/DXi/3ware/tw_cli /c1/p31 remove
/opt/DXi/3ware/tw_cli /c1 show
The drive will show up as follows:
p31 OK - 931.51 GB SATA - /c1/e1/slt11 WDC WD1002FBYS-02A6
5. Add the drive back into the RAID that it belongs to.
/opt/DXi/3ware/tw_cli /c1/u3 start rebuild disk=31
6. Rescan to ensure that no other drives become foreign or fail, and that the two drives are “rebuilding”.
/opt/DXi/3ware/tw_cli /c1 rescan
7. Take a look at the results.
/opt/DXi/3ware/tw_cli /c1 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 REBUILDING 2% - 256K 7450.5 RiW OFF
Note: No U4 present any longer
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
p30 DEGRADED u3 931.51 GB SATA - /c1/e0/slt6 Hitachi HUA722010CL
p31 DEGRADED u3 931.51 GB SATA - /c1/e1/slt11 WDC WD1002FBYS-02A6
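The per-drive sequence used above (remove, show, rescan, start rebuild, rescan, show) is the same for every port, so it can be written out as a dry-run plan before anything is typed on the system. The Python sketch below only builds and prints the command strings; it does not execute them. It uses the tw_cli path and subcommands exactly as they appear in this article, and plan_readd is a hypothetical helper name. The example values are the C1 case, with port 30 going back into U3 as stated in the Action line.

```python
TW_CLI = "/opt/DXi/3ware/tw_cli"  # path used throughout this article

def plan_readd(controller, port, unit):
    """Return the tw_cli commands, in the order used in this article, for
    removing one drive from a ghost unit and rebuilding it into its real unit.
    Nothing is executed; the caller reviews the list first."""
    for name, value in (("controller", controller), ("port", port), ("unit", unit)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    c = f"/c{controller}"
    return [
        f"{TW_CLI} {c}/p{port} remove",                     # step 1: drop the drive and its DCB
        f"{TW_CLI} {c} show",
        f"{TW_CLI} {c} rescan",
        f"{TW_CLI} {c}/u{unit} start rebuild disk={port}",  # step 2: add it to the real unit
        f"{TW_CLI} {c} rescan",                             # step 3: confirm nothing else changed
        f"{TW_CLI} {c} show",
    ]

for command in plan_readd(controller=1, port=30, unit=3):
    print(command)
```

Printing the plan first gives a second person, or Service Engineering, the chance to review the controller, port, and unit numbers before any unit is touched.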
Controller: C2
Problem: P19 became U2
Action: Need to remove the U2 drive and make it part of U1
BEFORE:
/opt/DXi/3ware/tw_cli /c2 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 DEGRADED - - 256K 7450.5 RiW OFF
u2 RAID-6 INOPERABLE - - 256K 7450.5 RiW OFF
------------------------------------------------------------------------------
p18 OK u1 931.51 GB SATA - /c2/e0/slt11 Hitachi HUA722010CL
p19 OK u2 931.51 GB SATA - /c2/e0/slt8 Hitachi HUA722010CL
1. Remove the drive in question – this will remove the drive from the controller and will NOT keep the DCB information, meaning that this drive will become a “new” drive.
/opt/DXi/3ware/tw_cli /c2/p19 remove
/opt/DXi/3ware/tw_cli /c2 show
Drive will show up as:
p19 OK - 931.51 GB SATA - /c2/e0/slt8 Hitachi HUA722010CL
2. Add the drive back into the RAID that it belongs to.
/opt/DXi/3ware/tw_cli /c2/u1 start rebuild disk=19
3. Rescan to ensure that no other drives become foreign or fail and that the drive is “rebuilding”.
/opt/DXi/3ware/tw_cli /c2 rescan
AFTER:
/opt/DXi/3ware/tw_cli /c2 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 REBUILDING 2% - 256K 7450.5 RiW OFF
Note: U2 is no longer present
------------------------------------------------------------------------------
p18 OK u1 931.51 GB SATA - /c2/e0/slt11 Hitachi HUA722010CL
p19 DEGRADED u1 931.51 GB SATA - /c2/e0/slt8 Hitachi HUA722010CL
Controller: C3
Problem: P18 and P19 became U3
Action: Need to remove the U3 disks and make them part of U1
BEFORE:
/opt/DXi/3ware/tw_cli /c3 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
u3 RAID-6 INOPERABLE - - 256K 7450.5 Ri OFF
~
P18 and P19 should have been part of U1, but because they were identified as a “foreign unit”, they were labeled as U3:
p18 OK u3 931.51 GB SATA - /c3/e0/slt8 WDC WD1002FBYS-02A6
p19 OK u3 931.51 GB SATA - /c3/e0/slt4 Hitachi HUA722010CL
From the output above, we know that C3 has one EM on it, so there should only be U0 and U1. U3 has ONLY two drives in a RAID 6 configuration, so this unit is NOT part of the original units or RAID sets.
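The reasoning above (an unexpected unit with only two member drives) can be checked mechanically. The sketch below is an illustration only: the function name unit_summary is hypothetical, and it assumes the column layout of the “show” listings in this article (unit rows begin with uN and carry the status in the third column; port rows begin with pN and carry the unit in the third column). It reads the listing on standard input and prints each unit with its status and the number of drives assigned to it.

```shell
#!/bin/sh
# unit_summary: reads "tw_cli /cX show" output on stdin and prints
# "<unit> <status> <member-drive-count>" for every unit row it finds.
unit_summary() {
    awk '
        $1 ~ /^u[0-9]+$/                     { status[$1] = $3 }  # unit table rows
        $1 ~ /^p[0-9]+$/ && $3 ~ /^u[0-9]+$/ { drives[$3]++ }     # port rows assigned to a unit
        END {
            for (u in status)
                printf "%s %s %d\n", u, status[u], drives[u]
        }
    ' | sort
}

# Example (run by hand on the DXi):
# /opt/DXi/3ware/tw_cli /c3 show | unit_summary
```

On the full C3 listing, a line such as “u3 INOPERABLE 2” would stand out against a RAID-6 unit that should span a whole array. Treat the result as a hint to investigate, not as proof that a unit is safe to remove.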
First, look at the 3ware logs and make sure that there were no errors or issues with the two drives on ports p18 and p19. From the RCA done on this SR, we determined that the drives were incorrectly identified because the 3ware controllers had FW v22. This problem was corrected on FW 2.2.1.3, so a FW upgrade was requested and performed. Knowing this, and since no errors were found in the logs, we proceeded to do the following.
1. Remove the drive in question. This removes the drive from the controller and does NOT keep the DCB information, so the drive becomes a “new” drive.
/opt/DXi/3ware/tw_cli /c3/p18 remove
/opt/DXi/3ware/tw_cli /c3 show
/opt/DXi/3ware/tw_cli /c3 rescan
2. Add the drive back into the RAID that it belongs to.
/opt/DXi/3ware/tw_cli /c3/u1 start rebuild disk=18
3. Rescan to ensure that no other drives become foreign or fail.
/opt/DXi/3ware/tw_cli /c3 rescan
/opt/DXi/3ware/tw_cli /c3 show
The drive will show up as:
p18 OK - 931.51 GB SATA - /c3/e0/slt8 WDC WD1002FBYS-02A6
4. Once you see the drive “rebuilding” into U1, you can do the same thing with p19.
/opt/DXi/3ware/tw_cli /c3/p19 remove
/opt/DXi/3ware/tw_cli /c3 show
The drive will show up as:
p19 OK - 931.51 GB SATA - /c3/e0/slt4 Hitachi HUA722010CL
5. Add the drive back into the RAID it belongs to.
/opt/DXi/3ware/tw_cli /c3/u1 start rebuild disk=19
6. Rescan to ensure that no other drives become foreign or fail, and that the two drives are “rebuilding”.
/opt/DXi/3ware/tw_cli /c3 rescan
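Step 4 depends on seeing the first drive rebuilding before the second drive is touched. A minimal sketch of that check follows, assuming the “show” layout used in this article; the function name rebuild_started is hypothetical. It reads a “show” listing on standard input and succeeds only when the unit row reports REBUILDING and the port row lists that unit.

```shell
#!/bin/sh
# rebuild_started <port> <unit>: reads "tw_cli /cX show" output on stdin.
# Exit status 0 only if the unit row reports REBUILDING and the port row
# lists that unit; exit status 1 otherwise.
rebuild_started() {
    awk -v p="p$1" -v u="u$2" '
        $1 == u && $3 == "REBUILDING" { rebuilding = 1 }
        $1 == p && $3 == u            { member = 1 }
        END { exit !(rebuilding && member) }
    '
}

# Example (run by hand on the DXi, before removing p19):
# /opt/DXi/3ware/tw_cli /c3 show | rebuild_started 18 1 && echo "p18 is rebuilding into u1"
```

If the check fails, review the full “show” output and the 3ware logs before continuing with the next drive.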
AFTER:
/opt/DXi/3ware/tw_cli /c3 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 REBUILDING 2% - 256K 7450.5 RiW OFF
Note: U3 is no longer present
------------------------------------------------------------------------------
p18 DEGRADED u1 931.51 GB SATA - /c3/e0/slt8 WDC WD1002FBYS-02A6
p19 DEGRADED u1 931.51 GB SATA - /c3/e0/slt4 Hitachi HUA722010CL
As a final check, run a “show” command against all controllers to ensure that all is OK:
/opt/DXi/3ware/tw_cli /c0 show
NOTE: Make sure that all units have an “OK”, “REBUILDING”, or “INITIALIZING” status.
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 931.312 RiW OFF
u1 RAID-1 OK - - - 55.8691 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 OK - - 256K 7450.5 RiW OFF
All drives should show an “OK” status (only port P8 is shown here; there can be as many as 31 ports, depending on the number of arrays/EMs):
VPort Status Unit Size Type Phy Encl-Slot Model
------------------------------------------------------------------------------
p8 OK u1 59.62 GB SATA - /c0/e0/slt5 SSDSA2SH064G1GC INT
~
Name OnlineState BBUReady Status Volt Temp Hours LastCapTest
---------------------------------------------------------------------------
bbu On Yes OK OK OK 0 xx-xxx-xxxx
Look at the status for C1:
/opt/DXi/3ware/tw_cli /c1 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
u2 RAID-1 OK - - - 55.8691 RiW OFF
u3 RAID-6 OK - - 256K 7450.5 RiW OFF
All drives should show an “OK” status, or “DEGRADED” if the unit is being rebuilt (only port P8 is shown here; there can be as many as 31 ports, depending on the number of arrays/EMs):
VPort Status Unit Size Type Phy Encl-Slot Model
------------------------------------------------------------------------------
p8 OK u0 59.62 GB SATA - /c1/e0/slt0 SSDSA2SH064G1GC INT
~
Name OnlineState BBUReady Status Volt Temp Hours LastCapTest
---------------------------------------------------------------------------
bbu On Yes OK OK OK 0 xx-xxx-xxxx
Look at the status for C2:
/opt/DXi/3ware/tw_cli /c2 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
All drives should show an “OK” status (only port P8 is shown here; there can be as many as 31 ports, depending on the number of arrays/EMs):
VPort Status Unit Size Type Phy Encl-Slot Model
------------------------------------------------------------------------------
p8 OK u0 59.62 GB SATA - /c2/e0/slt0 SSDSA2SH064G1GC INT
~
Name OnlineState BBUReady Status Volt Temp Hours LastCapTest
---------------------------------------------------------------------------
bbu On Yes OK OK OK 0 xx-xxx-xxxx
Look at the status for C3:
/opt/DXi/3ware/tw_cli /c3 show
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-1 OK - - - 55.8691 RiW OFF
u1 RAID-6 OK - - 256K 7450.5 RiW OFF
All drives should show an “OK” status (only port P8 is shown here; there can be as many as 31 ports, depending on the number of arrays/EMs):
VPort Status Unit Size Type Phy Encl-Slot Model
------------------------------------------------------------------------------
p8 OK u0 59.62 GB SATA - /c3/e0/slt0 SSDSA2SH064G1GC INT
~
Name OnlineState BBUReady Status Volt Temp Hours LastCapTest
---------------------------------------------------------------------------
bbu On Yes OK OK OK 0 xx-xxx-xxxx
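The final check across C0 through C3 can also be scripted. The sketch below is an illustration under stated assumptions: the function name bad_units is hypothetical, the accepted statuses are the three named in the NOTE above (any status beginning with INITIALIZ is accepted, since the exact spelling reported by the controller is assumed rather than confirmed here), and the column layout is the one shown in the listings in this article. It prints every unit row whose status is not on that list and returns a non-zero exit status if it finds one.

```shell
#!/bin/sh
# bad_units <label>: reads "tw_cli /cX show" output on stdin and prints
# "<label> <unit> <status>" for each unit whose status is not OK,
# REBUILDING, or INITIALIZ*. Exit status is 1 if any such unit is found.
bad_units() {
    awk -v tag="$1" '
        $1 ~ /^u[0-9]+$/ && $3 !~ /^(OK|REBUILDING|INITIALIZ[A-Z]*)$/ {
            print tag, $1, $3
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Example (run by hand on the DXi; four controllers as in this article):
# for c in 0 1 2 3; do
#     /opt/DXi/3ware/tw_cli /c$c show | bad_units "c$c"
# done
```

No output for a controller means that none of its unit rows needs attention under this rule. The drive and BBU rows still need to be checked by eye as described above.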
NOTE: Don’t forget to issue the command “chkconfig heartbeat on” before you reboot the DXi. This ensures that when the customer reboots the DXi, it comes all the way online and does not go into diagnostics mode.
If you need further assistance, please contact Service Engineering before running any of the commands in question. Take extra precautions when you delete units. If you accidentally delete a unit, it will be NON-RECOVERABLE, and data loss will occur!
This page was generated by the BrainKeeper Enterprise Wiki, © 2018