Hardware: How to Identify and Interpret SCSI Sense Codes on a DXi Platform
This article provides information on how to analyze SCSI sense codes on a DXi system.
When a target receives a SCSI command from an initiator, a return code is generated, which shows the result of the command. This return code is called a SCSI Sense Code (or SCSI Sense Data). For a full list of SCSI Command Operation Codes, see CDB Structure Explained, which describes the SCSI Command Descriptor Block (CDB) structure that displays in the messages file.
SCSI Sense Codes consist of the following fields:
Status Code: This is the completion status of the SCSI command. A code of 00h means that the target processed the command successfully. The possible Status Codes are:
Code | Name
00h | GOOD
02h | CHECK CONDITION
04h | CONDITION MET
08h | BUSY
18h | RESERVATION CONFLICT
28h | TASK SET FULL
30h | ACA ACTIVE
40h | TASK ABORTED
If the command isn't processed successfully and returns the CHECK CONDITION code, the target will provide additional information in the Sense Key, ASC, and ASCQ fields:
Code | Name
0h | NO SENSE
1h | RECOVERED ERROR
2h | NOT READY
3h | MEDIUM ERROR
4h | HARDWARE ERROR
5h | ILLEGAL REQUEST
6h | UNIT ATTENTION
7h | DATA PROTECT
8h | BLANK CHECK
9h | VENDOR SPECIFIC
Ah | COPY ABORTED
Bh | ABORTED COMMAND
Dh | VOLUME OVERFLOW
Eh | MISCOMPARE
Fh | COMPLETED
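The two tables above can be kept as lookup tables when reading logs. The following is a minimal Python sketch, not part of any DXi tooling; the names `STATUS_CODES`, `SENSE_KEYS`, and `describe` are hypothetical and exist only for illustration.

```python
# Lookup tables copied from the Status Code and Sense Key tables above.
STATUS_CODES = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
    0x30: "ACA ACTIVE",
    0x40: "TASK ABORTED",
}

SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0x9: "VENDOR SPECIFIC",
    0xA: "COPY ABORTED",
    0xB: "ABORTED COMMAND",
    0xD: "VOLUME OVERFLOW",
    0xE: "MISCOMPARE",
    0xF: "COMPLETED",
}


def describe(status, sense_key=None):
    """Return a readable description of a status code and optional sense key."""
    text = f"status {status:02X}h = {STATUS_CODES.get(status, 'UNKNOWN')}"
    # The sense key is only meaningful when the status is CHECK CONDITION.
    if status == 0x02 and sense_key is not None:
        text += f", sense key {sense_key:X}h = {SENSE_KEYS.get(sense_key, 'UNKNOWN')}"
    return text


print(describe(0x02, 0x3))
```

Codes that are missing from the tables are reported as UNKNOWN rather than guessed.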
The list of possible ASC/ASCQ combinations is very long. For a list of the standard SCSI codes used by all vendors, see http://www.t10.org/lists/asc-num.txt.
SCSI Sense Codes are fully listed in the controller logs. Depending on your DXi platform, you may want to look at the PERC logs (DSET logs, such as those available on the DXi4500 and DXi8500), the 3ware logs (available on DXi67xx systems), or the NetApp array log (available on the DXi7500 and DXi8500). These logs have various formats and messages, but they all report the SCSI sense code if it is other than 00h.
Sep 18 10:35:37 txslnnodxi85001 Server Administrator: Storage Service EventID: 2095 Unexpected sense. SCSI sense data: Sense key: 5 Sense code: 24 Sense qualifier: 0: Physical Disk 0:0:5 Controller 0, Connector 0
Sep 21 05:23:37 txslnnodxi85001 Server Administrator: Storage Service EventID: 2095 Unexpected sense. SCSI sense data: Sense key: 3 Sense code: 11 Sense qualifier: 1: Physical Disk 0:0:8 Controller 0, Connector 0
May 3 17:18:40 DXi141M2 kernel: sdw: Current: sense key: Recovered Error
May 3 17:18:40 DXi141M2 kernel: <<vendor>> ASC=0xe0 ASCQ=0xbASC=0xe0 ASCQ=0xb
Jul 15 18:39:19 chsddcueyvtl03 kernel: 138 [RAIDarray.mpp]Qarray1:1:1:15 Medium error, ASC/ASCQ 0x11/0x0
Jul 15 18:39:19 chsddcueyvtl03 kernel: 492 [RAIDarray.mpp]Qarray1:1:1:15 IO FAILURE. vcmnd SN 2381510460 pdev H11:C0:T1:L15 0x03/0x11/0x00 0x08000002 mpp_status:1
Jul 15 18:39:19 chsddcueyvtl03 kernel: sd 14:0:0:15: SCSI error: return code = 0x08000002
Jul 15 18:39:19 chsddcueyvtl03 kernel: sdq: Current: sense key: Medium Error
Jul 15 18:39:19 chsddcueyvtl03 kernel: Add. Sense: Unrecovered read error
Aug 2 22:53:31 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:encl=1, slot=9.
Aug 2 22:53:36 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:encl=1, slot=9, LBA=0x4E29C016.
Aug 2 22:53:44 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:encl=1, slot=9, LBA=0x56F1560C.
Aug 2 22:53:51 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:encl=1, slot=9, LBA=0x56F1568A.
Aug 2 22:54:01 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:encl=1, slot=9, LBA=0x56F15819.
Aug 2 22:54:07 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:encl=1, slot=9, LBA=0x56F15A04.
Aug 2 22:54:15 uddcbpdxi01 kernel: 3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:encl=1, slot=9, LBA=0x66159802.
Because different DXi systems may use different disk controllers (as shown above), when you do a Storage or DSET collect, you must be familiar with each controller log. The following subsections give examples of how various controllers report SCSI Sense Codes.
2.1. LSI 3ware Controller
//uddcbpdxi01> /c1 show diag
### Time Stamp: 14:35:48 08-Aug-2012
### Host Name: uddcbpdxi01
### Host Architecture: x86_64 (64 bit)
### OS Version: Linux 2.6.18-164.15.1.qtm.4
### Model: 9690SA-8E
### Serial #: L340405B1240392
### Controller ID: 1
### CLI Version: 2.00.11.021
### API Version: 2.08.00.025
### Driver Version: 2.26.08.005-2.6.18
### Firmware Version: FH9X 4.08.00.022
### BIOS Version: BE9X 4.08.00.001
### Available Memory: 448MB
==========================================================================
Diagnostic Information on Controller //uddcbpdxi01/c1 ...
--------------------------------------------------------------------------
Event Trigger and Log Information:
Triggered Event(s) =
ctlreset (controller soft reset)
fwassert (firmware assert)
driveerr (drive error)
Trigger event counter for ctlrreset = N/A
Trigger event counter for fwassert = N/A
Trigger event counter for driveerr = N/A
Diagnostic trigger event counters are not supported.
--------------------------------------------------------------------------
: Invalid command opcode (EC:0x101, SK=0x05, ASC=0x20, ASCQ=0x00, SEV=01, Type=0x70) opcode=0x3C (SAF-TE)
Error, Unit 65: Invalid command opcode
(EC:0x101, SK=0x05, ASC=0x20, ASCQ=0x00, SEV=01, Type=0x70)
opcode=0x3C (SAF-TE)
param : Table 0x000B, param 0x02, size 2
Legacy opcode=0x52 error=0x10B
E=010B T=14:41:01 : Parameter table does not exist
E=010B T=14:41:01 U=0 : Return error status to host
Note that the information above was obtained at the tw_cli prompt by running the command '/c1 show diag'. The same information can be found in the storage collect for the 6xxx platforms.
In the example above, we have:
SK (sense key) = 05 = illegal request
ASC/ASCQ = 20/00 = invalid command operation code
Opcode = 3C = read buffer
The Opcode is the operation that resulted in the reported sense codes. For information about Opcodes, see CDB Structure Explained.
Here, we have a 'read buffer' command that failed, because of an invalid command operation code.
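When a diag dump contains many such entries, the fields can be pulled out with a regular expression. The Python sketch below assumes the entry looks like the one shown above (SK, ASC, and ASCQ as 0x-prefixed hex, followed by an `opcode=0x..` field); `parse_3ware` is a hypothetical helper, not a tw_cli feature.

```python
import re

# "SK=0x05, ASC=0x20, ASCQ=0x00" as printed in the 3ware diagnostic log.
TW_SENSE = re.compile(
    r"SK=0x([0-9A-Fa-f]+),\s*ASC=0x([0-9A-Fa-f]+),\s*ASCQ=0x([0-9A-Fa-f]+)"
)
# First "opcode=0x.." field; the plain words "command opcode" do not match.
TW_OPCODE = re.compile(r"\bopcode=0x([0-9A-Fa-f]+)")


def parse_3ware(entry):
    """Return sense key, ASC, ASCQ, and opcode from one diag entry, or None."""
    sense = TW_SENSE.search(entry)
    if sense is None:
        return None
    sk, asc, ascq = (int(value, 16) for value in sense.groups())
    opcode = TW_OPCODE.search(entry)
    return {
        "sk": sk,
        "asc": asc,
        "ascq": ascq,
        "opcode": int(opcode.group(1), 16) if opcode else None,
    }


entry = """Error, Unit 65: Invalid command opcode
(EC:0x101, SK=0x05, ASC=0x20, ASCQ=0x00, SEV=01, Type=0x70)
opcode=0x3C (SAF-TE)"""
print(parse_3ware(entry))
```

The result can then be matched against the Sense Key table and the T10 ASC/ASCQ list, as done by hand in the example above.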
2.3. NetApp Array Controller
This example was collected from the MEL (Major Event Log), which is gathered by collect, and also by the Storage Log, on the 75xx and 85xx:
Storage Array 3a:
Date/Time: 3/15/12 10:34:06 AM
Sequence number: 3291
Event type: 100A
Event category: Error
Priority: Informational
Description: Drive returned CHECK CONDITION Event specific codes: b/88/3 Component type: Drive Component location: Tray 0, Slot 4 Logged by: Controller in slot B
In the message above we have:
SK (sense key) = 0B = aborted command
ASC/ASCQ = 88/03 = vendor specific (not a standard T10 code)
Opcode = not shown in the Description field
Note that the Description field doesn't provide the operation code. This information is available in the mini hexdump, which requires advanced skills in NetApp MEL analysis.
Also, notice that the ASC/ASCQ code 88/03 isn't listed on the t10.org web page. This is because hardware vendors can create vendor-specific sense codes that apply only to their hardware.
To find information about the vendor-specific sense codes and opcodes for NetApp arrays, consult NetApp, or your senior (or backline) engineer.
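A quick way to tell whether an ASC/ASCQ pair is worth looking up in the T10 list is to check its range: the T10 list reserves ASC values 80h through FFh, and ASCQ values 80h through FFh, for vendor-specific use. The Python sketch below applies that check to the "Event specific codes" field, assuming the field is ordered sense key/ASC/ASCQ in hex, as interpreted above. The function names are hypothetical.

```python
import re

# "Event specific codes: b/88/3" as it appears in the MEL Description field.
MEL_CODES = re.compile(
    r"Event specific codes:\s*([0-9A-Fa-f]+)/([0-9A-Fa-f]+)/([0-9A-Fa-f]+)"
)


def parse_mel_codes(description):
    """Return (sense_key, asc, ascq) from a MEL description, or None."""
    match = MEL_CODES.search(description)
    if match is None:
        return None
    # Assumption: the three values are hex, in SK/ASC/ASCQ order.
    return tuple(int(value, 16) for value in match.groups())


def is_vendor_specific(asc, ascq):
    """True if the ASC or ASCQ falls in the vendor-specific range (80h-FFh)."""
    return asc >= 0x80 or ascq >= 0x80


description = ("Drive returned CHECK CONDITION Event specific codes: b/88/3 "
               "Component type: Drive Component location: Tray 0, Slot 4")
sk, asc, ascq = parse_mel_codes(description)
print(f"SK={sk:X}h ASC={asc:02X}h ASCQ={ascq:02X}h "
      f"vendor-specific={is_vendor_specific(asc, ascq)}")
```

A pair flagged as vendor specific will not appear in the T10 list, so it has to be resolved with the hardware vendor.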
2.4. Dell PERC Controller
For each PERC controller, you'll find a file named Controller_#.log. In the example shown below:
● The example was taken from an 85xx that has PERC controllers H700 and H800.
● The data below was collected from the DSET log file: /data/dell/RAID Controllers/Controller_0.log
PERC H700 Integrated 0:
80 00 3f 00 28 55 06 01 02 00 c0 00 00 00 87 03 c0 00 00 00 21 1a 40 00 14 c0 c0 00 00 00 a4 ff ff
08/20/13 7:05:25: EVT#11880-08/20/13 7:05:25: 113=Unexpected sense: PD 08(e0x20/s8) Path 500000e118c68f02, CDB: 28 00 00 09 aa a0 00 00 20 00, Sense: 3/11/01
08/20/13 7:05:25: Raw Sense for PD 8: f0 00 03 00 09 aa a4 28 00 00 00 00 11 01 00 80 00 3f 00 28 55 06 01 02 00 c0 00 00 00 87 03 c0 00 00 00 21 1a 40 00 14 c0 c0 02 00 00 88 ff ff
08/20/13 7:05:25: DEV_REC:Medium Error DevId[8] devHandle e RDM=807f2e00 retires=0
In the example above, we have the following:
SK (sense key) = 03 = medium error
ASC/ASCQ = 11/01 = read retries exhausted
Opcode = 28 = read
The Opcode was collected from the CDB field (for details, see CDB Structure Explained).
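The "Raw Sense" line carries the same triplet in binary form. The Python sketch below decodes it on the assumption that the data is in the standard fixed sense format (response code 70h or 71h in byte 0, sense key in the low nibble of byte 2, information field in bytes 3 to 6, ASC in byte 12, ASCQ in byte 13), and decodes the CDB as a 10-byte READ command. Both helper names are hypothetical.

```python
def decode_fixed_sense(hex_bytes):
    """Decode fixed-format sense data given as space-separated hex bytes."""
    data = bytes.fromhex(hex_bytes)
    # Bit 7 of byte 0 is the VALID bit; the low 7 bits are the response code.
    if data[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "sense_key": data[2] & 0x0F,
        "information": int.from_bytes(data[3:7], "big"),
        "asc": data[12],
        "ascq": data[13],
    }


def decode_read10_cdb(hex_bytes):
    """Decode opcode, starting LBA, and block count from a 10-byte CDB."""
    cdb = bytes.fromhex(hex_bytes)
    return {
        "opcode": cdb[0],
        "lba": int.from_bytes(cdb[2:6], "big"),
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# First 16 bytes of the Raw Sense line and the CDB from the example above.
sense = decode_fixed_sense("f0 00 03 00 09 aa a4 28 00 00 00 00 11 01 00 80")
cdb = decode_read10_cdb("28 00 00 09 aa a0 00 00 20 00")
print(f"Sense: {sense['sense_key']:X}/{sense['asc']:02X}/{sense['ascq']:02X}")
print(f"Opcode {cdb['opcode']:02X}h, LBA 0x{cdb['lba']:X}, {cdb['blocks']} blocks")
print(f"Information field: 0x{sense['information']:X}")
```

Under these assumptions, the information field (0x9AAA4) falls inside the range requested by the CDB (32 blocks starting at 0x9AAA0), which is consistent with it being the block that could not be read.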