SR3524886 foreign SCSI Reservation on AEL500 causes reservation conflict

 

SR Information: 3524886 Lithuanian National Courts Administration

 

Problem Description: All Drives offline due to Reservation conflict

 

Product / Software Version:

M661XL + AEL500, SN Version: 5.0.1

 

AEL500  firmware version: 660G.GS007

HP LTO6 firmware version: J53Z   

 

SAN switch model: HP StorageWorks 8/8 SAN switch

Firmware version: v6.4.1

 

 Overview

A foreign persistent reservation on all drives prevents TSM from preempting it and placing its own reservation key on the drives. The drives are then taken offline due to a SCSI reservation conflict.

 

Symptoms & Identifying the problem

 

## 1 ##  Log Review

 

 

RAS Alert:

Physical Media Changer tape drive F3A255A000 : taken offline

Detail:

Drive SN F3A255A000 encountered a persistent reservation failure and is being taken offline

 

 

 

TSM Tac Logs

Mar 31 11:27:30.744773 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5cfg4572:  SCSI-3 reservation not created (1) (key: 0x534e00e0ed4182b6) on component V0,1 (/dev/sg80, SN:F3A255A004)

Mar 31 11:27:30.745387 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5utl373:  SCSI-3 reservation not preempted (3) (holder: 0x4144494347574159on component V0,1 (/dev/sg80, SN: F3A255A004)

Mar 31 11:27:30.748830 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5cfg4572:  SCSI-3 reservation not created (1) (key: 0x534e00e0ed4182b6) on component V0,2 (/dev/sg79, SN:F3A255A094)

Mar 31 11:27:30.749425 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5utl373:  SCSI-3 reservation not preempted (3) (holder: 0x4144494347574159on component V0,2 (/dev/sg79, SN: F3A255A094)

Mar 31 11:27:30.752835 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5cfg4572:  SCSI-3 reservation not created (1) (key: 0x534e00e0ed4182b6) on component V0,3 (/dev/sg57, SN:F3A255A000)

Mar 31 11:27:30.753445 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5utl373:  SCSI-3 reservation not preempted (3) (holder: 0x4144494347574159on component V0,3 (/dev/sg57, SN: F3A255A000)
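The affected devices and the two competing keys can be extracted from these log lines mechanically. The following is a minimal Python sketch, assuming the message format shown in the excerpt above; the function name `parse_reservation_errors` and the regular expressions are illustrative and not part of StorNext.

```python
import re

# Patterns modelled on the two message types in the excerpt above.
# The holder pattern treats the closing parenthesis as optional because
# the excerpt shows the holder key running straight into "on component".
KEY_RE = re.compile(
    r"reservation not created .*?\(key: (0x[0-9a-fA-F]{16})\) "
    r"on component (\S+) \((/dev/\w+), SN:\s*(\w+)\)"
)
HOLDER_RE = re.compile(
    r"reservation not preempted .*?\(holder: (0x[0-9a-fA-F]{16})\)?\s*"
    r"on component (\S+) \((/dev/\w+), SN:\s*(\w+)\)"
)


def parse_reservation_errors(lines):
    """Return {device: {"serial": ..., "own_key": ..., "holder": ...}}."""
    result = {}
    for line in lines:
        for regex, field in ((KEY_RE, "own_key"), (HOLDER_RE, "holder")):
            match = regex.search(line)
            if match:
                key, _component, device, serial = match.groups()
                entry = result.setdefault(device, {"serial": serial})
                entry[field] = key
    return result


# Two lines copied from the excerpt above.
SAMPLE = [
    "Mar 31 11:27:30.744773 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5cfg4572:  "
    "SCSI-3 reservation not created (1) (key: 0x534e00e0ed4182b6) on component V0,1 "
    "(/dev/sg80, SN:F3A255A004)",
    "Mar 31 11:27:30.745387 MDC-node1 sntsm fsconfig[12407]: E1200(7)<76936>:mui5utl373:  "
    "SCSI-3 reservation not preempted (3) (holder: 0x4144494347574159on component V0,1 "
    "(/dev/sg80, SN: F3A255A004)",
]

if __name__ == "__main__":
    for device, info in parse_reservation_errors(SAMPLE).items():
        print(device, info)
```

Grouping by device makes it easy to confirm that every drive reports the same foreign holder key.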

 

StorNext TSM key: 0x534e00e0ed4182b6

Current holder (a different server): 0x4144494347574159
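A reservation key is an 8-byte value, and rendering it as ASCII can give a hint about which software registered it. The sketch below is a small Python helper for that; the names `key_to_ascii` and `looks_like_tsm_key` are illustrative. It only decodes the bytes. Which host actually registered the foreign key is not established by this case.

```python
def key_to_ascii(key_hex):
    """Render an 8-byte reservation key as ASCII; non-printable bytes become '.'."""
    raw = int(key_hex, 16).to_bytes(8, "big")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def looks_like_tsm_key(key_hex):
    """True if the key starts with 0x534e, the prefix this article reports for TSM keys."""
    return int(key_hex, 16).to_bytes(8, "big")[:2] == bytes([0x53, 0x4E])


if __name__ == "__main__":
    for key in ("0x534e00e0ed4182b6", "0x4144494347574159"):
        print(key, key_to_ascii(key), "TSM" if looks_like_tsm_key(key) else "foreign")
```

For the two keys in this case, the TSM key renders as `SN...A..` and the holder key renders as the fully printable string `ADICGWAY`.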

 

 

/var/log/messages

Mar 31 11:27:30 MDC-node1 kernel: scsi 11:0:0:0: reservation conflict

Mar 31 11:27:30 MDC-node1 kernel: scsi 10:0:1:0: reservation conflict

Mar 31 11:27:30 MDC-node1 kernel: scsi 10:0:0:0: reservation conflict

 

 

 

## 2 ## Troubleshooting:

 

 

[root@MDC-node1 test1]# fs_scsi -R /dev/sg80
Device: /dev/sg80 (SN: F3A255A004)

-------------------------------------------------------------------------
persistent reserve IN read keys data:  length=24
          00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
          -- -- -- --  -- -- -- --  -- -- -- --  -- -- -- --
7FFF45EFC190: 00 00 00 16  00 00 00 10  41 44 49 43  47 57 41 59
7FFF45EFC1A0: 53 4e 00 e0  ed 41 82 b6
PR generation=0x16, Additional length=16
2 registered reservation key(s) follow:
    0x4144494347574159
    0x534e00e0ed4182b6

Even though the MDC's key is registered and the MDC is logged in, the MDC is not allowed to reserve the drive.


-------------------------------------------------------------------------
persistent reserve IN read reservation data:  length=27
          00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
          -- -- -- --  -- -- -- --  -- -- -- --  -- -- -- --
7FFF45EFC190: 00 00 00 16  00 00 00 10  41 44 49 43  47 57 41 59
7FFF45EFC1A0: 00 00 00 00  00 03 00 00  36 2d 53 43
PR generation=0x16, Reservation follows:
    Key=0x4144494347574159
Device: /dev/sg79 (SN: F3A255A094)

The customer was not able to identify the host or WW(P)N that corresponds to the foreign reservation key.
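The hex dumps above can also be decoded by hand. The Python sketch below assumes the PERSISTENT RESERVE IN parameter data layout from the SCSI Primary Commands standard: a 4-byte PR generation, a 4-byte additional length, then either a list of 8-byte keys (read keys) or a single reservation descriptor whose byte 21 carries scope and type (read reservation). The function names are illustrative and not part of fs_scsi, and the bytes printed after the declared additional length are left out.

```python
import struct

# Persistent reservation type codes as defined in SPC-3.
PR_TYPES = {
    0x1: "Write Exclusive",
    0x3: "Exclusive Access",
    0x5: "Write Exclusive - Registrants Only",
    0x6: "Exclusive Access - Registrants Only",
    0x7: "Write Exclusive - All Registrants",
    0x8: "Exclusive Access - All Registrants",
}


def parse_read_keys(data):
    """Decode READ KEYS data into (generation, [key, ...])."""
    generation, add_len = struct.unpack_from(">II", data, 0)
    keys = ["0x" + data[off:off + 8].hex() for off in range(8, 8 + add_len, 8)]
    return generation, keys


def parse_read_reservation(data):
    """Decode READ RESERVATION data into (generation, descriptor or None)."""
    generation, add_len = struct.unpack_from(">II", data, 0)
    if add_len == 0:
        return generation, None  # no reservation is held
    scope, pr_type = data[21] >> 4, data[21] & 0x0F
    return generation, {
        "key": "0x" + data[8:16].hex(),
        "scope": scope,
        "type": PR_TYPES.get(pr_type, hex(pr_type)),
    }


# Bytes copied from the /dev/sg80 dumps above.
READ_KEYS = bytes.fromhex(
    "00 00 00 16 00 00 00 10 41 44 49 43 47 57 41 59 53 4e 00 e0 ed 41 82 b6"
)
READ_RESERVATION = bytes.fromhex(
    "00 00 00 16 00 00 00 10 41 44 49 43 47 57 41 59 00 00 00 00 00 03 00 00"
)

if __name__ == "__main__":
    print(parse_read_keys(READ_KEYS))
    print(parse_read_reservation(READ_RESERVATION))
```

Under that layout, the reservation on /dev/sg80 decodes as type 3 (Exclusive Access) held by key 0x4144494347574159, which is consistent with the MDC being registered but unable to reserve the drive.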

 

 

 

Drive dumps show which hosts are currently logged in to the drive.


HP Drive Dump Analysis with HP L&TT

 

 

UDS_01_HUJ4321BVC

|__ Logged-In Host Table

    ||__ WW Node Name              | WW Port Name              |    Source ID |    Port Login Time |   Port ID |   Host ID |   Rel. ID

    ||__ 20:00:00:24:ff:78:0e:7e   | 21:00:00:24:ff:78:0e:7e   |        10e00 |     18328:21:13:04 |         0 |         1 |         1

    ||__ 20:00:00:24:ff:78:0e:82   | 21:00:00:24:ff:78:0e:82   |        10b00 |     16643:14:50:03 |         0 |         2 |         1

    ||__ 10:00:00:05:33:3c:65:e9   | 21:fd:00:05:33:3c:65:e9   |       fffc01 |     43753:00:51:25 |         0 |         3 |         1

 

UDS_02_HUJ4321C00

|__ Logged-In Host Table

    ||__ WW Node Name              | WW Port Name              |    Source ID |    Port Login Time |   Port ID |   Host ID |   Rel. ID

    ||__ 10:00:00:05:33:3c:ee:63   | 21:fd:00:05:33:3c:ee:63   |       fffc01 |     42776:04:29:56 |         0 |         3 |         1

 

Note: This drive dump was collected with TSM down, so only the “Fibre switch” is logged in to the drive.

 

UDS_-12_HUJ4321BVD

|__ Logged-In Host Table

    ||__ WW Node Name              | WW Port Name              |    Source ID |    Port Login Time |   Port ID |   Host ID |   Rel. ID

    ||__ 20:00:00:24:ff:78:0e:7e   | 21:00:00:24:ff:78:0e:7e   |        10e00 |     18328:21:01:56 |         0 |         1 |         1

    ||__ 10:00:00:05:33:3c:65:e9   | 21:fd:00:05:33:3c:65:e9   |       fffc01 |     42040:01:49:41 |         0 |         2 |         1

    ||__ 20:00:00:24:ff:78:0e:82   | 21:00:00:24:ff:78:0e:82   |        10b00 |     16643:14:39:20 |         0 |         3 |         1

 

 

 

SAN Zoning 1:

 

alias: f1_MDC_node2_P3

                21:00:00:24:ff:78:0e:7e

 

zone:  f1_MDC_node2_2_f1_Qtape01_tapes

                21:00:00:24:ff:78:0e:7e

                50:03:08:c3:a2:55:a0:01

                50:03:08:c3:a2:55:a0:95

 

alias: f1_MDC_node1_P3

                21:00:00:24:ff:78:0e:82

 

zone:  f1_MDC_node1_2_f1_Qtape01_tapes

                21:00:00:24:ff:78:0e:82

                50:03:08:c3:a2:55:a0:01

                50:03:08:c3:a2:55:a0:95

 

 

Only the MDCs and the FC switch are logged in to the drives; no other host is logged in.

Knowing that, we can proceed to clear the foreign SCSI reservation.
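The comparison between the logged-in host tables and the zoning can be written down as a small check. The Python sketch below uses the WWPNs, source IDs and aliases from the excerpts above. Treating a source ID that starts with `fffc` as the FC switch is an assumption based on the note in this article and on the Fibre Channel domain controller address range; the function names are illustrative.

```python
# WWPN -> alias, taken from the SAN zoning excerpt.
MDC_ALIASES = {
    "21:00:00:24:ff:78:0e:7e": "f1_MDC_node2_P3",
    "21:00:00:24:ff:78:0e:82": "f1_MDC_node1_P3",
}

# (WW Port Name, Source ID) per drive dump, taken from the host tables.
LOGGED_IN = {
    "UDS_01_HUJ4321BVC": [
        ("21:00:00:24:ff:78:0e:7e", "10e00"),
        ("21:00:00:24:ff:78:0e:82", "10b00"),
        ("21:fd:00:05:33:3c:65:e9", "fffc01"),
    ],
    "UDS_02_HUJ4321C00": [
        ("21:fd:00:05:33:3c:ee:63", "fffc01"),
    ],
    "UDS_-12_HUJ4321BVD": [
        ("21:00:00:24:ff:78:0e:7e", "10e00"),
        ("21:fd:00:05:33:3c:65:e9", "fffc01"),
        ("21:00:00:24:ff:78:0e:82", "10b00"),
    ],
}


def classify(wwpn, source_id):
    """Name a logged-in port, or return 'UNKNOWN' if it cannot be explained."""
    if wwpn in MDC_ALIASES:
        return MDC_ALIASES[wwpn]
    if source_id.lower().startswith("fffc"):  # assumption: switch login
        return "FC switch"
    return "UNKNOWN"


def unknown_hosts(logged_in):
    """Return {drive: [wwpn, ...]} for logins that match neither an MDC nor the switch."""
    return {
        drive: [wwpn for wwpn, sid in entries if classify(wwpn, sid) == "UNKNOWN"]
        for drive, entries in logged_in.items()
    }


if __name__ == "__main__":
    for drive, hosts in unknown_hosts(LOGGED_IN).items():
        print(drive, hosts or "no unknown hosts")
```

An empty list for every drive corresponds to the conclusion above that no other host is logged in.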

 



Resolutions/workarounds/fixes:

 

-       Neither Release nor Unregister under the fs_scsi hidden option 46 (Prout) succeeds on a foreign reservation

-       Clearing the reservation or removing the registration of a foreign host with sg3_utils fails with “persistent reserve out: transport: Host_status=0x11 is invalid”

-       The customer rebooted the AEL500, but the persistent reservation held by the foreign host remained

 

 

 

 

After a power cycle of the library and a restart of TSM, all reservations were cleared and TSM was able to register and place its persistent reservation on the drives.

 

[root@MDC-node1 ~]# fs_scsi -R
Device: /dev/sg80 (SN: F3A255A004)
-------------------------------------------------------------------------
persistent reserve IN read keys data:  length=16
          00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
          -- -- -- --  -- -- -- --  -- -- -- --  -- -- -- --
7FFF23A69270: 00 00 00 01  00 00 00 08  53 4e 00 e0  ed 41 82 b6
PR generation=0x1, Additional length=8
1 registered reservation key(s) follow:
    0x534e00e0ed4182b6

-------------------------------------------------------------------------
persistent reserve IN read reservation data:  length=27
          00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
          -- -- -- --  -- -- -- --  -- -- -- --  -- -- -- --
7FFF23A69270: 00 00 00 01  00 00 00 10  53 4e 00 e0  ed 41 82 b6
7FFF23A69280: 00 00 00 00  00 03 00 00  36 2d 53 43
PR generation=0x1, Reservation follows:    Key=0x534e00e0ed4182b6

 

 

What we learned from this case:

 

·        Check the drive logs to see which hosts are logged in

·        To display the registered hosts and the reservation on a drive, you can use:

-         fs_scsi -R

-         fs_scsi -> 36: Prin -> 0 - Read keys | 1 - Read reservation

-         sg_persist -r <dev>

 

·        The TSM reservation key always starts with 0x534e

·        A persistent reservation on i500 drives cannot be removed by rebooting the library or resetting the drives. It requires either a hard power cycle or re-seating the drives
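For repeated checks across several drives, the sg_persist queries can be wrapped in a few lines of Python. This is a minimal sketch, assuming sg3_utils is installed and that the device is passed as a positional argument as in the list above; the function names are illustrative. It only issues PERSISTENT RESERVE IN (read-only) queries.

```python
import shutil
import subprocess

# Long-form sg_persist options for the two read-only queries.
QUERY_FLAGS = {
    "keys": "--read-keys",
    "reservation": "--read-reservation",
}


def sg_persist_argv(device, what):
    """Build the sg_persist command line for a PERSISTENT RESERVE IN query."""
    return ["sg_persist", "--in", QUERY_FLAGS[what], device]


def query(device, what):
    """Run the query and return sg_persist's text output."""
    if shutil.which("sg_persist") is None:
        raise RuntimeError("sg_persist (sg3_utils) not found in PATH")
    completed = subprocess.run(
        sg_persist_argv(device, what),
        capture_output=True,
        text=True,
        check=False,  # a reservation conflict should not raise here
    )
    return completed.stdout + completed.stderr


if __name__ == "__main__":
    # Device names are the ones from this case; adjust them for the system at hand.
    for dev in ("/dev/sg80", "/dev/sg79", "/dev/sg57"):
        print(" ".join(sg_persist_argv(dev, "keys")))
```

The main block only prints the commands, so they can be reviewed before anything is sent to a drive.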


