Kdump: Identifying a Kernel Panic/Oops/Crash and Gathering the Kdump with the .bz2 Option

Overview

This topic describes the steps to take when a kernel panic/oops/crash followed by a Kdump is observed. First, identify the issue from the customer GUI (if applicable), from the DXi CLI, or by reviewing the collect log. Then compress the Kdump directory using the tar jcvf option and transfer the archive to Quantum.

 


Symptom

A kernel panic/oops/crash followed by a Kdump has occurred. Once it is identified, the Kdump is needed for root cause analysis of the kernel panic. In most cases, the panic causes the Blockpool to shut down uncleanly, which leads to a Blockpool Verify from journal.

 


Cause

Software, hardware, and customer environment issues can lead to a kernel panic/oops/crash followed by a Kdump. The Kdump creates additional logging information, including the contents of memory at the time of the kernel panic.

 


Resolution

Identify the kernel panic/oops/crash and Kdump, then gather the Kdump for further analysis. Steps 1 through 3 below identify the event from the customer GUI (if applicable), from the DXi CLI, or from the collect log. Step 4 gathers the Kdump using the tar jcvf option.

Step 1:  Reviewing the customer GUI if access is permitted: 

A:  Log in to the GUI.

 

B:  On the Home Page, click Ticket and review the information.

 

C:  Search under “Details” for “I/O Server KDUMP : Software fault” and review the Ticket for additional information and time of the occurrence.

 

D:  Click the Ticket Number and review “Summary:” and “Opened At:” (for example, “Opened At: Tue Oct 22 2013 – 04:22:03 PM PDT”).

 

Step 2: Reviewing the customer DXi from CLI:

A:  Log in to the DXi from the CLI via ssh or another method (examples below).

 

B:  # grep 'Kernel panic/oops/crash has happened' /usr/adic/SRVCLOG/logs/srvcLog.hist

Oct 22 16:22:03 2013 1 UNKNOWN 1 KDUMP UNKNOWN 1129581492 8 Kernel panic/oops/crash has happened   Ticket creation time: 10/22 16:22:03 PDT

 

C:  # grep KDUMP /var/log/messages*

Oct 22 16:03:46 SES4520DXi67 KDUMP: CRIT : KERNEL PANIC/OOPS/CRASH has occurred

Oct 22 16:03:47 SES4520DXi67 KDUMP: INFO :     *********begin  stack traces ********

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #0 [ffff810540c4bdf0] crash_kexec at ffffffff800ae9f2

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #1 [ffff810540c4beb0] sysrq_handle_crashdump at ffffffff801b9324

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #2 [ffff810540c4bec0] __handle_sysrq at ffffffff801b90d7

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #3 [ffff810540c4bf00] write_sysrq_trigger at ffffffff8010955e

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #4 [ffff810540c4bf10] vfs_write at ffffffff800168ea

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #5 [ffff810540c4bf40] sys_write at ffffffff800171b9

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #6 [ffff810540c4bf80] system_call at ffffffff8005e116

Oct 22 16:03:51 SES4520DXi67 KDUMP:          RIP: 00002b7878953f80  RSP: 00007fff762188b0  RFLAGS: 00010246

<<< Removed extra information and Continued >>>

Oct 22 16:03:51 SES4520DXi67 KDUMP: INFO :     *********end  stack traces ********

Oct 22 16:03:51 SES4520DXi67 KDUMP: INFO : Copying kernel crash dump

Oct 22 16:04:29 SES4520DXi67 KDUMP: INFO : Successfully collected kernel core by excluding unwanted pages!

Oct 22 16:04:33 SES4520DXi67 KDUMP: INFO : Kernel crash dump copied successfully

Oct 22 16:04:33 SES4520DXi67 KDUMP: INFO : Rebooting the system.....

Oct 22 16:18:37 SES4520DXi67 KDUMP: INFO : waiting to mount '/snfs' for (5 min)

Oct 22 16:22:03 SES4520DXi67 KDUMP: WARN : There were no email ids configured through GUI to send email notification

Oct 22 16:22:03 SES4520DXi67 srvclogcli: E0000(1)<1129581492>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 1 VINST: KDUMP VPINST: UNKNOWN EVENT: 8 TEXT: Kernel panic/oops/crash has happened   Ticket creation time: 10/22 16:22:03 PDT

Oct 22 16:22:41 SES4520DXi67 KDUMP: INFO : Uncompressed kernel dump was saved @ '/snfs/Kdumps/kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s'

Oct 22 16:26:01 SES4520DXi67 KDUMP: WARN : There were no email ids configured through GUI to send email notification
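The dump location reported in the "kernel dump was saved @" message can also be pulled out of the log with a short filter. The following is a minimal sketch, not a Quantum-supplied tool: the function name kdump_saved_paths is hypothetical, and the pattern assumes the message wording matches the example output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DXi software): reads syslog text on
# stdin and prints the directory named in each
# "... kernel dump was saved @ '<path>'" KDUMP message.
kdump_saved_paths() {
    sed -n "s/.*KDUMP: INFO : .*kernel dump was saved @ '\([^']*\)'.*/\1/p"
}

# Demonstration using the message from the example output above.
sample="Oct 22 16:22:41 SES4520DXi67 KDUMP: INFO : Uncompressed kernel dump was saved @ '/snfs/Kdumps/kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s'"
dump_dir=$(printf '%s\n' "$sample" | kdump_saved_paths)
echo "Dump directory: $dump_dir"
```

On a DXi, the same function could be fed the live log, for example: kdump_saved_paths < /var/log/messages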

 

Step 3:  Reviewing the collect log for the event (examples below):

A:  Download the collect log (System Diag File) for review. After unzipping the collect log, search for the following.

 

B:  # grep 'Kernel panic/oops/crash has happened' scratch/collect/node1-collection/app-info/srvcLog.hist

Oct 22 16:22:03 2013 1 UNKNOWN 1 KDUMP UNKNOWN 1129581492 8 Kernel panic/oops/crash has happened   Ticket creation time: 10/22 16:22:03 PDT

 

C:  # grep KDUMP scratch/collect/node1-collection/os-info/messages*

Oct 22 16:03:46 SES4520DXi67 KDUMP: CRIT : KERNEL PANIC/OOPS/CRASH has occurred

Oct 22 16:03:47 SES4520DXi67 KDUMP: INFO :     *********begin  stack traces ********

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #0 [ffff810540c4bdf0] crash_kexec at ffffffff800ae9f2

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #1 [ffff810540c4beb0] sysrq_handle_crashdump at ffffffff801b9324

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #2 [ffff810540c4bec0] __handle_sysrq at ffffffff801b90d7

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #3 [ffff810540c4bf00] write_sysrq_trigger at ffffffff8010955e

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #4 [ffff810540c4bf10] vfs_write at ffffffff800168ea

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #5 [ffff810540c4bf40] sys_write at ffffffff800171b9

Oct 22 16:03:51 SES4520DXi67 KDUMP:          #6 [ffff810540c4bf80] system_call at ffffffff8005e116

Oct 22 16:03:51 SES4520DXi67 KDUMP:          RIP: 00002b7878953f80  RSP: 00007fff762188b0  RFLAGS: 00010246

<<< Removed extra information and Continued >>>

Oct 22 16:03:51 SES4520DXi67 KDUMP: INFO :     *********end  stack traces ********

Oct 22 16:03:51 SES4520DXi67 KDUMP: INFO : Copying kernel crash dump

Oct 22 16:04:29 SES4520DXi67 KDUMP: INFO : Successfully collected kernel core by excluding unwanted pages!

Oct 22 16:04:33 SES4520DXi67 KDUMP: INFO : Kernel crash dump copied successfully

Oct 22 16:04:33 SES4520DXi67 KDUMP: INFO : Rebooting the system.....

Oct 22 16:18:37 SES4520DXi67 KDUMP: INFO : waiting to mount '/snfs' for (5 min)

Oct 22 16:22:03 SES4520DXi67 KDUMP: WARN : There were no email ids configured through GUI to send email notification

Oct 22 16:22:03 SES4520DXi67 srvclogcli: E0000(1)<1129581492>:SRVCLOG RCOMP: 1 RINST: UNKNOWN VCOMP: 1 VINST: KDUMP VPINST: UNKNOWN EVENT: 8 TEXT: Kernel panic/oops/crash has happened   Ticket creation time: 10/22 16:22:03 PDT

Oct 22 16:22:41 SES4520DXi67 KDUMP: INFO : Uncompressed kernel dump was saved @ '/snfs/Kdumps/kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s'
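The srvcLog.hist search in this step can be wrapped so that it also reports a missing file, which helps when the collect log was unpacked somewhere unexpected. This is a sketch only: the function name collect_kdump_events is hypothetical, and the node1-collection path is taken from the example above, so other nodes or collect layouts may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DXi software).
# $1: directory in which the collect log was unpacked.
# Prints matching srvcLog.hist lines; returns 2 if the file is missing,
# otherwise grep's status (0 = event found, 1 = no event found).
collect_kdump_events() {
    hist="$1/scratch/collect/node1-collection/app-info/srvcLog.hist"
    [ -r "$hist" ] || { echo "missing: $hist" >&2; return 2; }
    # -F treats the search text as a fixed string, not a regular expression.
    grep -F 'Kernel panic/oops/crash has happened' "$hist"
}

# Demonstration against a throwaway directory that mimics the collect layout.
demo=$(mktemp -d)
mkdir -p "$demo/scratch/collect/node1-collection/app-info"
printf '%s\n' 'Oct 22 16:22:03 2013 1 UNKNOWN 1 KDUMP UNKNOWN 1129581492 8 Kernel panic/oops/crash has happened   Ticket creation time: 10/22 16:22:03 PDT' \
    > "$demo/scratch/collect/node1-collection/app-info/srvcLog.hist"
event_count=$(collect_kdump_events "$demo" | wc -l)
echo "Kdump events found: $event_count"
rm -rf "$demo"
```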

 

Step 4:  Compressing the Kdump from the DXi CLI using the tar jcvf option, then transferring it to Quantum:

A:  Change directory to the location reported in the messages file (example below):

# cd /snfs/Kdumps

 

# ls -al

total 32960

drwxr-xr-x  3 root root 2055 Oct 22 16:26 .

drwxrwxrwx 24 root root 2051 Oct 22 16:18 ..

drwxr-xr-x  2 root root 2056 Oct 22 16:22 kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s

-r--r--r--  1 root root   85 Oct 22 16:26 .__latestDump.txt

 

# ls -l kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s

total 7598144

-rw-r--r-- 1 root root      66125 Oct 22 16:04 config-2.6.18-164.15.1.qtm.4-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

-rw-r--r-- 1 root root       1314 Oct 22 16:22 dumpstatus-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

-rw-r--r-- 1 root root  194989665 Oct 22 16:04 Dxitsunami.log-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

-rw-r--r-- 1 root root    1244207 Oct 22 16:04 System.map-2.6.18-164.15.1.qtm.4-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

-rw-r--r-- 1 root root    7476424 Oct 22 16:04 varlogmessages-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

-rw------- 1 root root 6852432656 Oct 22 16:04 vmcore-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

-rwxr-xr-x 1 root root   59228432 Oct 22 16:04 vmlinux-2.6.18-164.15.1.qtm.4_BVFK5M1_2013-10-22_16h03m46s

 

B:  # tar jcvf kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2 kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s (This takes several minutes, depending on the amount of memory in the DXi.)

 

C:  # ls -al kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2

-rw-rw-r-- 1 root root 823446728 Oct 24 11:15 kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2

 

D:  # /usr/bin/md5sum kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2 > md5sum_kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2

 

E:  # cat md5sum_kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2

c69a91eded8c0630f4f6a118fb77bd85  kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2

 

F:  Both kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2 and md5sum_kdump_BVFK5M1_SES4520DXi67_2013-10-22_16h03m46s.tar.bz2 can now be transferred to gps.quantum.com, or sent by your preferred method (NOTE: use binary mode if transferring by FTP).
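Two checks can make Step 4 more robust: confirming there is free space before compressing, and confirming that the archive still matches its checksum after it has been copied. The sketch below is illustrative only. The function names has_room_for_archive and verify_checksum are hypothetical, and the free-space test deliberately uses the uncompressed directory size as a pessimistic estimate of the archive size.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the DXi software.

# Succeeds if the filesystem holding directory $2 has at least as many free
# KB as directory $1 currently occupies.
has_room_for_archive() {
    need_kb=$(du -sk "$1" | awk '{print $1}')
    # df -P keeps each filesystem on one line; field 4 is the available KB.
    free_kb=$(df -Pk "$2" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge "$need_kb" ]
}

# Succeeds if the file listed in checksum file $1 still matches its MD5.
# md5sum -c resolves the listed file name relative to the current directory,
# so change into the checksum file's directory first.
verify_checksum() {
    ( cd "$(dirname "$1")" && md5sum -c "$(basename "$1")" )
}

# Demonstration with throwaway files standing in for a real Kdump.
demo=$(mktemp -d)
mkdir "$demo/kdump_example"
printf 'sample payload\n' > "$demo/kdump_example/vmcore"
if has_room_for_archive "$demo/kdump_example" "$demo"; then
    echo "enough free space to compress"
fi
printf 'stand-in archive\n' > "$demo/kdump_example.tar.bz2"
( cd "$demo" && md5sum kdump_example.tar.bz2 > md5sum_kdump_example.tar.bz2 )
if verify_checksum "$demo/md5sum_kdump_example.tar.bz2"; then
    echo "checksum matches"
fi
rm -rf "$demo"
```

Because the checksum file records the bare archive name, running verify_checksum on the receiving side only requires that both files sit in the same directory.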

 


 


