How to Resolve Kernel dmesg Errors: ext4_journal_check_start

Overview

SR Information: SR3690966, SR3749066 and QUANTUM-1348

Product / Software Version: This issue occurs on 6 TB HDDs in S30 Storage Nodes (NSS1121_72), on all Lattus software versions.

Problem Description: Three disks (sda, sdd and sdg) reported ext4 file system errors, which raised "Kernel dmesg errors detected" events. Example:

Kernel dmesg errors detected
EXT4-fs (sdg1): error count: 2
EXT4-fs (sda1): error count: 4
EXT4-fs (sdg1): initial error at 1447102452: ext4_journal_check_start:56
EXT4-fs (sda1): initial error at 1447103176: ext4_journal_check_start:56
EXT4-fs (sda1): last error at 1466158419: ext4_find_entry:1309: inode 9834481
EXT4-fs (sdg1): last error at 1447102452: ext4_journal_check_start:56
EXT4-fs (sdd1): error count: 5
EXT4-fs (sdd1): initial error at 1447102909: ext4_read_inode_bitmap:175
EXT4-fs (sdd1): last error at 1447102940: ext4_journal_check_start:56
Possible solution:
Investigate the root cause of the kernel dmesg errors

Severity: ERROR
Machine: storage003
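
The "initial error at" and "last error at" values in the event above are Unix epoch seconds, which makes it hard to see when the errors actually occurred. The following is a minimal sketch for converting them to UTC. It assumes GNU date (for `-d @EPOCH`) and a sed that supports `-E`; the function name `ext4_error_times` is hypothetical and not part of any Lattus tooling.

```shell
#!/bin/sh
# Sketch: make the epoch timestamps in EXT4-fs error lines readable.
# Assumes GNU date and sed -E. ext4_error_times is a hypothetical helper name.
ext4_error_times() {
    # Reads kernel log text on stdin and prints, for every
    # "initial error" / "last error" line: device, kind, UTC time, location.
    sed -nE 's/^.*EXT4-fs \(([^)]+)\): (initial|last) error at ([0-9]+): (.*)$/\1 \2 \3 \4/p' |
    while read -r dev kind epoch where; do
        when=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
        printf '%s %s %s UTC %s\n' "$dev" "$kind" "$when" "$where"
    done
}
```

Typical use would be `dmesg | ext4_error_times`. For the first sdg1 line of the example event this prints `sdg1 initial 2015-11-09 20:54:12 UTC ext4_journal_check_start:56`.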

 

HGST had no recommendations beyond running fsck, which had already been attempted.


Procedure

Take the following steps to resolve the file system errors. The examples use /dev/sdd1 on storage003; repeat the steps for each affected file system:

  1. Check whether the file system is busy. If a storage daemon holds files open on it, identify the daemon from its PID and restart it:

root@storage003:~# lsof /dev/sdd1

COMMAND  PID      USER   FD   TYPE DEVICE SIZE/OFF      NODE NAME
dss.bin 2048 dssdaemon  668w   REG   8,49      518 359944105 /mnt/dss/dss11/blockstore/ns_v2/0/1/_tlog/21.tlog
dss.bin 2048 dssdaemon  859w   REG   8,49 34218981 359957403 /mnt/dss/dss11/blockstore/ns_v2/0/0/_tlog/192.tlog

root@storage003:~# ps -ef | grep 2048

1002      2048     1  0 Jun23 ?        23:08:45 /opt/qbase3/bin/dss -d --storagedaemon /opt/qbase3/cfg/dss/storagedaemons/ddc6b962-2c2e-4c27-80ed-2aab0de206c1.cfg
root     23561 21925  0 13:18 pts/0    00:00:00 grep --color=auto 2048

root@storage003:~# qshell -c "q.dss.storagedaemons.restartOne('ddc6b962-2c2e-4c27-80ed-2aab0de206c1')"
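
The lookup in this step (lsof to find the PID, ps to find the daemon configuration file, then the GUID for restartOne) can be sketched as a small helper. This is an illustration only: the function names are hypothetical, and the `.../storagedaemons/<guid>.cfg` path layout is assumed from the ps output shown above.

```shell
#!/bin/sh
# Sketch: find the storage daemon GUID(s) behind the PIDs that keep a device busy.
# Function names are hypothetical; the path layout is assumed from the ps output above.

# Extract the GUID from a command line containing .../storagedaemons/<guid>.cfg
daemon_guid_from_cmdline() {
    printf '%s\n' "$1" | sed -nE 's#.*/storagedaemons/([0-9a-fA-F-]+)\.cfg.*#\1#p'
}

# Print one GUID per storage daemon that has files open on the device.
busy_daemon_guids() {
    dev=$1
    # lsof -t prints bare PIDs; ps -o args= prints the full command line.
    for pid in $(lsof -t "$dev" 2>/dev/null | sort -u); do
        daemon_guid_from_cmdline "$(ps -o args= -p "$pid")"
    done | sort -u
}
```

Running `busy_daemon_guids /dev/sdd1` would list the GUIDs to pass to `q.dss.storagedaemons.restartOne(...)`. Processes that are not storage daemons produce no output and need to be handled manually.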

 

  2. Unmount the file system:

root@storage003:~# umount /dev/sdd1
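
Because fsck should not repair a mounted file system, it can be worth confirming that the unmount really took effect before continuing. A minimal sketch, assuming the device appears in /proc/mounts under the same name used on the command line; `is_mounted` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a device is still listed as mounted.
# is_mounted is a hypothetical helper; the second argument exists only so the
# check can be pointed at a file other than /proc/mounts (for example in tests).
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # Field 1 of each /proc/mounts line is the mounted device.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}
```

For example, `is_mounted /dev/sdd1 && echo "still mounted, do not run fsck yet"`.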

 

  3. Run fsck to automatically correct errors:

root@storage003:~# fsck -a /dev/sdd1

fsck from util-linux 2.20.1
dss11 contains a file system with errors, check forced.
dss11: 11899067/366288896 files (0.0% non-contiguous), 838945709/1465130385 blocks

  4. Run fsck again to verify that the file system is 'clean':

root@storage003:~# fsck -a /dev/sdd1
fsck from util-linux 2.20.1
dss11: clean, 11899067/366288896 files, 838945709/1465130385 blocks
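
Besides the printed summary, fsck reports its result through an exit status that is a sum of bit flags, as documented in the fsck(8) manual page. The sketch below decodes that status so the two runs can be checked without reading the output; `describe_fsck_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: decode the bit-flag exit status of fsck (values per fsck(8)).
# describe_fsck_status is a hypothetical helper name.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    result=""
    for entry in \
        "1:errors corrected" \
        "2:reboot recommended" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared-library error"
    do
        bit=${entry%%:*}
        text=${entry#*:}
        # Append the description of every flag that is set in the status.
        if [ $(( status & bit )) -ne 0 ]; then
            result="${result:+$result, }$text"
        fi
    done
    echo "$result"
}
```

For example, `fsck -a /dev/sdd1; describe_fsck_status $?`. The repair run would be expected to report "errors corrected" and the verification run "no errors".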

 

  5. Mount the file system:

root@storage003:~# mount /dev/sdd1
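
When several file systems are affected, the unmount, repair, verify and mount steps can be chained per device. The following is a sketch only, assuming the storage daemon has already been restarted as in step 1 and that the device has an /etc/fstab entry (as the single-argument mount above implies). The names `run`, `repair_fs` and the `DRY_RUN` variable are hypothetical. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch: chain unmount, repair, verify and mount for one device.
# run, repair_fs and DRY_RUN are hypothetical names.
# With DRY_RUN unset or 1 the commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

repair_fs() {
    dev=$1
    run umount "$dev" || return 1
    run fsck -a "$dev"
    rc=$?
    # fsck exit status 0 (no errors) or 1 (errors corrected) is acceptable here.
    [ "$rc" -le 1 ] || return "$rc"
    # The verification pass is expected to find a clean file system (status 0).
    run fsck -a "$dev" || return 1
    # Relies on an /etc/fstab entry for the device.
    run mount "$dev"
}
```

`repair_fs /dev/sdd1` prints the four commands for review; `DRY_RUN=0 repair_fs /dev/sdd1` would execute them and stop at the first step that fails.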

 

 


 


