SR3508190 Failed to allocate stripe group space, metadata full

SR Information: 3508190 GWDG / Max Planck

 

Problem Description: Failed to allocate stripe group space, metadata full

 

Product / Software Version:

 

MDC:

SNMS 4.7.0.1

SuSE 10 SP3 64 bit (2.6.16.60-0.54.5)

 

 Overview

The customer is trying to add another data stripe group to an existing file system that has two metadata stripe groups (SG0 and SG7): one was created with the file system and the other was added some time later.

When cvupdatefs is run, it fails with *Failed to allocate stripe group space, metadata full*, even though SG7 has 67GB of free metadata space (and SG0 has 5GB free).

 

 

Symptoms & Identifying the problem

 

## 1 ## Log Review:

 

cvupdatefs

 
The following changes have been detected in the configuration

Please review these changes carefully.

 

Stripe Group Name  Stripe Status  MetaData   Journal

=================  =============  ========   =======

sg0                No Change      No Change  No Change

sg1                No Change                         

sg2                No Change                         

sg3                No Change                         

sg4                No Change                         

sg5                No Change                         

sg6                No Change                         

sg7                No Change      No Change          

sg8                Create                            

 

 

This will modify the file system *UNI05*.

Are you sure you want to continue? [y/N] Flushing journal entries...  done

Initializing stripe group sg8 information...

*Fatal*: Failed to allocate stripe group space, metadata full

*Error*: ERROR

 

- validated config file -> OK

- validated free space MetaData SG -> OK

- changed AllocationStrategy from round to balance, even though the case is not an exact match for Bug 36127

- checked cvlabel output -> OK

 

cvadmin - show output for MD SG

 

Stripe Group 0 [sg0]  Status:Up,MetaData,Journal,Exclusive

  Total Blocks:3199710 (97.65 GB)  Reserved:0 (0.00 B) Free:167552 (5.11 GB) (5%)

Stripe Group 7 [sg7]  Status:Up,MetaData,Exclusive

  Total Blocks:3199710 (97.65 GB)  Reserved:0 (0.00 B) Free:2221442 (67.79 GB) (69%)

 

 

 

## 2 ## Troubleshooting:

 

We decided to reproduce the failure and collect an strace this time.

 

The trace shows that it is failing while searching the bitmap for space in SG7, which we know has plenty of free space overall.


 write(1, *(Debug): bm_find_space: Reading *..., 160(Debug): bm_find_space: Reading bitmap start blk 0x2d1532 end blk0x2d153e read blk 0x2d1532
      read_block_hint 0x0 sg_hint 7 stripe_group 7 alloc_blocks 0x7a2) = 160
   write(1, *\n*, 1)                       = 1
   pread(5, *\37\370\0\0\0\0?\361\367\374\0\0\0\0\0\0\0\7\303\300\0*..., 32768, 96815611904) = 32768
   write(1, *(Debug): bm_find_space: 3Resetin*..., 108(Debug): bm_find_space: 3Reseting start and end bits from 0x3 and 0xc to -1 while using read block 0x2d1532) = 108
   write(1, *\n*, 1)                       = 1
,,,
   write(1, *(Debug): bm_find_space: 3Resetin*..., 118(Debug): bm_find_space: 3Reseting start and end bits from 0x30d2d6 and 0x30d2dd to -1 while using read block 0x2d153e) = 118
   write(1, *\n*, 1)                       = 1
   write(1, *(Debug): bm_find_space: Reached *..., 90(Debug): bm_find_space: Reached end of bit map, read_block 0x2d153f bm_end_block 0x2d153e) = 90write(1, *\n*, 1)                       = 1
   write(1, *(Debug): Inode 0x56b3e7: Alloc 0*..., 72(Debug): Inode 0x56b3e7: Alloc 0x7a2 blocks at block 0xffffffffffffffff) = 72
   write(4, *(Debug): Inode 0x56b3e7: Alloc 0*..., 72) = 72
   write(1, **Fatal*: Failed to allocate stri*..., 62*Fatal*: Failed to allocate stripe group space, metadata full

 

So how come? Reviewing the code in cvfsupdatefs.c, we can see that the number of required blocks is calculated in blks_to_alloc, which is then processed by bm_find_space().

 

In our case the allocation failed, so let's check how much space is required. We know the formula from the source code, so let's create the following C file for the calculation and compile it with the GNU compiler.

 

#include <stdio.h>
#include <sys/types.h>

#define FsBlockSize 0x8000 /* From Config File or cvfsdb show icb */
#define RNDUPFS(a) (((a) + (FsBlockSize - 1)) & ~(FsBlockSize -1)) /* From globals.h */

int main(void)
{
    unsigned int  d_blocksize = FsBlockSize;
    unsigned int  blks_to_alloc = 0;
    unsigned int  total_blocks = 511982592; /* From cvfsck -nv */

    /* total_blocks / 8 bytes, rounded up to whole file system blocks */
    blks_to_alloc = RNDUPFS(total_blocks >> 3) / d_blocksize;
    printf("Need 0x%x contiguous blocks\n", blks_to_alloc);

    return 0; /* main() is declared int, so it must return a value */
}

 

For Example:

 

[root@Leipzig]# gcc -o block_calc block_calc.c

[root@Leipzig]# ./block_calc

Need 0x7a2 contiguous blocks

 

So 1954 contiguous blocks are required. Let's check the free space fragmentation for SG7 by running "cvfsck -f".

 

 

-- Free Block Fragmentation Analysis - Stripe Group "sg7" --

 

Pct. (sum)   Chunk Size   Chunk Count

-----------  ----------   -----------

<1% ( <1%)           1            1

<1% ( <1%)           3         1310

<1% ( <1%)           4         1809

<1% ( <1%)           5          968

<1% ( 69%)         552            1

<1% ( 69%)         563            1

<1% ( 69%)         570            1

<1% ( 69%)         576            1

 

We can see that the MD stripe group is heavily fragmented. In this case 1954 contiguous blocks were required, but the largest chunk reported by "cvfsck -f" was only 576 blocks, despite 67GB being free. The excessive fragmentation was traced to the customer's workflow, which frequently creates and deletes directories.

 

Resolutions/workarounds/fixes:

 

  • To work around this issue, the file system was expanded by a 100G MD LUN. cvfsck -f revealed it had more than 3 million contiguous blocks.
  • Another attempt at the data SG expansion (18TB) then succeeded.
  • CR54767 has been opened to improve the error messaging and to address the fragmentation caused by frequent directory creation/deletion.

 

 

What we learn from this case:

- Adding stripe groups requires contiguous metadata space

- An MD SG can't be defragmented (at least in 4.x)

- Use the Source, Luke

- Review Steve Coles Debug Rules

 

 

 

 


