Space Management: How It Works, and How to Verify Settings in the Seer Configuration File

Overview

This article is based on the information available at http://10.40.164.31/dxwiki/GalaxySoftware/SpaceManagementModes.

 

This article explains how the space management threshold may vary between different DXi models. It also discusses where you can find the space management settings.

 

IMPORTANT NOTE: This article is for internal use only.

 


GalaxySoftware/SpaceManagement Modes

The Space Manager Daemon (spaced) polls the available disk space on the system every 15 seconds. It then issues BSEM (BigSky Event Manager) event notifications when it detects that a pre-defined threshold has been crossed. There are four space management modes that various components (dedupd, qbfsd, bpgc, and replicationd) respond to.

The contents of the /data/hurricane/conf/SpaceManagerStateMachine.conf file are different in various Galaxy releases.

Note: The label "Pre-Galaxy 2.0" marks behavior that differs between Galaxy 2.x and any version before 2.0.

Pre-Galaxy 2.0 Only: Truncation Mode

The system enters this mode when the used space exceeds a preset percentage of disk capacity, which was originally 70%. The purpose is to truncate enough deduplicated files to reclaim space. The system exits Truncation Mode when the used space falls below about 65%.
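
The entry and exit levels above form a simple hysteresis band. The following is a minimal Python sketch of that behavior, assuming the original 70% entry level and the approximate 65% exit level. The function name and the boolean state are hypothetical and are not taken from spaced.

```python
ENTER_PERCENT = 70.0  # original Truncation Mode entry level from the text
EXIT_PERCENT = 65.0   # approximate exit level from the text


def next_truncation_state(in_truncation: bool, used_percent: float) -> bool:
    """Return whether the system is in Truncation Mode after this poll."""
    if not in_truncation:
        # Enter only when used space exceeds the entry level.
        return used_percent > ENTER_PERCENT
    # Once in the mode, stay there until used space falls below the exit level.
    return used_percent >= EXIT_PERCENT


state = False
history = []
for used in (60.0, 71.0, 68.0, 66.0, 64.0, 68.0):
    state = next_truncation_state(state, used)
    history.append(state)
print(history)
```

In this sketch, 68% used keeps the mode active on the way down but does not trigger it again on the way back up, which is the point of using two levels.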

In addition:

 

Note: This information is a bit stale. Since PTR 7709, the truncation threshold has been different for various DXi models:

 

Throttle Mode

Note: Truncation only applies to Pre-Galaxy 2.0.

The system enters this mode when used disk space reaches roughly 95% of capacity. The actual threshold depends on the total capacity of the system and is calculated as follows:

 500 GB (base) + 100 GB for every 10 TB of capacity
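
The formula can be sketched in Python as shown here. Two points are assumptions, not statements from the source: that only complete 10 TB increments count (integer division), and that the result is a level in GB. The capacities in the loop are illustrative values only, not actual DXi models.

```python
BASE_GB = 500          # base value from the formula above
STEP_GB = 100          # added for every 10 TB of capacity
STEP_CAPACITY_TB = 10


def throttle_threshold_gb(capacity_tb: int) -> int:
    """Threshold in GB for a system with the given total capacity in TB.

    Assumption: only complete 10 TB increments are counted.
    """
    if capacity_tb < 0:
        raise ValueError("capacity_tb must be non-negative")
    return BASE_GB + STEP_GB * (capacity_tb // STEP_CAPACITY_TB)


# Illustrative capacities only (hypothetical, not real model sizes).
for capacity_tb in (10, 40, 100):
    print(capacity_tb, "TB ->", throttle_threshold_gb(capacity_tb), "GB")
```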

Examples of the throttling threshold:

 

In this state:

 

The throttle delay starts at 16 ms and doubles every minute until it reaches the maximum of 1024 ms. In this mode, BPGC is triggered automatically, regardless of the current Space Reclamation schedule. Truncation mode, as described above, will be active.
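
The delay progression described above can be sketched as follows. This is only an illustration: it assumes the doubling happens at whole-minute boundaries counted from entry into Throttle Mode, and the function name is hypothetical.

```python
INITIAL_DELAY_MS = 16
MAX_DELAY_MS = 1024


def throttle_delay_ms(minutes_in_throttle: int) -> int:
    """Throttle delay after a whole number of minutes in Throttle Mode."""
    if minutes_in_throttle < 0:
        raise ValueError("minutes_in_throttle must be non-negative")
    delay = INITIAL_DELAY_MS
    for _ in range(minutes_in_throttle):
        if delay >= MAX_DELAY_MS:
            break  # the maximum has been reached; stop doubling
        delay *= 2
    return min(delay, MAX_DELAY_MS)


schedule = [throttle_delay_ms(m) for m in range(8)]
print(schedule)
```

Under these assumptions the delay reaches its 1024 ms maximum after six minutes and stays there.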

Some settings are different on specific models:

 

 

       <NormalState>false</NormalState>
       <ThrottleState>true</ThrottleState>
       <NospaceState>false</NospaceState>
       <StopIOState>false</StopIOState>

Critical Low Space Mode

Note: Truncation only applies to Pre-Galaxy 2.0.

The system enters this mode when available space falls to 250 GB. In this state:

 

       <NormalState>false</NormalState>
       <ThrottleState>true</ThrottleState>
       <NospaceState>true</NospaceState>
       <StopIOState>false</StopIOState>

Stop I/O Mode

Note: Truncation only applies to Pre-Galaxy 2.0.


Stop I/O Mode is triggered when free disk space drops to 10 GB. This is an "everything stops" situation, and ideally the system should never reach it. If it does, there is a bug in one of the components responsible for controlling disk usage which, if allowed to keep running, would drive disk usage to 100%. To prevent that and to keep the system usable, all possible writes are stopped. Reads of raw data are still allowed. For Pre-Galaxy 2.0, reads of deduplicated (truncated) data are not allowed. Essentially, this mode is a diagnostic mode.

 

       <NormalState>false</NormalState>
       <ThrottleState>true</ThrottleState>
       <NospaceState>true</NospaceState>
       <StopIOState>true</StopIOState>

Ping-Pong Avoidance at Mode Boundaries 

Note: Truncation only applies to Pre-Galaxy 2.0.

To avoid ping-ponging between modes at mode boundaries due to fluctuations, Space Management uses a rising threshold and falling threshold scheme when entering and exiting a mode. In particular:

 

On Jaguar, the threshold is calculated in GBs of free space change, as follows: 


Space Management size measurements, and the capacities shown in the GUI, are based on a decimal scheme (for example, 1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes, 1 TB = 1,000,000,000,000 bytes).

 


Pre-Galaxy 2.0 Seer Configuration File

[root@DXi-B1 ~]# cat /data/hurricane/conf/SpaceManagerStateMachine.conf

<?xml version="1.0"?>

<root>

<PersistentStateMachine>

<NormalState>true</NormalState>

<TruncateState>false</TruncateState>

<ThrottleState>false</ThrottleState>

<NospaceState>false</NospaceState>

<StopIOState>false</StopIOState>

<IngestThrottleStartTime>-1</IngestThrottleStartTime>

</PersistentStateMachine>

<SpaceManagementAlgorithm>

<HighWaterMarkGB>500</HighWaterMarkGB>

<LowWaterMarkPercent>70</LowWaterMarkPercent>

<HysteresisPercent>5</HysteresisPercent>

<CriticalLowSpaceGB>250</CriticalLowSpaceGB>

<StopIOGB>10</StopIOGB>

<NormalPollInterval>15</NormalPollInterval>

<ThrottlePollInterval>7</ThrottlePollInterval>

</SpaceManagementAlgorithm>

</root>

 


Galaxy 2.0 Seer Configuration File

On a Galaxy system, the states/modes are persistently stored in /data/hurricane/conf/SpaceManagerStateMachine.conf. There is a SpaceManagerStateMachine.conf file for each of these different platforms: DXi8500, DXi0, CS800, DXi7500UL, DXi7500G, DXi6800, DXi4601, DXi8500-3T, DXi7500, DXi6500, DXi4500-4510, DXi4500-4520, DXi8500E, and DXi2500.

The example below is for a DXi0 platform.

<?xml version="1.0"?>

<root>

<PersistentStateMachine>

<NormalState>true</NormalState>

<ThrottleState>false</ThrottleState>

<NospaceState>false</NospaceState>

<StopIOState>false</StopIOState>

<IngestThrottleStartTime>-1</IngestThrottleStartTime>

</PersistentStateMachine>

<SpaceManagementAlgorithm>

<HighWaterMarkGB>5</HighWaterMarkGB>

<HysteresisPercent>1</HysteresisPercent>

<CriticalLowSpaceGB>2</CriticalLowSpaceGB>

<StopIOGB>1</StopIOGB>

<NormalPollInterval>15</NormalPollInterval>

<ThrottlePollInterval>7</ThrottlePollInterval>

<BPLowWaterMarkPercent>85</BPLowWaterMarkPercent>

<BPHighWaterMarkPercent>95</BPHighWaterMarkPercent>

<MetaDataLowWaterMarkPercent>85</MetaDataLowWaterMarkPercent>

<IngestNonDeDupRateBytesPerHour>0</IngestNonDeDupRateBytesPerHour>

</SpaceManagementAlgorithm>

</root>
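
One way to check these settings without reading the XML by eye is to parse the file. The sketch below uses Python's standard xml.etree.ElementTree module on an inline copy of the DXi0 example. The function name is hypothetical, and the sketch assumes that every element under SpaceManagementAlgorithm holds an integer, as in the example.

```python
import xml.etree.ElementTree as ET

SAMPLE_CONF = """<?xml version="1.0"?>
<root>
<PersistentStateMachine>
<NormalState>true</NormalState>
<ThrottleState>false</ThrottleState>
<NospaceState>false</NospaceState>
<StopIOState>false</StopIOState>
<IngestThrottleStartTime>-1</IngestThrottleStartTime>
</PersistentStateMachine>
<SpaceManagementAlgorithm>
<HighWaterMarkGB>5</HighWaterMarkGB>
<HysteresisPercent>1</HysteresisPercent>
<CriticalLowSpaceGB>2</CriticalLowSpaceGB>
<StopIOGB>1</StopIOGB>
<NormalPollInterval>15</NormalPollInterval>
<ThrottlePollInterval>7</ThrottlePollInterval>
<BPLowWaterMarkPercent>85</BPLowWaterMarkPercent>
<BPHighWaterMarkPercent>95</BPHighWaterMarkPercent>
<MetaDataLowWaterMarkPercent>85</MetaDataLowWaterMarkPercent>
<IngestNonDeDupRateBytesPerHour>0</IngestNonDeDupRateBytesPerHour>
</SpaceManagementAlgorithm>
</root>
"""


def read_space_manager_conf(xml_text):
    """Return (state flags, thresholds) parsed from the configuration text."""
    root = ET.fromstring(xml_text)
    states = {
        el.tag: (el.text or "").strip() == "true"
        for el in root.find("PersistentStateMachine")
        if el.tag.endswith("State")  # skips IngestThrottleStartTime
    }
    # Assumption: all algorithm settings are integers, as in the example.
    thresholds = {
        el.tag: int(el.text) for el in root.find("SpaceManagementAlgorithm")
    }
    return states, thresholds


states, thresholds = read_space_manager_conf(SAMPLE_CONF)
print(states)
print(thresholds["HighWaterMarkGB"], thresholds["CriticalLowSpaceGB"], thresholds["StopIOGB"])
```

On a real system, the same function could be given the contents of /data/hurricane/conf/SpaceManagerStateMachine.conf instead of the inline sample.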

 


Galaxy 2.2 Space Manager States and Actions

 

State: Normal
Internal Setting Names: SM_NORMAL_STATE
Actions: Clear these Space Manager states: Throttle, NOSPACE (another name for Critical Low Space), and StopIO.
Note: In 2.2, when a manual GC run is invoked via the GUI or syscli, only stage 2 (candidates for deletion) and stage 3 (deletions) are run.

State: Throttle
Internal Setting Names: SM_THROTTLE_STATE
Actions: Start bpgc stages 1, 2, 3 and 4.

State: NOSPACE or Critical Low Space
Internal Setting Names: SM_THROTTLE_STATE and SM_NOSPACE_STATE
Actions: Start bpgc stages 1, 2, 3 and 4. Also, replicationd detects SM_NOSPACE_STATE and will pause replication.

State: Stop I/O
Internal Setting Names: SM_THROTTLE_STATE and SM_NOSPACE_STATE and SM_STOPIO_STATE
Actions: Start bpgc stages 1, 2, 3 and 4. Also, replicationd detects SM_NOSPACE_STATE and will pause replication.
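
The internal settings are cumulative: each more severe state keeps the flags of the less severe ones. The sketch below maps the four persisted flags, as they appear in the XML fragments earlier in this article, to a mode name. The function name and the "Unknown" fallback are hypothetical and are not part of the product.

```python
# Flag order: (NormalState, ThrottleState, NospaceState, StopIOState)
MODE_BY_FLAGS = {
    (True, False, False, False): "Normal",
    (False, True, False, False): "Throttle",
    (False, True, True, False): "Critical Low Space",
    (False, True, True, True): "Stop I/O",
}


def mode_from_flags(normal, throttle, nospace, stopio):
    """Return the mode name for a combination of persisted state flags."""
    # Combinations not shown in the article are reported as "Unknown".
    return MODE_BY_FLAGS.get((normal, throttle, nospace, stopio), "Unknown")


print(mode_from_flags(False, True, True, False))
```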

 

 


Galaxy 2.2 Test Matrix

PF -- Physical Free

HW -- High Water (var_high_water_mark)

CR -- Critical Reserve (m_critical_reserve)

SI -- Stop IO mark (m_ceiling_gap)

Assumptions are:

  a. HW > CR > SI

  For a DXi0: HW=5GB, CR=2GB, SI=1GB

Condition                      Expected State

PF > HW                        Normal

PF <= HW; PF > CR; PF > SI     Throttle

PF < HW; PF < CR; PF > SI      NoSpace

PF < HW; PF < CR; PF <= SI     StopIO
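
The matrix above can be expressed as a small function for checking expected results. This sketch is a literal translation of the four rows and is not the spaced implementation. The function name is hypothetical, and it returns None for any input that no row covers (for example, PF equal to CR).

```python
def expected_state(pf, hw, cr, si):
    """Return the expected state for physical free space pf (in GB)."""
    if not hw > cr > si:
        raise ValueError("the matrix assumes HW > CR > SI")
    if pf > hw:
        return "Normal"
    if pf <= hw and pf > cr and pf > si:
        return "Throttle"
    if pf < hw and pf < cr and pf > si:
        return "NoSpace"
    if pf < hw and pf < cr and pf <= si:
        return "StopIO"
    return None  # no row of the matrix matches, e.g. pf == cr


# DXi0 values from the assumptions above: HW=5, CR=2, SI=1
for pf in (6, 5, 3, 1.5, 1):
    print(pf, expected_state(pf, hw=5, cr=2, si=1))
```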

 Obtain a copy of the binary datagen_x64 from here:

   cd /var/tmp
   ftp coredump.quantum.com
   Name: anonymous
   ftp> cd /test/datagen_dir
   ftp> get datagen_x64
   ftp> bye

NOTE: HW, CR, and SI are given in whole GB numbers. The output from 'df -h' is given in rounded up GB.

For example, if HW=5, the amount of available free disk space that is required to produce PF < 5 would be less than or equal to 4.0GB. An available free disk space of 4.1GB up to 5.0GB will produce a PF = 4GB. An available free disk space of 5.1GB will produce a PF = 5GB.

     The output from 'df -h' shows rounded up GB numbers.

     For example:

'df' shows Available=58206784 1K-blocks, which is really 59603746816 bytes.

'df -h' shows this as Avail=56G. 59603746816 bytes is really "55.5 GB".
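
The arithmetic behind this example can be checked with a few lines of Python. The sketch takes the Available value as 58206784 1K-blocks, which is the count that matches 59603746816 bytes, and assumes that 'df -h' reports units of 1024^3 bytes rounded up, as described above.

```python
import math

available_1k_blocks = 58206784                 # 'Available' column from plain df
available_bytes = available_1k_blocks * 1024   # each block is 1024 bytes
available_gib = available_bytes / 2**30        # size in units of 1024^3 bytes
df_h_value = math.ceil(available_gib)          # 'df -h' rounds up

print(available_bytes)
print(round(available_gib, 1))
print(df_h_value)
```

The difference between the displayed 56G and the actual 55.5 is why the fill sizes in the steps below are chosen with some margin.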

A. Use these steps to create PF < [some number of GB]

   1.  For THROTTLE condition:

       For example, if 'df -h' shows 'qfs' to have 56GB Avail, use -w 51g to create PF = 4GB.  Rule of thumb is to subtract '5' from the 'Avail' number to create PF < HW.  For a DXi0, HW=5.

        /usr/cvfs/bin/cvmkfile -w 51g /snfs/tmp/lowspace_fill_cvmkfile

   2.  For NOSPACE condition:

       For example, if 'df -h' shows 'qfs' to have 56GB Avail, use -w 52g to create PF = 2GB.  Rule of thumb is to subtract '4' from the 'Avail' number to create PF < CR.  For a DXi0 CR=2.
       
        /usr/cvfs/bin/cvmkfile -w 52g /snfs/tmp/lowspace_fill_cvmkfile

   3.  For STOPIO condition:

       For example, if 'df -h' shows 'qfs' to have 56GB Avail, use -w 54g to create PF = 1GB.  Rule of thumb is to subtract '1' from the 'Avail' number to create PF <= SI.  For a DXi0 SI=1.
       
        /usr/cvfs/bin/cvmkfile -w 54g /snfs/tmp/lowspace_fill_cvmkfile

1. PF > HW Normal

   a. Verify that the space manager state shows NORMAL in the tsunami.log.

2. PF < HW; PF < CR; PF > SI NoSpace

   a. Use step (A2) above to create the NOSPACE condition.
      This creates PF < HW, PF < CR, and PF > SI.
   b. Verify that the space manager state shows NOSPACE in the tsunami.log.
   c. rm -f /snfs/tmp/lowspace_fill_cvmkfile
   d. Verify that the space manager state shows NORMAL in the tsunami.log.

3. PF < HW; PF < CR; PF <= SI StopIO

   a. Use step (A3) above to create the STOPIO condition.
      This creates PF < HW, PF < CR, and PF <= SI.
   b. Verify that the space manager state shows STOPIO in the tsunami.log.
   c. rm -f /snfs/tmp/lowspace_fill_cvmkfile
   d. Verify that the space manager state shows NORMAL in the tsunami.log.

4. PF <= HW; PF > CR; PF > SI Throttle

   a. Use step (A1) above to create the THROTTLE condition.
      This creates PF <= HW, PF > CR, and PF > SI.
   b. Verify that the space manager state shows THROTTLE in the tsunami.log.
   c. rm -f /snfs/tmp/lowspace_fill_cvmkfile
   d. Verify that the space manager state shows NORMAL in the tsunami.log.

 

 


