Space Management: How It Works, and How to Verify Settings in the Seer Configuration File
This article is based on the information available at http://10.40.164.31/dxwiki/GalaxySoftware/SpaceManagementModes.
This article explains how the space management threshold may vary between different DXi models. It also discusses where you can find the space management settings.
IMPORTANT NOTE: This article is for internal use only.
GalaxySoftware/SpaceManagement Modes
The Space Manager Daemon (spaced) polls the available disk space on the system every 15 seconds. It then issues BSEM (BigSky Event Manager) event notifications when it detects that a pre-defined threshold has been crossed. There are four space management modes that various components (dedupd, qbfsd, bpgc, and replicationd) respond to.
The contents of the /data/hurricane/conf/SpaceManagerStateMachine.conf file are different in various Galaxy releases.
Note: The text "Pre Galaxy 2.0" denotes a difference in behavior between Galaxy 2.x and any version before 2.0.
The system enters this mode when the used space exceeds a preset level of disk capacity, which originally was 70%. The purpose is to truncate enough deduped files to reclaim the space. The system exits Truncation Mode when the used space falls below about 65%.
In addition:
Note: This information is a bit stale. Since PTR 7709, the truncation threshold has been different for various DXi models:
Note: Truncation only applies to Pre-Galaxy 2.0.
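The entry and exit levels described above (enter above 70% used, exit below about 65% used) form a simple hysteresis band. The following is a minimal sketch of that behavior, not the spaced implementation. The function name and parameters are hypothetical, and the 5% gap is taken from the HysteresisPercent value shown in the pre-2.0 configuration example.

```python
def next_truncation_state(in_truncation: bool, used_percent: float,
                          enter_percent: float = 70.0,
                          hysteresis_percent: float = 5.0) -> bool:
    """Return True if the system should be in Truncation Mode (illustrative only)."""
    exit_percent = enter_percent - hysteresis_percent  # 65% with the defaults
    if in_truncation:
        # Already truncating: stay in the mode until usage falls below the exit level.
        return used_percent >= exit_percent
    # Not truncating: enter the mode only when usage exceeds the entry level.
    return used_percent > enter_percent


# Usage: used space rises past 70%, then drifts back down through 65%.
state = False
history = []
for used in (68.0, 71.0, 69.0, 66.0, 64.0):
    state = next_truncation_state(state, used)
    history.append(state)
print(history)
```

Between 65% and 70% the mode depends on the previous state, which is what prevents rapid switching when usage hovers near a single threshold.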
The system enters this mode when used disk space reaches roughly 95% of capacity. The actual threshold depends on the total capacity of the system and is calculated as follows:
500 GB (Base) + 100 GB additional for every 10 TB of capacity
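The formula above can be sketched as a small helper. This is an illustration only: the function name is hypothetical, the capacities used in the loop are arbitrary inputs that do not come from this article, and counting only whole 10 TB increments is an assumption, since the formula does not say how partial increments are handled.

```python
def throttle_threshold_gb(capacity_tb: float, base_gb: int = 500,
                          step_gb: int = 100, step_tb: int = 10) -> int:
    """Threshold in GB: base plus step_gb for each whole step_tb of capacity."""
    if capacity_tb < 0:
        raise ValueError("capacity_tb must not be negative")
    # Assumption: only completed 10 TB increments add to the threshold.
    whole_steps = int(capacity_tb // step_tb)
    return base_gb + step_gb * whole_steps


# Arbitrary capacities chosen for illustration.
for capacity_tb in (10, 20, 40):
    print(capacity_tb, "TB ->", throttle_threshold_gb(capacity_tb), "GB")
```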
Examples of the throttling threshold:
In this state:
The throttle delay starts at 16 ms and doubles every minute until it reaches the maximum of 1024 ms. In this mode, BPGC is automatically triggered, regardless of the current Space Reclamation schedule. Truncation mode, as described above, will be active.
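The delay schedule described above can be worked out as follows. This is a sketch under the stated numbers (16 ms start, doubling each minute, 1024 ms cap); the function name is hypothetical and the real daemon may time the doubling differently.

```python
def throttle_delay_ms(minutes_in_throttle: int, start_ms: int = 16,
                      max_ms: int = 1024) -> int:
    """Delay after a given number of whole minutes spent in Throttle Mode."""
    if minutes_in_throttle < 0:
        raise ValueError("minutes_in_throttle must not be negative")
    delay = start_ms
    for _ in range(minutes_in_throttle):
        if delay >= max_ms:
            break  # the cap has been reached, so stop doubling
        delay *= 2
    return min(delay, max_ms)


# Delay for the first eight minutes in Throttle Mode.
schedule = [throttle_delay_ms(m) for m in range(8)]
print(schedule)
```

With these numbers the cap is reached after six minutes of continuous throttling.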
Some settings are different on specific models:
<NormalState>false</NormalState> <ThrottleState>true</ThrottleState> <NospaceState>false</NospaceState> <StopIOState>false</StopIOState>
Note: Truncation only applies to Pre Galaxy 2.0.
The system enters this mode when available space falls to 250 GB. In this state:
<NormalState>false</NormalState> <ThrottleState>true</ThrottleState> <NospaceState>true</NospaceState> <StopIOState>false</StopIOState>
Note: Truncation only applies to Pre-Galaxy 2.0.
Stop I/O Mode is triggered when free disk space drops to 10 GB. This is an "everything stops" situation. Ideally, the system should never reach this point. If it does, there is a bug in one of the components responsible for controlling disk usage which, if allowed to run further, would drive disk usage to 100%. To prevent that and to keep the system usable, all possible writes are stopped. Reads of raw data are allowed. For Pre Galaxy 2.0, reads of deduped, truncated files are not allowed. Essentially, this mode is a diagnostic mode.
<NormalState>false</NormalState> <ThrottleState>true</ThrottleState> <NospaceState>true</NospaceState> <StopIOState>true</StopIOState>
Note: Truncation only applies to Pre-Galaxy 2.0.
To avoid ping-ponging between modes at mode boundaries due to fluctuations, Space Management uses a rising threshold and falling threshold scheme when entering and exiting a mode. In particular:
On Jaguar, the threshold is calculated in GBs of free space change, as follows:
Space Management size measurements, and the capacities shown in the GUI, are based on a decimal scheme (for example, 1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes, and 1 TB = 1,000,000,000,000 bytes).
[root@DXi-B1 ~]# cat /data/hurricane/conf/SpaceManagerStateMachine.conf
<?xml version="1.0"?>
<root>
<PersistentStateMachine>
<NormalState>true</NormalState>
<TruncateState>false</TruncateState>
<ThrottleState>false</ThrottleState>
<NospaceState>false</NospaceState>
<StopIOState>false</StopIOState>
<IngestThrottleStartTime>-1</IngestThrottleStartTime>
</PersistentStateMachine>
<SpaceManagementAlgorithm>
<HighWaterMarkGB>500</HighWaterMarkGB>
<LowWaterMarkPercent>70</LowWaterMarkPercent>
<HysteresisPercent>5</HysteresisPercent>
<CriticalLowSpaceGB>250</CriticalLowSpaceGB>
<StopIOGB>10</StopIOGB>
<NormalPollInterval>15</NormalPollInterval>
<ThrottlePollInterval>7</ThrottlePollInterval>
</SpaceManagementAlgorithm>
</root>
On a Galaxy system, the states/modes are persistently stored in /data/hurricane/conf/SpaceManagerStateMachine.conf. There is a SpaceManagerStateMachine.conf file for each of these different platforms: DXi8500, DXi0, CS800, DXi7500UL, DXi7500G, DXi6800, DXi4601, DXi8500-3T, DXi7500, DXi6500, DXi4500-4510, DXi4500-4520, DXi8500E, and DXi2500.
The example below is for a DXi0 platform.
<?xml version="1.0"?>
<root>
<PersistentStateMachine>
<NormalState>true</NormalState>
<ThrottleState>false</ThrottleState>
<NospaceState>false</NospaceState>
<StopIOState>false</StopIOState>
<IngestThrottleStartTime>-1</IngestThrottleStartTime>
</PersistentStateMachine>
<SpaceManagementAlgorithm>
<HighWaterMarkGB>5</HighWaterMarkGB>
<HysteresisPercent>1</HysteresisPercent>
<CriticalLowSpaceGB>2</CriticalLowSpaceGB>
<StopIOGB>1</StopIOGB>
<NormalPollInterval>15</NormalPollInterval>
<ThrottlePollInterval>7</ThrottlePollInterval>
<BPLowWaterMarkPercent>85</BPLowWaterMarkPercent>
<BPHighWaterMarkPercent>95</BPHighWaterMarkPercent>
<MetaDataLowWaterMarkPercent>85</MetaDataLowWaterMarkPercent>
<IngestNonDeDupRateBytesPerHour>0</IngestNonDeDupRateBytesPerHour>
</SpaceManagementAlgorithm>
</root>
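To check the settings without reading the XML by eye, the file can be parsed with a short script. The sketch below uses only the Python standard library and an embedded, trimmed sample so that it runs anywhere. The helper names are hypothetical, and the severity ordering (Stop I/O, then NoSpace, then Throttle, then Truncate) is an assumption based on how the state flags accumulate in the mode descriptions above.

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the same layout as the DXi0 example, with the flags set
# as they are described for Critical Low Space (NoSpace) mode.
SAMPLE_CONF = """<?xml version="1.0"?>
<root>
<PersistentStateMachine>
<NormalState>false</NormalState>
<ThrottleState>true</ThrottleState>
<NospaceState>true</NospaceState>
<StopIOState>false</StopIOState>
<IngestThrottleStartTime>-1</IngestThrottleStartTime>
</PersistentStateMachine>
<SpaceManagementAlgorithm>
<HighWaterMarkGB>5</HighWaterMarkGB>
<CriticalLowSpaceGB>2</CriticalLowSpaceGB>
<StopIOGB>1</StopIOGB>
</SpaceManagementAlgorithm>
</root>
"""


def read_space_manager_conf(xml_text):
    """Return (state flags, numeric settings) parsed from the XML text."""
    root = ET.fromstring(xml_text)
    states = {
        elem.tag: (elem.text or "").strip() == "true"
        for elem in root.find("PersistentStateMachine")
        if elem.tag.endswith("State")  # skips IngestThrottleStartTime
    }
    settings = {
        elem.tag: int((elem.text or "0").strip())
        for elem in root.find("SpaceManagementAlgorithm")
    }
    return states, settings


def current_mode(states):
    """Report the most severe mode whose flag is set (assumed ordering)."""
    for tag, name in (("StopIOState", "Stop I/O"), ("NospaceState", "NoSpace"),
                      ("ThrottleState", "Throttle"), ("TruncateState", "Truncate")):
        if states.get(tag):
            return name
    return "Normal"


states, settings = read_space_manager_conf(SAMPLE_CONF)
print(current_mode(states), settings)
```

On a system, the text could instead be read from /data/hurricane/conf/SpaceManagerStateMachine.conf and passed to read_space_manager_conf() in place of SAMPLE_CONF.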
State | Internal Setting Names | Actions
Normal | SM_NORMAL_STATE | Clear these Space Manager states: Throttle, NOSPACE (another name for Critical Low Space), and StopIO. Note: In 2.2, when a manual GC run is invoked via the GUI or syscli, only stage 2 (candidates for deletion) and stage 3 (deletions) are run.
Throttle | SM_THROTTLE_STATE | Start bpgc stages 1, 2, 3 and 4.
NOSPACE or Critical Low Space | SM_THROTTLE_STATE and SM_NOSPACE_STATE | Start bpgc stages 1, 2, 3 and 4. Also, replicationd detects SM_NOSPACE_STATE and will pause replication.
Stop I/O | SM_THROTTLE_STATE and SM_NOSPACE_STATE and SM_STOPIO_STATE | Start bpgc stages 1, 2, 3 and 4. Also, replicationd detects SM_NOSPACE_STATE and will pause replication.
PF -- Physical Free
HW -- High Water (var_high_water_mark)
CR -- Critical Reserve (m_critical_reserve)
SI -- Stop IO mark (m_ceiling_gap)
Assumptions are:
a. HW > CR > SI
For a DXi0: HW=5GB, CR=2GB, SI=1GB
Condition | Expected State
PF > HW | Normal
PF <= HW; PF > CR; PF > SI | Throttle
PF < HW; PF < CR; PF > SI | NoSpace
PF < HW; PF < CR; PF <= SI | StopIO
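The condition table can be expressed as a small function, which is handy for predicting the expected state before running a test. This is a sketch, not the Space Manager code. The function name is hypothetical, and treating PF equal to CR as NoSpace is an assumption: the table does not cover that case, but the NOSPACE test step in this article creates PF = 2GB on a DXi0, where CR=2.

```python
def expected_state(pf: int, hw: int, cr: int, si: int) -> str:
    """Map Physical Free (PF, in whole GB) to the expected state name."""
    if not hw > cr > si:
        raise ValueError("expected HW > CR > SI")
    if pf > hw:
        return "Normal"
    if pf <= si:
        return "StopIO"
    if pf <= cr:
        # Assumption: PF == CR counts as NoSpace (not stated in the table).
        return "NoSpace"
    return "Throttle"


# DXi0 values from this article: HW=5GB, CR=2GB, SI=1GB.
for pf in (6, 5, 4, 2, 1):
    print(pf, expected_state(pf, hw=5, cr=2, si=1))
```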
Obtain a copy of the binary datagen_x64 from here:
cd /var/tmp
ftp coredump.quantum.com
ftp> Name: anonymous
ftp> cd /test/datagen_dir
ftp> get datagen_x64
ftp> bye
NOTE: HW, CR, and SI are given in whole GB numbers. The output from 'df -h' is given in rounded up GB.
For example, if HW=5, the amount of available free disk space that is required to produce PF < HW would be less than or equal to 4.0GB. An available free disk space of 4.1GB up to 5.0GB will produce a PF = 4GB. An available free disk space of 5.1GB will produce a PF = 5GB.
The output from 'df -h' shows rounded up GB numbers.
For example:
'df' shows Available=58206784 1K-blocks, which is really 59603746816 bytes.
'df -h' shows this as Avail=56G. 59603746816 bytes is really "55.5 GB".
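The arithmetic behind this example can be checked with a few lines. The sketch reads the 'df' figure as 58206784 blocks of 1024 bytes, which matches the stated byte count, and relies on 'df -h' using powers of 1024 and rounding up. The variable names are illustrative.

```python
import math

avail_1k_blocks = 58206784              # 'Available' column from plain 'df'
avail_bytes = avail_1k_blocks * 1024    # each block is 1024 bytes

avail_binary_gb = avail_bytes / 2**30   # the unit 'df -h' labels as G
df_h_display = math.ceil(avail_binary_gb)  # 'df -h' rounds up
avail_decimal_gb = avail_bytes / 10**9  # decimal GB, as used by Space Management

print(avail_bytes)
print(round(avail_binary_gb, 1), df_h_display)
print(round(avail_decimal_gb, 1))
```

The rounded-up figure from 'df -h' can therefore overstate the real free space by almost 1 GB, which matters when the thresholds are as small as those on a DXi0.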
B. Use these steps to create PF < [some number of GB]
1. For THROTTLE condition:
For example, if 'df -h' shows 'qfs' to have 56GB Avail, use -w 51g to create PF = 4GB. Rule of thumb is to subtract '5' from the 'Avail' number to create PF < HW. For a DXi0, HW=5.
/usr/cvfs/bin/cvmkfile -w 51g /snfs/tmp/lowspace_fill_cvmkfile
2. For NOSPACE condition:
For example, if 'df -h' shows 'qfs' to have 56GB Avail, use -w 52g to create PF = 2GB. Rule of thumb is to subtract '4' from the 'Avail' number to create PF < CR. For a DXi0 CR=2.
/usr/cvfs/bin/cvmkfile -w 52g /snfs/tmp/lowspace_fill_cvmkfile
3. For STOPIO condition:
For example, if 'df -h' shows 'qfs' to have 56GB Avail, use -w 54g to create PF = 1GB. Rule of thumb is to subtract '2' from the 'Avail' number to create PF <= SI. For a DXi0 SI=1.
/usr/cvfs/bin/cvmkfile -w 54g /snfs/tmp/lowspace_fill_cvmkfile
1. PF > HW Normal
a. Verify that the space manager state shows NORMAL in the tsunami.log.
2. PF < HW; PF < CR; PF > SI NoSpace
a. Use above step in (B2) to create NOSPACE condition.
This now creates PF < HW and PF < CR and PF > SI.
b. Verify that the space manager state shows NOSPACE in the tsunami.log.
c. rm -f /snfs/tmp/lowspace_fill_cvmkfile
d. Verify that the space manager state shows NORMAL in the tsunami.log.
3. PF < HW; PF < CR; PF <= SI StopIO
a. Use above step in (B3) to create STOPIO condition.
This now creates PF < HW and PF < CR and PF <= SI.
b. Verify that the space manager state shows STOPIO in the tsunami.log.
c. rm -f /snfs/tmp/lowspace_fill_cvmkfile
d. Verify that the space manager state shows NORMAL in the tsunami.log.
4. PF <= HW; PF > CR; PF > SI Throttle
a. Use above step in (B1) to create THROTTLE condition.
This now creates PF <= HW and PF > CR and PF > SI.
b. Verify that the space manager state shows THROTTLE in the tsunami.log.
c. rm -f /snfs/tmp/lowspace_fill_cvmkfile
d. Verify that the space manager state shows NORMAL in the tsunami.log.
This page was generated by the BrainKeeper Enterprise Wiki, © 2018