Beginning with StorNext 6, the Quality of Service Bandwidth Management feature provides a way to spread I/O bandwidth among a number of clients in a controlled and configurable manner.
This feature can be used to manage I/O bandwidth usage by Distributed Data Movers (DDM) by configuring the DDM clients as lower priority than other clients. Other clients are monitored using the client qustats, and the results are reported back to the FSM. The FSM uses that information to throttle DDM activity when clients are active, and to allow greater usage when client activity is low. The non-DDM clients are referred to as non-regulated clients, as there is no throttling of I/O on those clients.
This feature can also be used to manage all clients accessing a file system. Clients that are not configured explicitly are allocated bandwidth using default values. The classes are:
- First Come: Used to prioritize clients actively using bandwidth above new clients.
- Fair Share: Used to share I/O bandwidth equally or proportionally between identified clients.
- Low Share: Used in conjunction with the main classes, First Come and Fair Share.
To configure QBM, you must specify the maximum I/O bandwidth for each stripe group. Tools are supplied to assist with measuring the bandwidth.
QBM is best thought of as allowing for three general strategies:
- Mover Strategy: This strategy throttles some clients (DDMs), and allows other clients to perform I/O without any interference.
- First Come Strategy: This strategy allows active clients to use all of their allotted bandwidth, possibly at the expense of new clients.
- Fair Share Strategy: This strategy tries to share bandwidth among all clients.
The Low Share class is used in conjunction with the First Come Strategy and Fair Share Strategy to prioritize some clients over other non-configured clients.
Quantum recommends that you choose one strategy, although the First Come and Fair Share classes can be mixed in a configuration. Choosing a single strategy at the beginning does not preclude using a more complicated strategy later.
Consider a case in which you have 1000 MB/s available and four clients that each want to use 300 MB/s. Under the First Come Strategy, the first three clients to request and use 300 MB/s retain their bandwidth, while the fourth client has to wait for more bandwidth to become available. The Fair Share Strategy splits the 1000 MB/s among all four clients, so that each of them has 250 MB/s available.
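The arithmetic in this example can be reproduced with a short sketch. The function names `first_come` and `fair_share` are illustrative only and are not part of QBM. The sketch ignores reserve pools and the absolute minimum that every client receives, so the waiting client is shown with 0.

```python
def first_come(capacity, requests):
    """Grant each request in arrival order, all or nothing."""
    remaining = capacity
    grants = []
    for want in requests:
        if want <= remaining:
            grants.append(want)
            remaining -= want
        else:
            grants.append(0)  # must wait until bandwidth is freed
    return grants


def fair_share(capacity, requests):
    """Split capacity in proportion to the requests when oversubscribed."""
    total = sum(requests)
    if total <= capacity:
        return list(requests)
    return [capacity * want / total for want in requests]


requests = [300, 300, 300, 300]  # MB/s, in arrival order
print(first_come(1000, requests))  # [300, 300, 300, 0]
print(fair_share(1000, requests))  # [250.0, 250.0, 250.0, 250.0]
```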
The configuration can be modified by updating the configuration file and then signaling the FSM to read the configuration file.
There are four allocation classes. The class that a client is assigned to determines how bandwidth is allocated to that client. The higher priority classes can steal bandwidth from lower priority classes.
- First Come (FC): This is the highest priority class. Clients configured for this class of service will either get all the bandwidth they are assigned or be denied the allocation if that bandwidth is not available.
- Fair Share (FS): This is the second highest priority class. Clients configured for this class of service get their configured bandwidth, or share the total bandwidth allocated to all clients in this class in proportion to their configured amounts.
- Low Share (DS): This is the third highest priority class. Clients assigned to this class of service get their configured bandwidth or share the total bandwidth allocated to all clients in this class, in proportion to their configured amounts.
- Mover Class: This is the lowest priority class. Whatever bandwidth is left is proportionally distributed to the clients in this class.
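The priority ordering of the four classes can be modeled as a small enumeration. This is a sketch only: the numeric values and the names `AllocClass`, `SHARING_CLASSES`, and `can_steal_from` are assumptions made for illustration and do not reflect how QBM represents classes internally.

```python
from enum import IntEnum


class AllocClass(IntEnum):
    """Allocation classes; a smaller value means a higher priority."""
    FIRST_COME = 0
    FAIR_SHARE = 1
    LOW_SHARE = 2
    MOVER = 3


# Classes whose clients split the class total proportionally.
SHARING_CLASSES = {AllocClass.FAIR_SHARE, AllocClass.LOW_SHARE, AllocClass.MOVER}


def can_steal_from(requester, victim):
    """True only when the victim's class is strictly lower priority."""
    return requester < victim


print(can_steal_from(AllocClass.FIRST_COME, AllocClass.MOVER))       # True
print(can_steal_from(AllocClass.FAIR_SHARE, AllocClass.FAIR_SHARE))  # False
```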
The QBM Allocator applies the following rules when allocating bandwidth to a client. Bandwidth is allocated on a stripe group basis.
- Every QoS-enabled stripe group must be configured with its bandwidth capacity. This value is used to set the global free bandwidth pool for the stripe group.
- All clients are configured with a class, and a minimum and maximum amount of bandwidth. Some clients are explicitly configured, and some use the default values. The minimum value is considered the desired value. This is the value the QBM allocator attempts to give each client.
- The administrator can configure an optional reserve pool per class.
- Each client must register with the QBM Allocator. If the QBM configuration has information about this client, this information determines the bandwidth assigned to the client. If the client does not have any configuration, QBM uses the default values.
The rules for processing a client’s bandwidth allocation request are given below. Each rule is applied until the client’s bandwidth request is satisfied, or all the rules have been applied.
- Locate the registration information for this client. Determine the client’s configured minimum and maximum bandwidth, and the client’s configured class.
- If a reserve pool is available for this class and has available bandwidth, allocate as much of the bandwidth from this reserve pool as possible.
- Try to allocate the minimum value from the stripe group’s global free bandwidth pool.
- If there are clients with more than their minimum bandwidth allocation, take that extra bandwidth away from those clients. Work from the lowest priority clients to the highest priority clients.
- Steal bandwidth from lower priority clients. Take only what is needed to satisfy the request. If a lower priority client does not have enough to satisfy the remaining needed amount, take all the bandwidth except what is considered the absolute minimum amount (currently set at 32K) from that client. Work from the lowest priority up to the highest priority that is below the requesting client’s priority. That is, a client cannot steal from clients of its own priority.
- The FS, DS, and Mover classes are sharing classes. If the algorithm has taken bandwidth from any of these classes, redistribute the total bandwidth in those classes. Redistribute the bandwidth in proportion to each client’s minimum bandwidth requirement or, if the client requested less than its minimum, in proportion to that lesser requested amount. Send the new bandwidth allocations to all affected clients.
- All clients always get at least the absolute minimum bandwidth amount, even if the total bandwidth is oversubscribed. This prevents clients from freezing up when doing I/O if they have 0 bandwidth allocated.
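The order in which these rules draw on the pools and on other clients can be sketched as follows. This is a simplified model under stated assumptions, not the FSM's implementation: `Client`, `allocate`, and the `pools` dictionary are hypothetical names, "32K" is assumed to mean bytes per second, and the sketch omits proportional redistribution within sharing classes, the all-or-nothing behavior of the First Come class, and any tracking of which pool a grant came from.

```python
from dataclasses import dataclass

ABSOLUTE_MIN = 32 * 1024  # the text says "32K"; bytes per second is assumed
MB = 1_000_000


@dataclass
class Client:
    name: str
    priority: int       # 0 = First Come (highest) ... 3 = Mover (lowest)
    minimum: int        # configured minimum (desired) bandwidth
    allocated: int = 0


def allocate(client, pools, others):
    """Try to give `client` its minimum; return the amount granted.

    pools:  dict with "reserve" (class reserve) and "free" (stripe group pool)
    others: clients already holding allocations on the same stripe group
    """
    needed = client.minimum

    # Class reserve pool first, then the stripe group's global free pool.
    for pool in ("reserve", "free"):
        take = min(needed, pools[pool])
        pools[pool] -= take
        needed -= take

    lowest_first = sorted(others, key=lambda c: c.priority, reverse=True)

    # Reclaim bandwidth that clients hold above their minimum.
    for other in lowest_first:
        take = min(needed, max(other.allocated - other.minimum, 0))
        other.allocated -= take
        needed -= take

    # Steal from strictly lower priority clients, leaving each of them
    # at least the absolute minimum.
    for other in lowest_first:
        if other.priority > client.priority:
            take = min(needed, max(other.allocated - ABSOLUTE_MIN, 0))
            other.allocated -= take
            needed -= take

    # Every client gets at least the absolute minimum.
    client.allocated = max(client.minimum - needed, ABSOLUTE_MIN)
    return client.allocated


ddm = Client("ddm", priority=3, minimum=200 * MB, allocated=500 * MB)
editor = Client("editor", priority=1, minimum=400 * MB)
pools = {"reserve": 0, "free": 100 * MB}
print(allocate(editor, pools, [ddm]) // MB, ddm.allocated // MB)  # 400 200
```

In the usage example, the free pool supplies 100 MB/s and the remaining 300 MB/s comes from the bandwidth the lower priority client held above its minimum, so no stealing is needed.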
When a client's bandwidth has been freed, the following rules are used to return the bandwidth. Each rule is applied until all freed bandwidth is returned.
- If this bandwidth was taken from the class reserve, give it back to the reserve.
- If there is any oversubscription of bandwidth, use the freed bandwidth to eliminate the oversubscription.
- If there are any clients that have an allocation less than their requested minimum, give the bandwidth to the highest priority client that currently has bandwidth below its minimum, while trying to satisfy the oldest request first. All clients must be at their minimum bandwidth before the next rule is applied.
- Give the bandwidth to the highest priority client requesting more bandwidth (above the minimum).
- Put the bandwidth into the free pool.
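The order in which freed bandwidth is handed back can be sketched as a sequence of steps, each consuming as much of the freed amount as it can. The `release` function, the `state` layout, and the field names are hypothetical. The sketch also assumes that a client asking for more than its minimum can be raised as far as its configured maximum, which the rules above do not state explicitly.

```python
def release(freed, from_reserve, state):
    """Return `freed` units of bandwidth; `from_reserve` of them came from
    the class reserve. Mutates and returns `state`."""
    # Bandwidth taken from the class reserve goes back to the reserve.
    back = min(freed, from_reserve)
    state["reserve"] += back
    freed -= back

    # Cancel any oversubscription.
    paid = min(freed, state["oversubscribed"])
    state["oversubscribed"] -= paid
    freed -= paid

    # Highest priority first (smaller number), oldest request first.
    ranked = sorted(state["clients"],
                    key=lambda c: (c["priority"], c["requested_at"]))

    # Bring clients that are below their minimum up to it.
    for c in ranked:
        give = min(freed, max(c["minimum"] - c["allocated"], 0))
        c["allocated"] += give
        freed -= give

    # Then serve clients that asked for more than their minimum.
    for c in ranked:
        if c["wants_more"]:
            give = min(freed, max(c["maximum"] - c["allocated"], 0))
            c["allocated"] += give
            freed -= give

    # Whatever is left goes into the free pool.
    state["free"] += freed
    return state


state = {
    "reserve": 0, "free": 0, "oversubscribed": 20,
    "clients": [
        {"name": "a", "priority": 1, "minimum": 200, "maximum": 400,
         "allocated": 150, "wants_more": False, "requested_at": 1},
        {"name": "b", "priority": 3, "minimum": 100, "maximum": 300,
         "allocated": 100, "wants_more": True, "requested_at": 2},
    ],
}
release(400, 50, state)
print(state["reserve"], state["oversubscribed"], state["free"])  # 50 0 80
print([c["allocated"] for c in state["clients"]])                # [200, 300]
```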
A client requesting less bandwidth is always granted the request. The bandwidth it gives up is returned using the rules for freeing bandwidth on a client, described above.
A client that requests more bandwidth can only get more than its minimum bandwidth if there is bandwidth available in the free pool or reserve pool, or if lower priority clients have extra bandwidth. The following rules apply.
- Get the bandwidth from the free pool if there is some bandwidth available (above a minimum allocation amount, currently 256K). Give that amount to the client.
- Take bandwidth from lower priority clients that have extra bandwidth (above their configured minimum requested amount).
- If there is no bandwidth available, mark the client’s current allocation as wanting more bandwidth and return the client’s request with VOP_EAGAIN.
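A minimal sketch of these rules is shown below. The names `request_more`, `MIN_GRANT`, and the `"OK"` status are hypothetical, and `VOP_EAGAIN` is a string stand-in for the real status code. The sketch reads "above a minimum allocation amount, currently 256K" as meaning that the free pool is used only when it holds at least 256K, and assumes the unit is bytes per second. Both readings are assumptions.

```python
MIN_GRANT = 256 * 1024     # "256K" in the text; bytes per second assumed
VOP_EAGAIN = "VOP_EAGAIN"  # stand-in for the real status code
MB = 1_000_000


def request_more(client, amount, pools, others):
    """Try to add `amount` to the client's allocation.

    Returns (status, granted). Clients are plain dicts in this sketch.
    """
    needed = amount

    # Use the free pool if it holds at least the minimum grant size.
    if pools["free"] >= MIN_GRANT:
        take = min(needed, pools["free"])
        pools["free"] -= take
        needed -= take

    # Take extra bandwidth (above minimum) from lower priority clients,
    # lowest priority first. A larger number means a lower priority.
    for other in sorted(others, key=lambda c: c["priority"], reverse=True):
        if other["priority"] > client["priority"]:
            take = min(needed, max(other["allocated"] - other["minimum"], 0))
            other["allocated"] -= take
            needed -= take

    granted = amount - needed
    if granted == 0:
        client["wants_more"] = True  # remembered for when bandwidth frees up
        return VOP_EAGAIN, 0
    client["allocated"] += granted
    return "OK", granted


client = {"priority": 1, "minimum": 100 * MB, "allocated": 100 * MB,
          "wants_more": False}
mover = {"priority": 3, "minimum": 100 * MB, "allocated": 150 * MB}
print(request_more(client, 30 * MB, {"free": 0}, [mover]))  # ('OK', 30000000)
```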
Assume that a client has previously asked for less bandwidth than its configured minimum bandwidth, and has now determined that it needs as much as it can get, up to the configured minimum bandwidth. Sometimes, a client may not ask for the configured minimum bandwidth because of class bandwidth sharing. The client may have received an asynchronous message from the QBM Allocator that its bandwidth was reduced (possibly because of sharing).
The last amount the QBM Allocator told the client it could use is the amount the client should ask for when it does the restore. The amount needed is the difference between the current allocation and the requested restore amount. The allocation rules for obtaining the needed amount are the same as those used when allocating bandwidth to a client.
Gating I/O on the client
When gating (throttling) of I/O to a client becomes necessary, the leaky bucket algorithm is used.
- The client’s allocated bandwidth is divided into time periods.
- A token represents a single byte of I/O per time period. Each I/O byte sent to or received from the output device takes one token out of the bucket.
- All I/Os are all or nothing: QBM never performs partial I/Os. If the bucket is empty, no more I/O can be sent to the stripe group (SG) associated with the empty bucket. In this case, the I/O thread is blocked and put on a sleep queue.
- When the bucket is replenished, the sleep queue is checked. If there is a sleeping thread for which there now are enough tokens to satisfy the associated I/O request, the thread is awakened and allowed to perform its I/O operation.
- All I/Os are done in the order they are received. After a thread blocks on a bucket, no other threads can perform any I/O associated with that bucket until the first blocked thread has enough tokens to satisfy its I/O request.
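The gating behavior described in this list can be illustrated with a small single-threaded simulation in which a queue of I/O sizes stands in for sleeping threads. The `Bucket` class and its method names are hypothetical. The sketch assumes that unused tokens carry over to the next period without a cap, which lets a request larger than one period's tokens eventually run. The text does not say how QBM handles that case.

```python
from collections import deque


class Bucket:
    """Token bucket for one stripe group; one token pays for one byte."""

    def __init__(self, tokens_per_period):
        self.tokens_per_period = tokens_per_period
        self.tokens = tokens_per_period
        self.sleepers = deque()  # sizes of blocked I/Os, oldest first

    def submit(self, nbytes):
        """Return True if the I/O runs now, False if it is put to sleep."""
        # Strict ordering: once anything is asleep, new I/O queues behind it.
        if not self.sleepers and nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        self.sleepers.append(nbytes)
        return False

    def replenish(self):
        """Add one period's tokens and wake sleepers in arrival order."""
        self.tokens += self.tokens_per_period
        woken = []
        # All or nothing: stop at the first sleeper that cannot be paid for.
        while self.sleepers and self.sleepers[0] <= self.tokens:
            nbytes = self.sleepers.popleft()
            self.tokens -= nbytes
            woken.append(nbytes)
        return woken


bucket = Bucket(tokens_per_period=100)
print(bucket.submit(60))   # True  (40 tokens left)
print(bucket.submit(60))   # False (not enough tokens, sleeps)
print(bucket.submit(10))   # False (would fit, but must wait its turn)
print(bucket.replenish())  # [60, 10]
```

The third call shows the ordering rule: a 10-byte I/O would fit in the 40 remaining tokens, but it still waits behind the blocked 60-byte I/O.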
QBM running on a client keeps statistics on usage over two time periods, called the Fast and Slow periods. The Fast period is 1 second, and the Slow period is 30 seconds. These time periods were selected because they seemed to provide adequate intervals for testing the client’s I/O rate.
The Fast period is used to determine whether a client needs to restore the bandwidth it was originally given after it has asked for less bandwidth. The client tests whether it has had to sleep during at least 50% of the time periods in the Fast period. If so, and if it has asked for less than what it was allocated, it sends a Restore_BW request to the FSM. The bandwidth is then immediately restored, without waiting for a response from the FSM. Using this algorithm, it can take up to 1.44 seconds for the client to determine that it needs to restore the original bandwidth (BW).
The Slow period is used to determine whether QBM on the client needs to increase or decrease its current bandwidth allocation. If the client has used at least 80% of its currently allocated bandwidth during the entire Slow period, it asks the FSM for a 30% increase in the rate. The FSM may grant the entire requested increase, a partial increase, or no increase at all. The client must wait for the response before the bandwidth allotment is changed. If the FSM does not grant any increase, the request causes the FSM to set a flag to wait for available bandwidth at a later time. If the client has used less than 60% of its allocated bandwidth during the Slow period, it sends a DEC_BW request to the FSM asking for a 10% decrease in bandwidth. The FSM grants this request, and the client immediately decreases its bandwidth rate by 10%.
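The two heuristics reduce to simple threshold checks, sketched below. The function names and return values are hypothetical, and `"increase"` is only a placeholder because the text does not name the message used to request more bandwidth. The sketch treats usage as an average rate over the Slow period, which is an assumption. Integer arithmetic is used so that the 50%, 60%, and 80% boundaries are exact.

```python
def fast_check(slept_periods, total_periods, requested, allocated):
    """True if the client should send Restore_BW after the Fast period."""
    slept_half = slept_periods * 2 >= total_periods  # slept at least 50%
    return slept_half and requested < allocated


def slow_check(used, allocated):
    """Return (action, amount) after the Slow period.

    used:      average rate achieved over the Slow period
    allocated: the client's current bandwidth allocation
    """
    if used * 100 >= allocated * 80:
        return "increase", allocated * 30 // 100  # ask for 30% more
    if used * 100 < allocated * 60:
        return "DEC_BW", allocated * 10 // 100    # give back 10%
    return "none", 0


print(fast_check(6, 10, requested=50, allocated=100))  # True
print(slow_check(used=900, allocated=1000))            # ('increase', 300)
print(slow_check(used=500, allocated=1000))            # ('DEC_BW', 100)
print(slow_check(used=700, allocated=1000))            # ('none', 0)
```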