NAS-Related Performance Issues
CT: Imagine a situation in which the customer is complaining about slow backup speeds between the media server and the DXi. In addition, the customer says that their optimized duplication jobs are failing.
Unfortunately, to resolve NAS-related performance issues we have to take on the role of the customer’s network administrator. Some customers have great network admins and others are not as fortunate. It is the latter who present the most challenge for our service teams when it comes to getting to the bottom of “poor performance” in their backup solution.
Some professionals make a living doing nothing but troubleshooting networks, and troubleshooting NAS-related performance issues from a Quantum service perspective is one of the most challenging things we encounter. The purpose of this performance troubleshooting page is to help us quickly identify bottlenecks without becoming experts in all things networking.
CT: Do you want to include something like this for team members to use or will there be too much variance to provide any sort of template? You can use the template (MAKE THIS A DOWNLOADABLE DOC) to help you organize the output from your analysis.
Remember, you can use the Export as PDF option in the left-hand pane if you want to save a copy to your desktop for use at the customer site.
Network administrators use the OSI model to help them understand network issues and identify bottlenecks. From a Quantum perspective we’re really only interested in layers 1 through 3 (physical, data link, and network). If you want to fully understand the layers in the OSI model, read more about it on the internet by reviewing the following information:
First the physical layer should be verified. This can be done using the following steps:
For layers 2 and 3, look at aggregation. Testing layer 1 already involved typical layer 2 and layer 3 (MAC and IP) communication, so the only place we care about these layers in the customer’s network is aggregation.
Cisco refers to aggregation as EtherChannel technology. Reading about EtherChannel from Cisco will provide an understanding of most aggregation solutions from other vendors as well. For the purposes of this document and solutions with a DXi system, keep these things in mind:
CT: Should these be listed as steps? In other words, is there an order here?
There will always be customers who think 80 MB/s is not adequate on a single 1 GbE connection, but 85 MB/s is about our average when it comes to Windows. Even when going direct, a lot of things play into this, and some of them need to be changed in the registry to be optimized. In my experience I’ve always been able to find a system somewhere on the network that shows higher throughput to the DXi, proving that the issue is local to the backup server.
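To put the 80 and 85 MB/s figures in context for a customer, it can help to show them as a share of the theoretical line rate. The following is a minimal sketch, not a Quantum tool: the function names are made up for illustration, it uses decimal units (1 Gbit/s = 125 MB/s), and it ignores Ethernet, IP, TCP and CIFS/NFS protocol overhead, so the achievable maximum is always somewhat below the figure it reports.

```python
def line_rate_mb_s(link_gbit: float) -> float:
    """Theoretical line rate in MB/s (decimal units) for a link speed in Gbit/s.

    Hypothetical helper for illustration; protocol overhead is NOT subtracted.
    """
    return link_gbit * 1000 / 8


def utilization(measured_mb_s: float, link_gbit: float) -> float:
    """Fraction of the theoretical line rate that a measured throughput represents."""
    return measured_mb_s / line_rate_mb_s(link_gbit)


# The two figures mentioned above, on a single 1 GbE link.
for rate in (80, 85):
    print(f"{rate} MB/s on 1 GbE = {utilization(rate, 1.0):.0%} of line rate")
```

On a single 1 GbE link this puts 80 MB/s at 64% and 85 MB/s at 68% of the raw 125 MB/s line rate, before any protocol overhead is taken into account.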
I have yet to see DART Ethernet statistics be inaccurate. With one customer I had to capture packets on the DXi using tcpdump, then open the capture file in Wireshark. The Wireshark numbers were identical to the DART numbers.
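If Wireshark is not handy, a rough average throughput can also be pulled straight out of a capture file. The sketch below is an assumption-laden illustration rather than an established procedure: it assumes the capture was written in the classic pcap format (not pcapng), the function name is hypothetical, and it counts whole frames on the wire, headers included, so the result will read slightly higher than payload throughput. Comparing the result to the DART numbers is still a manual step.

```python
import struct
from typing import BinaryIO

# Classic pcap magic numbers -> (struct byte order, timestamp fraction divisor)
_PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", 1e6),  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": (">", 1e6),  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": ("<", 1e9),  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": (">", 1e9),  # big-endian, nanosecond timestamps
}


def pcap_throughput_mb_s(stream: BinaryIO) -> float:
    """Average MB/s (decimal) across a classic pcap capture.

    Hypothetical helper: sums the original on-wire length of every frame and
    divides by the time between the first and last packet timestamps.
    """
    header = stream.read(24)  # the pcap global header is 24 bytes
    if len(header) < 24 or header[:4] not in _PCAP_MAGICS:
        raise ValueError("not a classic pcap file")
    endian, frac = _PCAP_MAGICS[header[:4]]

    total_bytes = 0
    first = last = None
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            break
        ts_sec, ts_frac, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the captured packet data
        timestamp = ts_sec + ts_frac / frac
        if first is None:
            first = timestamp
        last = timestamp
        total_bytes += orig_len

    if first is None or last <= first:
        raise ValueError("need at least two packets with increasing timestamps")
    return total_bytes / (last - first) / 1e6
```

To use it, open the capture in binary mode, for example `with open(path, "rb") as f: print(pcap_throughput_mb_s(f))`, where `path` points at the capture file.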
It is easy to get caught up in a NAS-related performance issue for weeks or months with a customer who usually thinks there is no problem with their network. Our focus in service should always be helping the customer identify the problem, then handing it off to them if we don’t have the resources to resolve it.
Using the processes above should make it easier to identify where the problem is. Showing adequate throughput between the backup server(s) and the DXi would prove that the bottleneck is most likely due to the network between the backup clients or to the configuration of the backup software.
If the customer does not agree with the findings above after adequate performance between the DXi and the backup server has been proven, communicate this to management and an exit strategy will be discussed.
Using the troubleshooting methods in this guide will hopefully help us all save time when faced with challenging NAS performance issues. This guide lives on qwikipedia so that all service members can contribute to it. There is still a lot to be added, such as Linux-specific testing, CIFS vs. NFS, and more educational links. For now, please send any ideas, comments or suggestions to me at ryan.davies@quantum.com
This page was generated by the BrainKeeper Enterprise Wiki, © 2018 |