Network-Performance: Outbound Network Performance Affected by Socket Buffer, TCP Delay or TCP Window Size

 

SR Information: 1374230

 

Product / Software Version: 2.0.1 (article applies to all other DXi versions)

 

Problem Description: Outbound network performance (restore) is slow

  

OVERVIEW

This article shows how to troubleshoot slow outbound network performance during a restore by using netserver and netperf, checking whether a TCP delay is causing the slowness, changing the socket buffer size and access method on a Windows machine connected to a DXi, and changing the TCP window size.

 




STEP ONE - Troubleshoot to Confirm if Performance Slowness Is Caused by the Socket Buffer

This procedure shows how to test data transfer performance from the DXi to a server.

In the case behind this article, the customer reported that their Windows (2K8) media server was experiencing very slow performance during restores. The procedure below helps you identify performance issues that could be caused by a misconfigured socket buffer size.

If throughput improves when you raise the socket buffer size, the server connected to the DXi must be configured to work with a larger socket buffer size.

Note: Although this article is based on a Windows server, you can also run netperf to troubleshoot performance when the DXi is connected to a non-Windows server. On the DXi, the netperf command is located at:
 

/usr/bin/netperf


The netserver for Windows can be found on the Internet, but it is also attached to this article (you may want to check if there is an earlier version).

 

Running Netserver on the Windows Machine

Please ask the customer to do the following:

  1. Download the attached Netserver zip file to the media server and extract the files to a temporary directory.
  2. Open a command prompt (DOS prompt) window on the media server, and go to the directory/folder where the files were extracted.
  3. Execute the file netserver.exe.
    Note: After the command runs, it does not return to the DOS prompt. The customer should NOT close the window.
  4. Provide you with the IP address of the machine (the IP of the interface that is connected to the DXi).

Running Netperf on the DXi

After the customer has netserver.exe running on the media server, please execute the following steps:

    1. Open a PuTTY session to the DXi.

    2. Run the following commands (please save the output):
 

netperf -H <ip-of-the-destination-host> -f M -l 120 -d -- -s 8192
netperf -H <ip-of-the-destination-host> -f M -l 120 -d -- -s 16384
netperf -H <ip-of-the-destination-host> -f M -l 120 -d -- -s 32768
netperf -H <ip-of-the-destination-host> -f M -l 120 -d -- -s 65536
netperf -H <ip-of-the-destination-host> -f M -l 120 -d -- -s 131072

 
In these commands:

-H  Specifies the destination host (where the data is sent to). The destination host must be running the server portion of netperf (netserver).
-f   Specifies the output format. In this case, "M" means that throughput is displayed in MB/s.
-l   Specifies how long the test runs, in seconds. In the commands above, each test runs for 120 seconds.
-d  Increases the amount of debugging output (optional, but the more verbose output may be helpful for documentation).
--   Separates the global netperf options from the test-specific options. In the commands above, the test-specific option is '-s'.
-s  Specifies the socket buffer size, in bytes, to be used during the data transmission test. The values used in the commands above are:

    8192    (8K)
   16384   (16K)
   32768   (32K)
   65536   (64K)
  131072  (128K)
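
The series of commands above differs only in the '-s' value, so it can be generated rather than typed. The following is a minimal Python sketch that builds one netperf argument list per socket buffer size. The function name build_netperf_commands and the constant BUFFER_SIZES are hypothetical names used only for illustration, and the sketch assumes Python is available wherever the commands are prepared.

```python
from typing import List, Sequence

# Socket buffer sizes (bytes) used by the commands in this article.
BUFFER_SIZES = (8192, 16384, 32768, 65536, 131072)


def build_netperf_commands(host: str, duration: int = 120, debug: bool = True,
                           sizes: Sequence[int] = BUFFER_SIZES) -> List[List[str]]:
    """Return one netperf argument list per socket buffer size (hypothetical helper)."""
    commands = []
    for size in sizes:
        cmd = ["netperf", "-H", host, "-f", "M", "-l", str(duration)]
        if debug:
            cmd.append("-d")
        # Everything after "--" is a test-specific option.
        cmd += ["--", "-s", str(size)]
        commands.append(cmd)
    return commands


# Print the command lines so they can be reviewed before they are run.
for cmd in build_netperf_commands("<ip-of-the-destination-host>"):
    print(" ".join(cmd))
```

Each argument list could be passed to subprocess.run once the placeholder is replaced by the real IP address of the media server.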

Output Examples of Running netperf on a DXi:

This is an example of running netperf without '-d' (debug):

[root@asps-l-6500 bin]# netperf -H 10.20.234.51 -f M -l 30 -- -s 131072
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.20.234.51 (10.20.234.51) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

4194304 262144 262144    30.10      54.28
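
When several runs have to be compared, the throughput can be pulled out of the saved output instead of being read by eye. The sketch below assumes the result line has the five numeric columns shown in the example above (receive socket size, send socket size, message size, elapsed time, throughput). The function name parse_throughput is hypothetical.

```python
def parse_throughput(output: str) -> float:
    """Return the throughput column from the netperf result line (hypothetical helper)."""
    # Walk backwards so the result line is found even after debug output.
    for line in reversed(output.splitlines()):
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # header line or other text, not the result line
        return values[-1]
    raise ValueError("no netperf result line found")


sample = """\
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

4194304 262144 262144    30.10      54.28
"""
print(parse_throughput(sample))  # prints 54.28
```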


This is an example of running netperf with '-d' (debug):

[root@asps-l-6500 bin]# netperf -H 10.20.234.51 -f M -l 30 -d -- -s 131072
Program name: netperf
Local send alignment: 8
Local recv alignment: 8
Remote send alignment: 8
Remote recv alignment: 8
Report local CPU 0
Report remote CPU 0
Verbosity: 1
Debug: 1
Port: 12865
Test name: TCP_STREAM
Test bytes: 0 Test time: 30 Test trans: 0
Host name: 10.20.234.51

installing catcher for all signals
Could not install signal catcher for sig 9, errno 22
Could not install signal catcher for sig 19, errno 22
Could not install signal catcher for sig 32, errno 22
Could not install signal catcher for sig 33, errno 22
Could not install signal catcher for sig 65, errno 22
remotehost is 10.20.234.51 and port 12865
establish_control called with host '10.20.234.51' port '12865' remfam AF_UNSPEC
                local '0.0.0.0' port '0' locfam AF_UNSPEC
getaddrinfo returned the following for host '10.20.234.51' port '12865' family AF_UNSPEC
        cannonical name: '10.20.234.51'
        flags: 2 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
        sa_family: AF_INET sadata: 50 65 10 20 234 51
getaddrinfo returned the following for host '0.0.0.0' port '0' family AF_UNSPEC
        cannonical name: '0.0.0.0'
        flags: 3 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
        sa_family: AF_INET sadata: 0 0 0 0 0 0
bound control socket to 0.0.0.0 and 0
successful connection to remote netserver at 10.20.234.51 and 12865
complete_addrinfo using hostname 10.20.234.51 port 0 family AF_UNSPEC type SOCK_STREAM prot IPPROTO_TCP flags 0x0
getaddrinfo returned the following for host '10.20.234.51' port '0' family AF_UNSPEC
        cannonical name: '10.20.234.51'
        flags: 2 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
        sa_family: AF_INET sadata: 0 0 10 20 234 51
local_data_address not set, using local_host_name of '0.0.0.0'
complete_addrinfo using hostname 0.0.0.0 port 0 family AF_UNSPEC type SOCK_STREAM prot IPPROTO_TCP flags 0x1
getaddrinfo returned the following for host '0.0.0.0' port '0' family AF_UNSPEC
        cannonical name: '0.0.0.0'
        flags: 3 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
        sa_family: AF_INET sadata: 0 0 0 0 0 0
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.20.234.51 (10.20.234.51) port 0 AF_INET
create_data_socket: socket 4 obtained...
netperf: set_sock_buffer: send socket size determined to be 262144
netperf: set_sock_buffer: receive socket size determined to be 262144
send_tcp_stream: send_socket obtained...
recv_response: received a 256 byte response
remote listen done.
About to start a timer for 30 seconds.
recv_response: received a 256 byte response
remote results obtained
calculate_confidence: itr  1; time 30.024141; res  53.198323
                               lcpu -1.000000; rcpu -1.000000
                               lsdm -1.000000; rsdm -1.000000
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

4194304 262144 262144    30.02      53.20
shutdown_control: shutdown of control connection requested.
[root@asps-l-6500 bin]#
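
In the debug output above, netperf reports a send socket size of 262144 although the command requested '-s 131072'. On Linux this is expected, because the kernel doubles the buffer size requested through setsockopt to leave room for bookkeeping overhead. The following Python sketch extracts the effective sizes from saved debug output so they can be compared with the requested value. The function name effective_socket_sizes is hypothetical, and the pattern assumes the "set_sock_buffer" line format shown above.

```python
import re

# Matches the "set_sock_buffer" lines shown in the debug output above.
_SOCK_SIZE = re.compile(
    r"set_sock_buffer: (send|receive) socket size determined to be (\d+)"
)


def effective_socket_sizes(debug_output: str) -> dict:
    """Map 'send' and 'receive' to the socket sizes netperf reported (hypothetical helper)."""
    return {kind: int(size) for kind, size in _SOCK_SIZE.findall(debug_output)}


debug_lines = """\
netperf: set_sock_buffer: send socket size determined to be 262144
netperf: set_sock_buffer: receive socket size determined to be 262144
"""
requested = 131072  # value passed with -s in the example above
sizes = effective_socket_sizes(debug_lines)
print(sizes["send"], sizes["send"] // requested)  # prints 262144 2
```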

 


STEP TWO – Troubleshoot to Confirm if TCP Delay Is Causing Performance Issues

IMPORTANT NOTE:
Please make sure to save all output of each command for documentation and comparison. This output will also be helpful for escalation purposes. If further assistance is required, we recommend that you go through these troubleshooting steps and document the results in the escalation form.

 

  1. Make sure that netserver is running on the Windows server, as described above.
  2. Check whether TCP delay could be affecting performance. To do so, execute the same commands as above, adding ‘-D’ as a test-specific option (after the ‘--’).

Example:
netperf -H <ip-of-the-mediaserver> -f M -l 120 -d -- -s 8192 -D

Parameter:
-D sets the TCP_NODELAY option to true on both systems.
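
TCP_NODELAY is a standard socket option that disables Nagle's algorithm, so small writes are sent immediately instead of being held back and coalesced. As an illustration of what '-D' asks netperf to do on its data socket, the Python sketch below sets the same option on an unconnected TCP socket and reads it back. It is not netperf's own code, and the function name make_nodelay_socket is hypothetical.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with TCP_NODELAY set (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # 1 enables TCP_NODELAY, which disables Nagle's algorithm on this socket.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# A non-zero value confirms that the option is active.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```

If the netperf results with '-D' are clearly better than the results without it, delayed sending of small segments is a likely contributor to the slowness.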

 

 


STEP THREE – Change Socket & Access Method on the Windows Machine Connected to the DXi

In this procedure, we will change the socket buffer size in Windows.

 


Change the Socket Buffer Size 

Note: This maintenance is disruptive and will require media server downtime.

Check whether a socket buffer configuration is defined on the Windows server.
 

  1. Under HKLM (HKEY_LOCAL_MACHINE), go to the registry key:
    HKLM\SYSTEM\CurrentControlSet\services\AFD\Parameters

  2. Check whether the following values exist:

    DefaultSendWindow
    Type: DWORD
    Value: 65536

    DefaultReceiveWindow
    Type: DWORD
    Value: 65536

  3. Proceed as follows:
        * If the values don’t exist, create them.
        * If the values exist with the data shown above, change them to the best socket buffer size you found in the troubleshooting above.
     
  4. Close all applications and reboot the server.
     
  5. Execute another restore and take note of the following:
        * Start and Stop Date/Time of the job execution: ____________
        * Data Transfer Rate: ________
     
  6. Execute a backup job and take note of the following:
        * Start and Stop Date/Time of the job execution: ____________
        * Data Transfer Rate: ________

If the restore performance didn’t improve, continue with the next procedure.
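
The two AFD registry values described in this procedure can also be prepared as a registry (.reg) file, which makes the change easy to review before it is applied. The Python sketch below only builds the file text and does not touch the registry. The function name afd_reg_file is hypothetical, and the buffer size passed to it is assumed to be the best value found with netperf.

```python
# Registry key named in this procedure.
AFD_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\AFD\Parameters"


def afd_reg_file(buffer_size: int) -> str:
    """Build .reg text that sets both AFD window values (hypothetical helper)."""
    if not 0 < buffer_size <= 0xFFFFFFFF:
        raise ValueError("buffer size must fit in a DWORD")
    # .reg files store DWORD data as eight hexadecimal digits.
    dword = f"dword:{buffer_size:08x}"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{AFD_KEY}]",
        f'"DefaultSendWindow"={dword}',
        f'"DefaultReceiveWindow"={dword}',
        "",
    ]
    return "\r\n".join(lines)


print(afd_reg_file(131072))
```

For 131072 bytes (128K) the generated data is dword:00020000, and for 65536 bytes (64K) it is dword:00010000. The server still has to be rebooted for the new values to take effect, as described in the procedure.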

Ask the Customer to Confirm How the Mount Point Is Set on the Windows Machine

  1. If the mount point is set as //<host-name>/share, please ask the customer to replace it with //<ip-address>/share.
     
  2. Execute another restore and take note of the following:
        * Start and Stop Date/Time of the job execution: ____________
        * Data Transfer Rate: ________
     
  3. Execute a backup job and take note of the following:
        * Start and Stop Date/Time of the job execution: ____________
        * Data Transfer Rate: ________
     

STEP FOUR – Another Performance Setting on Windows That May Affect Performance

You may also want to check whether the TCP window size (tcpwindowsize) is the component affecting restore performance. For changes to these Windows settings, please advise the customer to get assistance from Microsoft.

What Is TCP Window Size?

The TCP receive window size is the amount of received data (in bytes) that can be buffered during a connection. The sending host can send only that amount of data before it must wait for an acknowledgment and window update from the receiving host.

This parameter may help when large file restores are involved.
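
Because the sender can have at most one window of data in flight per round trip, the window size and the round-trip time (RTT) together put an upper bound on the throughput of a single connection: throughput <= window / RTT. The Python sketch below works through that arithmetic. The function names are hypothetical, the 1 ms RTT is an assumed example value rather than a measurement, and MB is taken as 1024 x 1024 bytes to stay comparable with the netperf '-f M' output.

```python
MB = 1024 * 1024  # bytes per MB, assumed to match the netperf "-f M" unit


def max_throughput_mb_per_s(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one full window per round trip."""
    return window_bytes / rtt_seconds / MB


def window_for_throughput(target_mb_per_s: float, rtt_seconds: float) -> int:
    """Window (bytes) needed to sustain a target rate: bandwidth x delay."""
    return round(target_mb_per_s * MB * rtt_seconds)


# A 64K window with an assumed 1 ms round trip caps out at about 62.5 MB/s.
print(max_throughput_mb_per_s(65536, 0.001))
# Window needed for 100 MB/s at the same assumed round-trip time.
print(window_for_throughput(100, 0.001))
```

The same bound explains why a larger window matters more as the round-trip time between the DXi and the media server grows.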

How to Change tcpwindowsize in Windows 2003

Increase the tcpwindowsize value in the registry. Information about the path for this registry key is available at:

 

http://support.microsoft.com/default.aspx?scid=kb;en-us;224829

 

Execute another large file restore and let the SES team know the results.

How to Change tcpwindowsize in Windows 2008

Unlike Windows 2003, Windows 2008 doesn’t use a registry key to define the TCP window size. It uses its own receive window autotuning, which may sometimes affect network performance.

To verify if autotuning is enabled, open the command prompt and use the following command:

   netsh interface tcp show global
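
If the output of this command is saved to a file, the autotuning state can be checked with a few lines of Python. The sketch below assumes the output contains a line labelled "Receive Window Auto-Tuning Level" followed by a colon and the value. The exact wording may differ between Windows versions and languages, so treat the label as an assumption. The function name autotuning_level is hypothetical.

```python
from typing import Optional


def autotuning_level(netsh_output: str) -> Optional[str]:
    """Return the auto-tuning level found in saved netsh output, or None."""
    for line in netsh_output.splitlines():
        label, sep, value = line.partition(":")
        # Assumed label text; matched case-insensitively.
        if sep and "auto-tuning level" in label.lower():
            return value.strip().lower()
    return None


# Single line in the assumed format, not a full captured output.
sample_line = "Receive Window Auto-Tuning Level    : normal"
print(autotuning_level(sample_line))  # prints normal
```

A result of "disabled" means autotuning is off, and None means the expected line was not found in the text.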
 


Testing Performance with Autotuning Disabled

You may want to test performance with autotuning disabled. To disable autotuning, execute the following command:

netsh interface tcp set global autotuning=disabled

To enable auto-tuning again, execute the command:

netsh interface tcp set global autotuning=normal
 


 

 

Attachments
Title                                    Last Updated           Updated By
netperf-2.6.0-win-win7.zip               10/18/2012 07:20 PM    Erika Eskenberry
netperf-2.6.0-win-vista-winsrv2k8.zip    10/18/2012 07:20 PM    Erika Eskenberry

