Re: glance performance gains via sendfile()

If one wants to experiment with the performance effects of sendfile(), the netperf benchmark <http://www.netperf.org/> has a "TCP_SENDFILE" test which complements the TCP_STREAM test. It can also report CPU utilization and service demand to allow a comparison of efficiency.
netperf -H <destination> -t TCP_SENDFILE -F <file> -c -C -l 30

will run a 30-second TCP_SENDFILE test using <file> as the data source (one is created if no -F option is specified), sending to <destination> (this assumes netserver has been launched on <destination>). The corresponding TCP_STREAM test would be the obvious substitution.
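As a concrete starting point, a back-to-back comparison might look like the following sketch (assuming netperf and netserver are installed on both ends, with "destination" standing in for a real hostname):

# On the destination host, start the netserver daemon (control port 12865 by default)
netserver

# On the source host, run matched 30-second tests; -c/-C request local/remote CPU measurement
netperf -H destination -t TCP_STREAM   -c -C -l 30
netperf -H destination -t TCP_SENDFILE -c -C -l 30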
One area of investigation would be the effect of send size. That can be controlled with a "test-specific" -m option (test-specific options follow a "--" on the command line):

netperf  ...as above...  -- -m 64K

would cause netperf to send 65536 bytes in each "send" call. The manual for the current top-of-trunk version of netperf is at:
http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html

and the top-of-trunk bits can be pulled via subversion pointing at http://www.netperf.org/svn/netperf2/trunk
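
Returning to the -m option, one way to explore several send sizes in one go is a small shell loop (a sketch; "destination" is again a placeholder, and netserver is assumed to be running there):

for sz in 4K 16K 64K 256K; do
    echo "== send size $sz =="
    netperf -H destination -t TCP_SENDFILE -c -C -l 30 -- -m $sz
done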
happy benchmarking,

rick jones

For example, between a pair of Ubuntu 11.04 systems, each with a Mellanox 10GbE NIC and a pair of X5650 processors (so 24 "CPUs"):

~$ ./netperf -p 12866 -H destination -c -C -l 30 -- -P 12867 -m 64K
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   65536    30.00     9271.36    2.52     2.72     0.535   0.576
~$ ./netperf -t TCP_SENDFILE -p 12866 -H destination -c -C -l 30 -- -P 12867 -m 64K
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   65536    30.00     9332.46    0.82     2.71     0.173   0.572

It would be good to repeat each test a couple of times, but in this case at least we see a considerable drop in sending-side CPU utilization and service demand, the latter being a direct measure of efficiency.
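(As a cross-check on the arithmetic, assuming the utilization figure is system-wide across all 24 CPUs: 2.52% of 24 CPUs is 0.6048 CPU-seconds consumed per second, or about 604,800 microseconds of CPU per second; 9271.36 * 10^6 bits/s works out to roughly 1,131,758 KB/s; and 604,800 / 1,131,758 comes to about 0.534 us/KB, in line with the 0.535 us/KB reported.)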
(The socket sizes reported are simply what they were at the onset of the connection, not at the end; for the latter, use the omni output selectors - http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html#Omni-Output-Selection . The test-specific -P option explicitly selects the port numbers for the data connection, to deal with firewalls in my test environment; similarly, the global -p option selects the port on which netserver at the destination is waiting.)
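
For instance, end-of-test socket sizes (along with throughput and service demand) can be requested via the test-specific -o option, which takes a comma-separated list of output selectors; a sketch, with selector names as documented in the manual linked above:

netperf -H destination -t TCP_SENDFILE -c -C -l 30 -- \
    -o THROUGHPUT,LSS_SIZE_END,RSR_SIZE_END,LOCAL_SD,REMOTE_SD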
With a smaller send size, the results may be a bit different:

~$ ./netperf -p 12866 -H destination -c -C -l 30 -- -P 12867
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   16384    30.00     9332.43    2.64     2.74     0.556   0.578

~$ ./netperf -t TCP_SENDFILE -p 12866 -H destination -c -C -l 30 -- -P 12867
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   16384    30.00     9351.32    1.26     2.73     0.264   0.574

Mileage will vary depending on link type, the CPUs present, and so on.

