
openstack team mailing list archive

Re: glance performance gains via sendfile()

 

If one wants to experiment with the performance effects of sendfile(), the netperf benchmark <http://www.netperf.org/> has a "TCP_SENDFILE" test which complements the TCP_STREAM test. It can also report CPU utilization and service demand to allow a comparison of efficiency.

netperf -H <destination> -t TCP_SENDFILE -F <file> -c -C -l 30

will run a 30-second TCP_SENDFILE test using <file> as the data source (one is created if no -F option is specified), sending to <destination> (this assumes netserver has been launched on <destination>). The corresponding TCP_STREAM test would be the obvious substitution.
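For context on what the TCP_SENDFILE test exercises: sendfile() asks the kernel to move file pages directly to the socket, skipping the user-space read()/write() buffer round trip that a plain send loop pays for. A minimal Python sketch of serving a file that way (os.sendfile() wraps the same syscall; the payload and loopback setup here are purely illustrative):

```python
import os
import socket
import tempfile
import threading

# Illustrative payload; a stand-in for an image file Glance might serve.
DATA = b"glance-image-bytes " * 3000

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(DATA)
    path = f.name

# Loopback TCP connection standing in for a real client.
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]

received = bytearray()

def drain():
    conn, _ = listener.accept()
    while chunk := conn.recv(65536):
        received.extend(chunk)
    conn.close()

t = threading.Thread(target=drain)
t.start()

out = socket.create_connection(("127.0.0.1", port))
with open(path, "rb") as src:
    # os.sendfile() hands the copy to the kernel: file pages go straight
    # to the socket, with no intermediate user-space buffer.
    offset, remaining = 0, len(DATA)
    while remaining:
        sent = os.sendfile(out.fileno(), src.fileno(), offset, remaining)
        offset += sent
        remaining -= sent
out.close()  # EOF lets the drain thread return
t.join()
listener.close()
os.unlink(path)
assert bytes(received) == DATA
```

The data still crosses the wire either way; the saving sendfile() buys is on the sending host's CPU, which is exactly what the -c/-C utilization and service-demand columns below measure.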

One area of investigation would be the effect of send size. That can be accomplished with a "test-specific" -m option (following a "--" on the command line):

netperf  ...as above...  -- -m 64K

would cause netperf to send 65536 bytes in each "send" call. The manual for the current top-of-trunk version of netperf is at:

http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html

and the top-of-trunk bits can be pulled via subversion pointing at http://www.netperf.org/svn/netperf2/trunk

happy benchmarking,

rick jones

For example, between a pair of Ubuntu 11.04 systems with Mellanox 10GbE, and a pair of X5650 processors each (so 24 "CPUs"):

~$ ./netperf -p 12866 -H destination -c -C -l 30 -- -P 12867 -m 64K
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  65536    30.00      9271.36   2.52     2.72     0.535   0.576

~$ ./netperf -t TCP_SENDFILE -p 12866 -H destination -c -C -l 30 -- -P 12867 -m 64K
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  65536    30.00      9332.46   0.82     2.71     0.173   0.572

It would be good to repeat each run a couple of times, but in this case at least we see a considerable drop in sending-side CPU utilization and service demand, the latter being a direct measure of efficiency.
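Service demand is simply CPU time consumed per unit of data transferred, so it can be cross-checked from the other columns. A quick sanity check of the first TCP_STREAM row (assuming all 24 "CPUs" count toward the utilization figure, as netperf does):

```python
# Recompute netperf's sending-side service demand (us/KB) from the
# throughput and CPU-utilization columns of the first TCP_STREAM run.
NUM_CPUS = 24               # two X5650s, as in the post
throughput_mbps = 9271.36   # 10^6 bits/s column
cpu_util_pct = 2.52         # sending-side "% S" column

kb_per_sec = throughput_mbps * 1e6 / 8 / 1024           # payload KB per second
cpu_us_per_sec = cpu_util_pct / 100 * NUM_CPUS * 1e6    # CPU-microseconds per second
service_demand = cpu_us_per_sec / kb_per_sec

print(round(service_demand, 3))  # prints 0.534, matching the reported 0.535 within rounding
```

The same arithmetic applied to the TCP_SENDFILE row reproduces its lower service demand, which is why the column is a fairer efficiency comparison than raw throughput when both tests saturate the link.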

(The socket sizes are simply what they were at the onset of the connection, not at the end; for that, use the omni output selectors - http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html#Omni-Output-Selection . The test-specific -P option explicitly selects port numbers for the data connection, to deal with firewalls in my test environment; similarly, the global -p option selects the port number on which netserver at destination is waiting.)

With a smaller send size, the results may be a bit different:

~$ ./netperf -p 12866 -H destination -c -C -l 30 -- -P 12867
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    30.00      9332.43   2.64     2.74     0.556   0.578

~$ ./netperf -t TCP_SENDFILE -p 12866 -H destination -c -C -l 30 -- -P 12867
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 12867 AF_INET to destination () port 12867 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    30.00      9351.32   1.26     2.73     0.264   0.574

Mileage will vary depending on link type, the CPUs present, and so on.

