openstack team mailing list archive

Re: shuffle(nodes) in Swift

To: openstack@xxxxxxxxxxxxxxxxxxx
From: Samuel Merritt <spam@xxxxxxxxxxxxx>
Date: Thu, 05 Jul 2012 14:48:20 -0700
In-reply-to: <4FF5BEEC.6030500@nexenta.com>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:13.0) Gecko/20120614 Thunderbird/13.0.1

On 7/5/12 9:21 AM, Anatoly Legkodymov wrote:

Good day,

File proxy/server.py contains the following construction several times
(simplified):
     nodes = ring.get_nodes()
     shuffle(nodes)
     for node in nodes:
         ...

At first sight I found it useful for load balancing, but deeper
investigation shows that this doesn't happen. Moreover, iterating
without shuffle (always reading the 1st replica first) would improve
the performance of the cloud.

The problem can be split into 2 scenarios: multiple clients read the
same object simultaneously, or multiple clients read the same object
occasionally (periodically).

During a simultaneous read without shuffle(), the object will be read
from disk only once; later reads will be served from the memory cache.
With shuffle(), 3 disk reads have to be done, and the object will be
cached three times, once on each server, consuming 3 times more cache
memory. The same network bandwidth is used with shuffle() and without.

That is not necessarily true. By default, only objects of size <= 5 MiB and readable without authentication (i.e. in a public container) are allowed to remain in the kernel's buffer-cache. Private files and larger files are evicted as they are read. See swift.obj.server.DiskFile.__iter__ for the details.
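To make the rule above concrete, here is a rough sketch of the decision as I described it. This is not the actual Swift code; the function name, its parameters, and the constant name are made up for illustration, and only the 5 MiB default and the public/private distinction come from the paragraph above.

```python
# Illustrative sketch only: names here are hypothetical, not Swift's API.
MIB = 1024 * 1024
KEEP_CACHE_SIZE = 5 * MIB  # default threshold described above


def stays_in_buffer_cache(size_bytes, is_public, keep_cache_size=KEEP_CACHE_SIZE):
    """Return True if an object's pages may remain in the kernel buffer-cache.

    Only small objects that are readable without authentication qualify;
    everything else is evicted as it is read.
    """
    return is_public and size_bytes <= keep_cache_size


if __name__ == "__main__":
    for size, public in [(1 * MIB, True), (1 * MIB, False), (250 * MIB, True)]:
        print(size // MIB, "MiB, public" if public else "MiB, private",
              "->", stays_in_buffer_cache(size, public))
```

So a 1 MiB public object may stay cached, while a private object of the same size, or a 250 MiB object of any kind, does not.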

With that in mind, consider a 250 MiB object. Since the object in question does not remain in buffer-cache after it is read from disk, the object server is limited by available disk IO. Thus, when there are multiple simultaneous GET requests for a single object, reading from 3 disks* will be 3x faster than reading from one disk, so the shuffle() increases throughput.
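A small simulation illustrates the point. It is only a sketch under simplifying assumptions: replicas are numbered 0..2 instead of being real ring nodes, each request is served by whichever node comes first in the list, and the function name and seed are invented for the example.

```python
import random
from collections import Counter


def first_replica_counts(num_requests, replicas=3, use_shuffle=True, seed=0):
    """Count how often each replica is tried first across many GET requests.

    Replicas are plain integers here; in the proxy they would be the
    nodes returned by the ring.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    counts = Counter()
    for _ in range(num_requests):
        nodes = list(range(replicas))
        if use_shuffle:
            rng.shuffle(nodes)
        counts[nodes[0]] += 1  # the first node in the list serves the read
    return counts


if __name__ == "__main__":
    print("with shuffle:   ", dict(first_replica_counts(30000, use_shuffle=True)))
    print("without shuffle:", dict(first_replica_counts(30000, use_shuffle=False)))
```

With the shuffle, the first-choice reads are spread roughly evenly over the three replicas, so three disks share the load. Without it, every request lands on replica 0 and a single disk's IO becomes the limit.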

For small (<= 5 MiB), public objects, it is true that 3 copies will live in the caches of 3 different machines. However, given the throughput increase in the many-simultaneous-readers case, it's a worthwhile tradeoff.


* or whatever the replica count is

References

shuffle(nodes) in Swift
From: Anatoly Legkodymov, 2012-07-05