
openstack team mailing list archive

Re: [Swift] Cache pressure tuning

 

Jonathan,

Yes, we have 10,000 containers, and we're using COSBench to do the tests.


Sincerely, Yuan


On Wed, Jun 19, 2013 at 9:24 AM, Jonathan Lu <jojokururu@xxxxxxxxx> wrote:

>  Hi, Zhou,
>     BTW, in test case 2, is the number of containers 10,000 or just 10?
>
>
> Jonathan Lu
>
> On 2013/6/18 19:18, ZHOU Yuan wrote:
>
> Jonathan, we happen to use storage nodes similar to yours, so I can
> share some performance testing data here:
>
>  1. 100 containers with 10,000 objects (baseline)
>  The performance is quite good and can hit the HW bottleneck.
>
>  2. 10k containers with 100M objects
>  The performance is not so good; it dropped 80% compared with the baseline.
>
>  3. 1 container with 1,000M objects
>  The performance is not so good; it dropped 95% compared with the baseline.
>
> The suspected reasons we found are:
> 1) XFS's overhead with a huge number of objects. Deleting some files
> wouldn't help, since inode allocation is already quite sparse on the
> disks, and later inode lookups should cost more disk seek time, I guess.
> But this can be greatly improved by setting vfs_cache_pressure to a
> lower value, and it should be safe even if we set it to 1, since Swift
> doesn't use the cache at all. If we could cache all the inodes, the
> performance would become good again. We've done some tests with
> precached inodes (simply run 'ls -R /srv/nodes') and verified that the
> performance is quite good.
>
>  2) SQLite DB performance becomes a bottleneck when there are millions
> of records in a single DB. There is a blueprint (BP) to auto-split
> large databases, but it is not implemented yet.
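For reference, the container databases Yuan refers to can be inspected directly; a hypothetical sketch, assuming Swift's usual `/srv/node` on-disk layout and the `object` table in the container schema:

```shell
# Count rows in one Swift container DB (path layout and table name
# assumed from Swift's standard schema; requires the sqlite3 CLI).
count_container_rows() {
    sqlite3 "$1" 'SELECT count(*) FROM object;'
}

# Example: find the largest container DBs on a storage node.
# for db in /srv/node/*/containers/*/*/*/*.db; do
#     printf '%s %s\n' "$(count_container_rows "$db")" "$db"
# done | sort -rn | head
```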
>
>
> Hope this can help.
>
> --
> Sincerely,  Yuan
>
> On Tue, Jun 18, 2013 at 1:56 PM, Jonathan Lu <jojokururu@xxxxxxxxx> wrote:
>
>>  Hi, Huang
>>     Thanks a lot. I will try this test.
>>
>>     One more question:
>>     In the 3 following situations, will the baseline performance be
>> quite different?
>>         1. only 1 container with 10M objects;
>>         2. 100,000 objects per container at 100 containers;
>>         3. 1,000 objects per container at 10,000 containers.
>>
>> Cheers
>> Jonathan Lu
>>
>>
>> On 2013/6/18 12:54, Huang Zhiteng wrote:
>>
>>
>>
>> On Tue, Jun 18, 2013 at 12:35 PM, Jonathan Lu <jojokururu@xxxxxxxxx> wrote:
>>
>>>  Hi, Huang
>>>     Thanks for your explanation. Does it mean that a storage cluster of
>>> given processing ability will get slower and slower with more and more
>>> objects? Is there any test of the rate of decline, or is there any
>>> lower limit?
>>>
>>>     For example, my environment is:
>>>
>>>
>>> 	Swift version : grizzly
>>> 	Tried on Ubuntu 12.04
>>> 	3 Storage-nodes : each for 16GB-ram / CPU 4*2 / 3TB*12
>>>
>>>     The expected throughput is more than 100/s with uploaded objects
>>> of 50KB. At the beginning it works quite well, and then it drops. If
>>> this degradation is unstoppable, I'm afraid the performance will
>>> finally not be able to meet our needs no matter how I tune the other
>>> configs.
>>>
>>>  It won't be hard to do a baseline performance (without inode cache)
>> assessment of your system: populate your system with a certain amount
>> of objects of the desired size (say 50k, 10 million objects <1,000
>> objects per container at 10,000 containers>), and *then drop VFS caches
>> explicitly before testing*.  Measure performance with your desired IO
>> pattern and in the meantime drop the VFS cache every once in a while
>> (say every 60s). That's roughly the performance you can get when your
>> storage system reaches a 'steady' state (i.e. the object # has outgrown
>> memory size).  This will give you an idea of pretty much the worst case.
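The procedure above can be sketched roughly as follows (run as root; the benchmark command itself is a placeholder, and the 60s interval is the one suggested in the thread):

```shell
# Periodically drop VFS caches while the benchmark runs, so results
# reflect the 'steady' state where objects outnumber cacheable inodes.
(
    while true; do
        sync                               # flush dirty pages first
        echo 3 > /proc/sys/vm/drop_caches  # drop page cache + dentries/inodes
        sleep 60
    done
) &
DROPPER=$!

# ... run COSBench (or your own IO pattern) here ...

kill "$DROPPER"
```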
>>
>>
>>>  Jonathan Lu
>>>
>>>
>>> On 2013/6/18 11:05, Huang Zhiteng wrote:
>>>
>>>
>>> On Tue, Jun 18, 2013 at 10:42 AM, Jonathan Lu <jojokururu@xxxxxxxxx> wrote:
>>>
>>>> On 2013/6/17 18:59, Robert van Leeuwen wrote:
>>>>
>>>>>  I'm facing an issue of performance degradation, and I once
>>>>>> noticed that changing the value in /proc/sys/vm/vfs_cache_pressure
>>>>>> would help.
>>>>>> Can anyone explain to me whether and why it is useful?
>>>>>>
>>>>> Hi,
>>>>>
>>>>> When this is set to a lower value the kernel will try to keep the
>>>>> inode/dentry cache longer in memory.
>>>>> Since the swift replicator is scanning the filesystem continuously it
>>>>> will eat up a lot of iops if those are not in memory.
>>>>>
>>>>> To see if a lot of cache misses are happening, for xfs, you can look
>>>>> at xs_dir_lookup and xs_ig_missed.
>>>>> ( look at http://xfs.org/index.php/Runtime_Stats )
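To check the counters Robert mentions, a small helper can pull them out of `/proc/fs/xfs/stat`; a sketch, assuming the field layout documented on the xfs.org Runtime Stats page (a rising `xs_ig_missed` relative to `xs_ig_found` indicates inode-cache misses):

```shell
# Print directory-lookup and inode-cache hit/miss counters from an XFS
# stats file ("dir" and "ig" line layouts per xfs.org Runtime Stats).
parse_xfs_stats() {
    awk '
        $1 == "dir" { print "dir_lookups=" $2 }
        $1 == "ig"  { print "ig_found=" $3, "ig_missed=" $5 }
    ' "${1:-/proc/fs/xfs/stat}"
}

# Usage on a live XFS system:
if [ -r /proc/fs/xfs/stat ]; then
    parse_xfs_stats /proc/fs/xfs/stat
fi
```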
>>>>>
>>>>> We greatly benefited from setting this to a low value, but we have
>>>>> quite a lot of files on a node (30 million).
>>>>> Note that setting this to zero will result in the OOM killer killing
>>>>> the machine sooner or later.
>>>>> (especially if files are moved around due to a cluster change ;)
>>>>>
>>>>> Cheers,
>>>>> Robert van Leeuwen
>>>>>
>>>>
>>>>  Hi,
>>>>     We set this to a low value (20) and the performance is better
>>>> than before. It seems quite useful.
>>>>
>>>>     According to your description, this issue is related to the
>>>> object quantity in the storage. We deleted all the objects in the
>>>> storage, but it didn't help at all. The only way to recover is to
>>>> format and re-mount the storage node. We tried installing swift in
>>>> different environments, but this degradation problem seems to be
>>>> inevitable.
>>>>
>>> It's the inode cache for each file (object) that helps (reduces extra
>>> disk IOs).  As long as your memory is big enough to hold the inode
>>> information of those frequently accessed objects, you are good.  And
>>> there's no need (no point) to limit the # of objects per storage node,
>>> IMO.  You may manually load the inode information of objects into the
>>> VFS cache if you like (by simply running 'ls' on the files), to
>>> _restore_ performance.  But still, memory size and object access
>>> pattern are the key to this kind of performance tuning; if memory is
>>> too small, the inode cache will be invalidated sooner or later.
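One way to see whether RAM can actually hold the hot inodes, as discussed above, is to look at the inode and dentry slabs; a rough sketch, assuming the standard `/proc/slabinfo` layout (active objects in column 2, object size in column 4; reading it usually needs root, and slab names are kernel-version dependent):

```shell
# Report approximate memory held by the XFS inode and dentry slabs.
slab_usage() {
    awk '$1 == "xfs_inode" || $1 == "dentry" {
        printf "%s %.1fMB\n", $1, $2 * $4 / 1048576
    }' "${1:-/proc/slabinfo}"
}

# Usage (needs root on most systems):
# slab_usage
```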
>>>
>>>
>>>
>>>> Cheers,
>>>> Jonathan Lu
>>>>
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~openstack
>>>> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~openstack
>>>> More help   : https://help.launchpad.net/ListHelp
>>>>
>>>
>>>
>>>
>>> --
>>> Regards
>>> Huang Zhiteng
>>>
>>>
>>>
>>
>>
>> --
>> Regards
>> Huang Zhiteng
>>
>>
>>
>>
>>
