
openstack team mailing list archive

Re: [SWIFT] Change the partition power to recreate the RING

 

Chuck / John.
We are handling 50,000 requests per minute (where 10,000+ are PUTs of small
objects, from 10KB to 150KB).

We are using swift 1.7.4 with keystone token caching, so there is no latency
there.
We have 12 proxies and 24 datanodes divided into 4 zones (each
datanode has 48GB of RAM, 2 hexa-core CPUs and 4 devices of 3TB each).
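(For scale, a quick back-of-the-envelope on those numbers; the replica count
below is an assumption, not something stated above:)

    # rough per-proxy and per-disk rates implied by the numbers above
    total_rpm, target_put_rpm = 50000, 24000
    proxies, datanodes, disks_per_node, replicas = 12, 24, 4, 3  # replicas assumed

    print('total req/s:       %.0f' % (total_rpm / 60.0))                  # ~833
    print('target PUT/s:      %.0f' % (target_put_rpm / 60.0))             # ~400
    print('PUT/s per proxy:   %.1f' % (target_put_rpm / 60.0 / proxies))   # ~33
    print('writes/s per disk: %.1f' % (target_put_rpm / 60.0 * replicas /
                                       (datanodes * disks_per_node)))      # ~12.5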

The workers that are putting objects into swift are seeing awful
performance, and so are we, with peaks of 2 to 15 seconds per PUT operation
coming from the datanodes.
We tuned db_preallocation, disable_fallocate, workers and concurrency, but
we can't reach the request rate that we need (24,000 PUTs per minute of
small objects), and we can't seem to find where the problem is, other than
at the datanodes.
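For context on where those knobs live, a minimal sketch of the relevant
config sections, with illustrative values rather than our actual settings
(the various *-replicator / *-updater sections also take a concurrency
setting):

    # object-server.conf (illustrative values only)
    [DEFAULT]
    workers = 8
    disable_fallocate = true

    # container-server.conf / account-server.conf (illustrative values only)
    [DEFAULT]
    db_preallocation = off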

Maybe worth pasting our config over here?
Thanks in advance.

alejandro
On 12 Jan 2013 02:01, "Chuck Thier" <cthier@xxxxxxxxx> wrote:

> Looking at this from a different perspective.  Having 2500 partitions
> per drive shouldn't be an absolutely horrible thing either.  Do you
> know how many objects you have per partition?  What types of problems
> are you seeing?
>
> --
> Chuck
>
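A rough way to answer Chuck's objects-per-partition question is to count
.data files on one device; a minimal sketch, assuming the default /srv/node
layout (the device path is a placeholder):

    # Count .data files per partition directory on one object device.
    import os
    from collections import Counter

    DEVICE_PATH = '/srv/node/sdb1/objects'  # placeholder device

    counts = Counter()
    for partition in os.listdir(DEVICE_PATH):
        part_dir = os.path.join(DEVICE_PATH, partition)
        if not os.path.isdir(part_dir):
            continue
        for dirpath, dirnames, filenames in os.walk(part_dir):
            counts[partition] += sum(1 for f in filenames if f.endswith('.data'))

    total = sum(counts.values())
    avg = float(total) / len(counts) if counts else 0.0
    print('partitions: %d  objects: %d  avg per partition: %.1f'
          % (len(counts), total, avg))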
> On Fri, Jan 11, 2013 at 3:28 PM, John Dickinson <me@xxxxxx> wrote:
> > In effect, this would be a complete replacement of your rings, and that
> is essentially a whole new cluster. All of the existing data would need to
> be rehashed into the new ring before it is available.
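The reason everything moves is visible in how a partition is derived from the
object name; a simplified sketch (the real code also mixes in the
swift_hash_path_suffix from swift.conf before hashing):

    # Simplified sketch of how swift maps an object to a partition.
    import struct
    from hashlib import md5

    def partition_for(account, container, obj, part_power):
        path = '/%s/%s/%s' % (account, container, obj)
        digest = md5(path.encode('utf-8')).digest()
        # take the top 32 bits of the md5, keep only the highest part_power bits
        return struct.unpack_from('>I', digest)[0] >> (32 - part_power)

    print(partition_for('AUTH_test', 'photos', 'cat.jpg', 18))  # old partition power
    print(partition_for('AUTH_test', 'photos', 'cat.jpg', 14))  # new, lower power

With a different part_power the shift changes, so every object lands in a
different partition directory and has to be rehashed or moved.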
> >
> > There is no process that rehashes the data to ensure that it is still in
> the correct partition. Replication only ensures that the partitions are on
> the right drives.
> >
> > To change the number of partitions, you will need to GET all of the data
> from the old ring and PUT it to the new ring. A more complicated (but
> perhaps more efficient) solution may include something like walking each
> drive and rehashing+moving the data to the right partition and then letting
> replication settle it down.
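A minimal sketch of the walk-and-move idea, assuming the default on-disk
layout objects/&lt;partition&gt;/&lt;suffix&gt;/&lt;hash&gt;/ and a hypothetical new partition
power; it ignores tombstones, hashes.pkl invalidation, quarantines and races:

    # Move hash directories on one device into their new-partition directories.
    import os
    import shutil
    import struct

    DEVICE = '/srv/node/sdb1/objects'  # placeholder device path
    NEW_PART_POWER = 14                # hypothetical new, lower partition power

    for old_part in os.listdir(DEVICE):
        old_part_dir = os.path.join(DEVICE, old_part)
        if not os.path.isdir(old_part_dir):
            continue
        for suffix in os.listdir(old_part_dir):
            suffix_dir = os.path.join(old_part_dir, suffix)
            if not os.path.isdir(suffix_dir):
                continue
            for hsh in os.listdir(suffix_dir):
                # the directory name is the md5 used for placement, so the new
                # partition is just its top NEW_PART_POWER bits
                new_part = struct.unpack_from('>I', bytes.fromhex(hsh))[0] \
                    >> (32 - NEW_PART_POWER)
                dest = os.path.join(DEVICE, str(new_part), suffix, hsh)
                src = os.path.join(suffix_dir, hsh)
                if src != dest:
                    os.makedirs(os.path.dirname(dest), exist_ok=True)
                    shutil.move(src, dest)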
> >
> > Either way, 100% of your existing data will need to at least be rehashed
> (and probably moved). Your CPU (hashing), disks (read+write), RAM
> (directory walking), and network (replication) may all be limiting factors
> in how long it will take to do this. Your per-disk free space may also
> determine what method you choose.
> >
> > I would not expect any data loss while doing this, but you will probably
> have availability issues, depending on the data access patterns.
> >
> > I'd like to eventually see something in swift that allows for changing
> the partition power in existing rings, but that will be
> hard/tricky/non-trivial.
> >
> > Good luck.
> >
> > --John
> >
> >
> > On Jan 11, 2013, at 1:17 PM, Alejandro Comisario <
> alejandro.comisario@xxxxxxxxxxxxxxxx> wrote:
> >
> >> Hi guys.
> >> We created a swift cluster several months ago; the thing is that
> right now we can't add hardware, and we configured lots of partitions thinking
> about the final picture of the cluster.
> >>
> >> Today each datanode has 2500+ partitions per device, and even after
> tuning the background processes (replicator, auditor & updater) we really
> want to try to lower the partition power.
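(For reference, the background-process knobs usually tuned here are the ones
below; a minimal object-server.conf sketch with illustrative values only:)

    [object-replicator]
    concurrency = 1
    run_pause = 30

    [object-updater]
    interval = 300
    concurrency = 1

    [object-auditor]
    files_per_second = 20
    bytes_per_second = 10000000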
> >>
> >> Since it's not possible to do that without recreating the ring, we can
> afford the luxury of recreating it with a much lower partition power, and
> rebalancing / deploying the new ring.
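(A minimal sketch of the rebuild itself, with a hypothetical partition power
of 14 and a single example device; the real run would add every device in
every zone before rebalancing and then push the new object.ring.gz to all
nodes:)

    swift-ring-builder object.builder create 14 3 1
    swift-ring-builder object.builder add z1-10.0.0.1:6000/sdb1 100
    # ...repeat "add" for every device...
    swift-ring-builder object.builder rebalance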
> >>
> >> The question is, having a working cluster with *existing data*, is it
> possible to do this and wait for the data to move around *without data
> loss*?
> >> If so, would it be reasonable to expect an improvement in the overall
> cluster performance?
> >>
> >> We have no problem having a non-working cluster (while moving the
> data), even for an entire weekend.
> >>
> >> Cheers.
> >>
> >>
> >
> >
> >
>
