

Re: Expanding Storage - Rebalance Extreeemely Slow (or Stalled?)

 

Folks,

This is the third day, and I see little to no change (just a few KBs) on the
new disks.

Could this be normal? Is there a long computation process that needs to finish
before the newly added disks actually start filling up?

Or should I just start from scratch with the "create" command this time?
The last time, I didn't run the "swift-ring-builder create 20 3 1 ..."
command first; I just started with "swift-ring-builder add ..." against the
existing ring/builder files, thinking that otherwise I could end up
reformatting the whole stack. I'm not sure whether that's actually the case.
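
For what it's worth, here is the full sequence I understand to be the usual
way to add disks (mostly pieced together from the docs, so correct me if any
step is off; the weight of 100 and the /etc/swift paths are just what we
already use):

  swift-ring-builder object.builder add z1-192.168.1.3:6002/c0d4p1 100
  swift-ring-builder object.builder rebalance
  # same for account.builder and container.builder, then copy the resulting
  # *.ring.gz files out to every proxy and storage node, for example:
  scp /etc/swift/*.ring.gz root@192.168.1.3:/etc/swift/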

Please advise. Thanks,

--
Emre

On Mon, Oct 22, 2012 at 12:09 PM, Emre Sokullu <emre@xxxxxxxxxxxxxx> wrote:

> Hi Samuel,
>
> Thanks for quick reply.
>
> They're all 100. And here's the output of swift-ring-builder:
>
> root@proxy1:/etc/swift# swift-ring-builder account.builder
> account.builder, build version 13
> 1048576 partitions, 3 replicas, 3 zones, 12 devices, 0.00 balance
> The minimum number of hours before a partition can be reassigned is 1
> Devices:    id  zone  ip address    port  name    weight  partitions  balance  meta
>              0     1  192.168.1.3   6002  c0d1p1  100.00      262144     0.00
>              1     1  192.168.1.3   6002  c0d2p1  100.00      262144     0.00
>              2     1  192.168.1.3   6002  c0d3p1  100.00      262144     0.00
>              3     2  192.168.1.4   6002  c0d1p1  100.00      262144     0.00
>              4     2  192.168.1.4   6002  c0d2p1  100.00      262144     0.00
>              5     2  192.168.1.4   6002  c0d3p1  100.00      262144     0.00
>              6     3  192.168.1.5   6002  c0d1p1  100.00      262144     0.00
>              7     3  192.168.1.5   6002  c0d2p1  100.00      262144     0.00
>              8     3  192.168.1.5   6002  c0d3p1  100.00      262144     0.00
>              9     1  192.168.1.3   6002  c0d4p1  100.00      262144     0.00
>             10     2  192.168.1.4   6002  c0d4p1  100.00      262144     0.00
>             11     3  192.168.1.5   6002  c0d4p1  100.00      262144     0.00
>
> On Mon, Oct 22, 2012 at 12:03 PM, Samuel Merritt <sam@xxxxxxxxxxxxxx> wrote:
> > On 10/22/12 9:38 AM, Emre Sokullu wrote:
> >>
> >> Hi folks,
> >>
> >> At GROU.PS, we've been OpenStack Swift users for more than 1.5 years
> >> now. Currently we hold about 18TB of data on 3 storage nodes. Since
> >> we hit 84% utilization, we recently decided to expand the storage
> >> with more disks.
> >>
> >> In order to do that, after creating a new c0d4p1 partition in each of
> >> the storage nodes, we ran the following commands on our proxy server:
> >>
> >> swift-ring-builder account.builder add z1-192.168.1.3:6002/c0d4p1 100
> >> swift-ring-builder container.builder add z1-192.168.1.3:6002/c0d4p1 100
> >> swift-ring-builder object.builder add z1-192.168.1.3:6002/c0d4p1 100
> >> swift-ring-builder account.builder add z2-192.168.1.4:6002/c0d4p1 100
> >> swift-ring-builder container.builder add z2-192.168.1.4:6002/c0d4p1 100
> >> swift-ring-builder object.builder add z2-192.168.1.4:6002/c0d4p1 100
> >> swift-ring-builder account.builder add z3-192.168.1.5:6002/c0d4p1 100
> >> swift-ring-builder container.builder add z3-192.168.1.5:6002/c0d4p1 100
> >> swift-ring-builder object.builder add z3-192.168.1.5:6002/c0d4p1 100
> >>
> >> [snip]
> >
> >>
> >> So right now the problem is: the disk growth on each of the storage
> >> nodes seems to have stalled,
> >
> > So you've added 3 new devices to each ring and assigned a weight of 100
> > to each one. What are the weights of the other devices in the ring? If
> > they're much larger than 100, then that will cause the new devices to
> > end up with a small fraction of the data you want on them.
> >
> > Running "swift-ring-builder <thing>.builder" will show you information,
> > including weights, of all the devices in the ring.
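> >
> > If the existing devices do turn out to have much larger weights, one way
> > to even things out (roughly sketched; pick a weight that matches your
> > other devices) is to bump the new devices' weights and rebalance, e.g.:
> >
> >   swift-ring-builder object.builder set_weight z1-192.168.1.3:6002/c0d4p1 <weight>
> >   swift-ring-builder object.builder rebalance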
> >
> >
> >
> >> * Bonus question: why do we copy ring.gz files to the storage nodes,
> >> and how critical are they? It's not clear to me how Swift can afford to
> >> wait (even if it's just a few seconds) for the .ring.gz files to reach
> >> the storage nodes after rebalancing, if those files are so critical.
> >
> >
> > The ring.gz files contain the mapping from Swift partitions to disks. As
> > you know, the proxy server uses it to determine which backends have the
> > data for a given request. The replicators also use the ring to determine
> > where data belongs so that they can ensure the right number of replicas,
> > etc.
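> >
> > (As a concrete illustration of that mapping, and not something you need
> > to run: given a ring file, you can ask which devices a given object
> > lands on with something like
> >
> >   swift-get-nodes /etc/swift/object.ring.gz AUTH_someaccount somecontainer someobject
> >
> > where the account/container/object names here are made up. Every daemon
> > that needs this mapping reads it from the .ring.gz files.)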
> >
> > When two storage nodes have different versions of a ring.gz file, you can
> > get replicator fights. They look like this:
> >
> > - node1's (old) ring says that the partition for a replica of
> >   /cof/fee/cup belongs on node2's /dev/sdf.
> > - node2's (new) ring says that the same partition belongs on node1's
> >   /dev/sdd.
> >
> > When the replicator on node1 runs, it will see that it has the partition
> > for /cof/fee/cup on its disk. It will then consult the ring, push that
> > partition's contents to node2, and then delete its local copy (since
> > node1's ring says that this data does not belong on node1).
> >
> > When the replicator on node2 runs, it will do the converse: push to
> > node1, then delete its local copy.
> >
> > If you leave the rings out of sync for a long time, then you'll end up
> > consuming disk and network IO ping-ponging a set of data around. If
> > they're out of sync for a few seconds, then it's not a big deal.
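> >
> > (A quick sanity check is to compare checksums of the ring files on the
> > proxy and on each storage node and make sure they all match, e.g. run
> >
> >   md5sum /etc/swift/*.ring.gz
> >
> > on every node.)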
> >
> > _______________________________________________
> > Mailing list: https://launchpad.net/~openstack
> > Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~openstack
> > More help   : https://help.launchpad.net/ListHelp
>
