
openstack team mailing list archive

Re: Swift Consistency Guarantees?

 

I'm finding this thread a bit confusing. You're comparing offered SERVICES
to software. Some of the answers are dictated by the software itself, but
others depend heavily on how you deploy Swift and on the deployment
decisions you (or your service provider) make.

As an extreme example: if you deploy a single container server in a highly
available fashion (hardware style), then you probably could get consistent
container listings in the various update-followed-by-read scenarios.
Hosting huge Swift installations with such a setup is not realistic, but
that doesn't mean you can't do it.

Similarly, Swift offers quite a lot of flexibility in setting the size of the
eventual consistency window (replication frequency, rates and such). So, while
there are theoretical answers about missing replicas, the likelihood of those
cases occurring depends on your deployment and operational practices
(e.g. how many replicas are made, and how quickly failed nodes/drives are
fixed and their content replicated to their replacements).
In the Amazon case, much of this is captured in the durability figures:
11 9's for standard storage, or 4 9's for the reduced redundancy class.
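
To make the dependence on replica count and repair speed concrete, here is a
rough back-of-the-envelope sketch. It assumes drives fail independently at a
constant rate, and the 4% annual failure rate and the repair times are
made-up illustration values, not measurements from any Swift deployment.

```python
import math

def p_fail_within(hours: float, annual_failure_rate: float) -> float:
    """Chance that one drive fails within `hours`, for a constant failure rate."""
    # -ln(1 - AFR) turns an annual failure probability into a yearly hazard rate.
    rate_per_hour = -math.log(1.0 - annual_failure_rate) / (365 * 24)
    return 1.0 - math.exp(-rate_per_hour * hours)

def p_lose_remaining(replicas: int, repair_hours: float,
                     annual_failure_rate: float) -> float:
    """Chance that, once one replica is lost, every other replica also
    fails before the repair finishes (independent failures assumed)."""
    return p_fail_within(repair_hours, annual_failure_rate) ** (replicas - 1)

# Assumed values for illustration only.
for replicas in (2, 3):
    for repair_hours in (24, 168):
        p = p_lose_remaining(replicas, repair_hours, annual_failure_rate=0.04)
        print(f"replicas={replicas} repair={repair_hours}h -> {p:.3e}")
```

The point is only the trend: more replicas and faster repair both shrink the
number, which is why two deployments of the same software can behave very
differently.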

If your approach is from an API perspective, then issues around the number
of replicas (which is a deployment parameter) are probably not relevant, if
you trust your provider.
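
From the API side, the one per-request knob mentioned in the quoted thread
below is the X-Newest header on GET and HEAD. A minimal sketch of building
such a request with Python's standard library follows. The storage URL,
container, object name and token are hypothetical placeholders, and the
request is only constructed, never sent.

```python
import urllib.request

def build_newest_request(storage_url, container, obj, token, method="GET"):
    """Build (but do not send) an object request that asks for the newest copy."""
    url = f"{storage_url.rstrip('/')}/{container}/{obj}"
    headers = {
        "X-Auth-Token": token,  # hypothetical token value supplied by the caller
        "X-Newest": "true",     # header discussed in the thread
    }
    return urllib.request.Request(url, headers=headers, method=method)

# All values below are placeholders, not a real endpoint or credential.
req = build_newest_request(
    "https://swift.example.com/v1/AUTH_demo",
    "backups",
    "metadata",
    "hypothetical-token",
)
print(req.get_method(), req.full_url)
print(req.header_items())
```

Passing the result to urllib.request.urlopen would send it; what comes back
when some replicas are unreachable is the deployment-dependent part.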

If your approach is from a Swift developer / deployer perspective, then
never mind; keep asking, because it's much easier to read email than Python
;)
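
Since the quoted thread keeps coming back to what "newest available" means
when some nodes are down, here is a toy model of that one rule. It is not
Swift's proxy code; the Replica type, node names and timestamps are invented
for illustration.

```python
from typing import NamedTuple, Optional, Sequence

class Replica(NamedTuple):
    node: str          # hypothetical node name
    timestamp: float   # write time of the copy this node holds
    reachable: bool    # whether the node answers right now

def newest_available(replicas: Sequence[Replica]) -> Optional[Replica]:
    """Return the reachable replica with the highest timestamp, or None."""
    candidates = [r for r in replicas if r.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.timestamp)

# Scenario asked about in the thread: two nodes hold the update but are
# down, one node holds the old copy and is up.
replicas = [
    Replica("node-a", timestamp=2.0, reachable=False),
    Replica("node-b", timestamp=2.0, reachable=False),
    Replica("node-c", timestamp=1.0, reachable=True),
]
result = newest_available(replicas)
print(result)
```

In this toy model the answer is the old copy on node-c. Whether a real Swift
proxy behaves this way or returns an error is the open question in the
thread, and the model does not settle it.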

On Fri, Jan 20, 2012 at 3:06 PM, Chmouel Boudjnah <chmouel@xxxxxxxxxxxxx> wrote:

> As Stephen mentioned, if there is only one replica left Swift would not
> serve it.
>
> Chmouel.
>
>
> On Fri, Jan 20, 2012 at 1:58 PM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
>
>> Hi,
>>
>> Sorry for being so persistent, but I'm still not sure what happens if
>> the 2 servers that carry the new replica are down, but the 1 server that
>> has the old replica is up. Will GET fail or return the old replica?
>>
>> Best,
>> Niko
>>
>> On 01/20/2012 02:52 PM, Stephen Broeker wrote:
>> > By default there are 3 replicas.
>> > A PUT Object will return after 2 replicas are done.
>> > So if all nodes are up then there are at least 2 replicas.
>> > If all replica nodes are down, then the GET Object will fail.
>> >
>> > On Fri, Jan 20, 2012 at 11:21 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
>> >
>> >     Hi,
>> >
>> >     So if an object update has not yet been replicated on all nodes, and
>> >     all nodes that have been updated are offline, what will happen? Will
>> >     swift recognize this and give me an error, or will it silently return
>> >     the older version?
>> >
>> >     Thanks,
>> >     Nikolaus
>> >
>> >
>> >     On 01/20/2012 02:14 PM, Stephen Broeker wrote:
>> >     > If a node is down, then it is ignored.
>> >     > That is the whole point about 3 replicas.
>> >     >
>> >     > On Fri, Jan 20, 2012 at 10:43 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
>> >     >
>> >     >     Hi,
>> >     >
>> >     >     What happens if one of the nodes is down? Especially if that node
>> >     >     holds the newest copy?
>> >     >
>> >     >     Thanks,
>> >     >     Nikolaus
>> >     >
>> >     >     On 01/20/2012 12:33 PM, Stephen Broeker wrote:
>> >     >     > The X-Newest header can be used by a GET Operation to ensure that
>> >     >     > all of the Storage Nodes (3 by default) are queried for the latest
>> >     >     > copy of the Object.
>> >     >     > The COPY Object operation already has this functionality.
>> >     >     >
>> >     >     > On Fri, Jan 20, 2012 at 9:12 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
>> >     >     >
>> >     >     >     Hi,
>> >     >     >
>> >     >     >     No one able to further clarify this?
>> >     >     >
>> >     >     >     Does swift offer read-after-create consistency like
>> >     >     >     non-us-standard S3? What are the precise syntax and
>> >     >     >     semantics of the X-Newest header?
>> >     >     >
>> >     >     >     Best,
>> >     >     >     Nikolaus
>> >     >     >
>> >     >     >
>> >     >     >     On 01/18/2012 10:15 AM, Nikolaus Rath wrote:
>> >     >     >     > Michael Barton <mike-launchpad@xxxxxxxxxxxxxxxx> writes:
>> >     >     >     >> On Tue, Jan 17, 2012 at 4:55 PM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
>> >     >     >     >>> Amazon S3 and Google Storage make very explicit (non-)consistency
>> >     >     >     >>> guarantees for stored objects. I'm looking for a similar documentation
>> >     >     >     >>> about OpenStack's Swift, but haven't had much success.
>> >     >     >     >>
>> >     >     >     >> I don't think there's any documentation on this, but it would probably
>> >     >     >     >> be good to write up.  Consistency in Swift is very similar to S3.
>> >     >     >     >> That is, there aren't many non-eventual consistency guarantees.
>> >     >     >     >>
>> >     >     >     >> Listing updates can happen asynchronously (especially under load), and
>> >     >     >     >> older versions of files can show up in requests (deletes are just a
>> >     >     >     >> new "deleted" version of the file).
>> >     >     >     >
>> >     >     >     > Ah, ok. Thanks a lot for stating this so explicitly. There seems to be a
>> >     >     >     > lot of confusion about this, now I can at least point people to
>> >     >     >     > something.
>> >     >     >     >
>> >     >     >     >> Swift can generally be relied on for read-after-write consistency,
>> >     >     >     >> like S3's regions other than the US Standard region.  The reason
>> >     >     >     >> S3 in US Standard doesn't have this guarantee is because it's more
>> >     >     >     >> geographically widespread - something Swift isn't good at yet.  I can
>> >     >     >     >> imagine we'll have the same limitation when we get there.
>> >     >     >     >
>> >     >     >     > Do you mean read-after-create consistency? Because below you say about
>> >     >     >     > read-after-write:
>> >     >     >     >
>> >     >     >     >>> - If I receive a (non-error) response to a PUT request, am I guaranteed
>> >     >     >     >>> that the object will be immediately included in all object listings in
>> >     >     >     >>> every possible situation?
>> >     >     >     >>
>> >     >     >     >> Nope.
>> >     >     >     >
>> >     >     >     > ..so is there such a guarantee for PUTs of *new* objects (like S3 non
>> >     >     >     > us-classic), or does "can generally be relied on" just mean that the
>> >     >     >     > chances for new puts are better?
>> >     >     >     >
>> >     >     >     >> Also like S3, Swift can't make any strong guarantees about
>> >     >     >     >> read-after-update or read-after-delete consistency.  We do have an
>> >     >     >     >> "X-Newest" header that can be added to GETs and HEADs to make the
>> >     >     >     >> proxy do a quorum of backend servers and return the newest available
>> >     >     >     >> version, which greatly improves these, at the cost of latency.
>> >     >     >     >
>> >     >     >     > That sounds very interesting. Could you give some more details on what
>> >     >     >     > exactly is guaranteed when using this header? What happens if the server
>> >     >     >     > having the newest copy is down?
>> >     >     >     >
>> >     >     >     >>> - If the swift server loses an object, will the object name still be
>> >     >     >     >>> returned in object listings? Will attempts to retrieve it result in 404
>> >     >     >     >>> errors (as if it never existed) or a different error?
>> >     >     >     >>
>> >     >     >     >> It will show up in listings, but give a 404 when you attempt to
>> >     >     >     >> retrieve it.  I'm not sure how we can improve that with Swift's
>> >     >     >     >> general model, but feel free to make suggestions.
>> >     >     >     >
>> >     >     >     > From an application programmer's point of view, it would be very helpful
>> >     >     >     > if lost objects could be distinguished from non-existing objects by a
>> >     >     >     > different HTTP error. Trying to access a non-existing object may
>> >     >     >     > indicate a bug in the application, so it would be nice to know when it
>> >     >     >     > happens.
>> >     >     >     >
>> >     >     >     > Also, it would be very helpful if there was a way to list all lost
>> >     >     >     > objects without having to issue HEAD requests for every stored object.
>> >     >     >     > Could this information be added to the XML and JSON output of container
>> >     >     >     > listings? Then an application would have the chance to periodically
>> >     >     >     > check for lost data, rather than having to handle all lost objects at
>> >     >     >     > the instant they're required.
>> >     >     >     >
>> >     >     >     >
>> >     >     >     > I am working on a swift backend for S3QL
>> >     >     >     > (http://code.google.com/p/s3ql/), a program that exposes online cloud
>> >     >     >     > storage as a local UNIX file system. To prevent data corruption, there
>> >     >     >     > are two requirements that I'm currently struggling to provide with the
>> >     >     >     > swift backend:
>> >     >     >     >
>> >     >     >     > - There needs to be a way to reliably check if one object (holding the
>> >     >     >     >   file system metadata) is the newest version.
>> >     >     >     >
>> >     >     >     >   The S3 backend does this by requiring storage in the non us-classic
>> >     >     >     >   regions and using list-after-create consistency with a marker object
>> >     >     >     >   that has a "generation number" of the metadata embedded in its
>> >     >     >     >   name.
>> >     >     >     >
>> >     >     >     >   I'm not yet sure if this would work with swift as well (the google
>> >     >     >     >   storage backend just relies on the strong read-after-write
>> >     >     >     >   consistency).
>> >     >     >     >
>> >     >     >     > - The file system checker needs a way to identify lost objects.
>> >     >     >     >
>> >     >     >     >   Here the S3 backend just relies on the durability guarantee that
>> >     >     >     >   effectively no object will ever be lost.
>> >     >     >     >
>> >     >     >     >   Again, I'm not sure how to implement this for swift.
>> >     >     >     >
>> >     >     >     >
>> >     >     >     > Any suggestions?
>> >     >     >     >
>> >     >     >     >
>> >     >     >     >
>> >     >     >     > Best,
>> >     >     >     >
>> >     >     >     >    -Nikolaus
>> >     >     >     >
>> >     >     >
>> >     >     >
>> >     >     >       -Nikolaus
>> >     >     >
>> >     >     >     --
>> >     >     >      »Time flies like an arrow, fruit flies like a Banana.«
>> >     >     >
>> >     >     >      PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
>> >     >     >
>> >     >     >     _______________________________________________
>> >     >     >     Mailing list: https://launchpad.net/~openstack
>> >     >     >     Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>> >     >     >     Unsubscribe : https://launchpad.net/~openstack
>> >     >     >     More help   : https://help.launchpad.net/ListHelp
>> >     >     >
>> >     >     >
>> >     >
>> >     >
>> >     >       -Nikolaus
>> >     >
>> >     >
>> >     >
>> >
>> >
>> >       -Nikolaus
>> >
>> >
>> >
>>
>>
>>   -Nikolaus
>>
>>
>
>
>
>
