openstack team mailing list archive

Re: Swift Consistency Guarantees?

 

The X-Newest header can be set on a GET request to ensure that all of the
storage nodes holding the object (3 by default) are queried for its latest
copy. The COPY object operation already behaves this way.
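
As a rough illustration of the syntax, here is a minimal Python sketch of a
GET request that carries the header. It only builds the request and does not
send it. The storage URL, container, object name and token are made-up
placeholders, the helper name build_newest_get is hypothetical, and the
header value "true" is an assumption about what the proxy expects.

```python
import urllib.parse
import urllib.request


def build_newest_get(storage_url, container, obj, token):
    """Build (but do not send) an object GET that carries X-Newest.

    All arguments are placeholders supplied by the caller; nothing here
    contacts a Swift cluster.
    """
    # Percent-encode the container and object names so spaces etc. are safe.
    path = "/".join(urllib.parse.quote(part, safe="") for part in (container, obj))
    url = storage_url.rstrip("/") + "/" + path

    req = urllib.request.Request(url, method="GET")
    req.add_header("X-Auth-Token", token)
    # Assumed value: ask the proxy to check the replicas for the newest copy.
    req.add_header("X-Newest", "true")
    return req


if __name__ == "__main__":
    request = build_newest_get(
        "https://swift.example.com/v1/AUTH_test",  # hypothetical endpoint
        "s3ql",
        "metadata object",
        "placeholder-token",
    )
    print(request.get_method(), request.full_url)
    for name, value in request.header_items():
        print(name + ":", value)
```

Passing the returned request to urllib.request.urlopen() would issue the GET.
Expect higher latency than a plain GET, since more storage nodes are
consulted before the response comes back.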

On Fri, Jan 20, 2012 at 9:12 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:

> Hi,
>
> No one able to further clarify this?
>
> Does swift offer read-after-create consistency like
> non-us-standard S3? What are the precise syntax and semantics of
> the X-Newest header?
>
> Best,
> Nikolaus
>
>
> On 01/18/2012 10:15 AM, Nikolaus Rath wrote:
> > Michael Barton <mike-launchpad@xxxxxxxxxxxxxxxx> writes:
> >> On Tue, Jan 17, 2012 at 4:55 PM, Nikolaus Rath <Nikolaus@xxxxxxxx>
> wrote:
> >>> Amazon S3 and Google Storage make very explicit (non-) consistency
> >>> guarantees for stored objects. I'm looking for a similar documentation
> >>> about OpenStack's Swift, but haven't had much success.
> >>
> >> I don't think there's any documentation on this, but it would probably
> >> be good to write up.  Consistency in Swift is very similar to S3.
> >> That is, there aren't many non-eventual consistency guarantees.
> >>
> >> Listing updates can happen asynchronously (especially under load), and
> >> older versions of files can show up in requests (deletes are just a
> >> new "deleted" version of the file).
> >
> > Ah, ok. Thanks a lot for stating this so explicitly. There seems to be a
> > lot of confusion about this, now I can at least point people to
> > something.
> >
> >> Swift can generally be relied on for read-after-write consistency,
> >> like S3's regions other than the US Standard region.  The reason
> >> S3 in US Standard doesn't have this guarantee is because it's more
> >> geographically widespread - something Swift isn't good at yet.  I can
> >> imagine we'll have the same limitation when we get there.
> >
> > Do you mean read-after-create consistency? Because below you say about
> > read-after-write:
> >
> >>> - If I receive a (non-error) response to a PUT request, am I guaranteed
> >>> that the object will be immediately included in all object listings in
> >>> every possible situation?
> >>
> >> Nope.
> >
> > ..so is there such a guarantee for PUTs of *new* objects (like S3 non
> > us-classic), or does "can generally be relied on" just mean that the
> > chances for new puts are better?
> >
> >> Also like S3, Swift can't make any strong guarantees about
> >> read-after-update or read-after-delete consistency.  We do have an
> >> "X-Newest" header that can be added to GETs and HEADs to make the
> >> proxy do a quorum of backend servers and return the newest available
> >> version, which greatly improves these, at the cost of latency.
> >
> > That sounds very interesting. Could you give some more details on what
> > exactly is guaranteed when using this header? What happens if the server
> > having the newest copy is down?
> >
> >>> - If the swift server loses an object, will the object name still be
> >>> returned in object listings? Will attempts to retrieve it result in 404
> >>> errors (as if it never existed) or a different error?
> >>
> >> It will show up in listings, but give a 404 when you attempt to
> >> retrieve it.  I'm not sure how we can improve that with Swift's
> >> general model, but feel free to make suggestions.
> >
> > From an application programmer's point of view, it would be very helpful
> > if lost objects could be distinguished from non-existing objects by a
> > different HTTP error. Trying to access a non-existing object may
> > indicate a bug in the application, so it would be nice to know when it
> > happens.
> >
> > Also, it would be very helpful if there was a way to list all lost
> > objects without having to issue HEAD requests for every stored object.
> > Could this information be added to the XML and JSON output of container
> > listings? Then an application would have the chance to periodically
> > check for lost data, rather than having to handle all lost objects at
> > the instant they're required.
> >
> >
> > I am working on a swift backend for S3QL
> > (http://code.google.com/p/s3ql/), a program that exposes online cloud
> > storage as a local UNIX file system. To prevent data corruption, there
> > are two requirements that I'm currently struggling to provide with the
> > swift backend:
> >
> > - There needs to be a way to reliably check if one object (holding the
> >   file system metadata) is the newest version.
> >
> >   The S3 backend does this by requiring storage in the non us-classic
> >   regions and using list-after-create consistency with a marker object
> >   that has a "generation number" of the metadata embedded in its
> >   name.
> >
> >   I'm not yet sure if this would work with swift as well (the google
> >   storage backend just relies on the strong read-after-write
> >   consistency).
> >
> > - The file system checker needs a way to identify lost objects.
> >
> >   Here the S3 backend just relies on the durability guarantee that
> >   effectively no object will ever be lost.
> >
> >   Again, I'm not sure how to implement this for swift.
> >
> >
> > Any suggestions?
> >
> >
> >
> > Best,
> >
> >    -Nikolaus
> >
>
>
>   -Nikolaus
>
> --
>  »Time flies like an arrow, fruit flies like a Banana.«
>
>  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>