openstack team mailing list archive
Message #06851
Re: Swift Consistency Guarantees?
By default there are 3 replicas.
A PUT Object returns once 2 of the 3 replicas have been written.
So if all nodes are up, at least 2 replicas hold the new version when the PUT returns.
If all replica nodes are down, the GET Object fails.
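
For the case asked about below (the updated nodes are offline and only a stale replica is reachable), here is a toy Python model. It is only a sketch, not Swift code: the Node class and the put/get helpers are made up for illustration, and handoff nodes and background replication are ignored. It assumes a plain GET returns the first copy found on a reachable node, and that X-Newest compares timestamps across the replicas that can be reached.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

REPLICAS = 3                # default replica count
QUORUM = REPLICAS // 2 + 1  # 2 of 3: a PUT reports success at this many writes


@dataclass
class Node:
    """Hypothetical stand-in for one storage node (not a Swift class)."""
    up: bool = True
    # object name -> (timestamp, data)
    objects: Dict[str, Tuple[int, bytes]] = field(default_factory=dict)


def put(nodes: List[Node], name: str, timestamp: int, data: bytes) -> bool:
    """Write to every reachable node; succeed only if a quorum was written."""
    written = 0
    for node in nodes:
        if node.up:
            node.objects[name] = (timestamp, data)
            written += 1
    return written >= QUORUM


def get(nodes: List[Node], name: str, newest: bool = False) -> Optional[bytes]:
    """Return the object's data, or None if no reachable node has a copy.

    newest=False returns the first copy found.
    newest=True compares all reachable copies by timestamp (models X-Newest).
    """
    found = [n.objects[name] for n in nodes if n.up and name in n.objects]
    if not found:
        return None
    if newest:
        return max(found, key=lambda copy: copy[0])[1]
    return found[0][1]


nodes = [Node() for _ in range(REPLICAS)]
put(nodes, "obj", 1, b"v1")        # all three nodes store v1
nodes[0].up = False
ok = put(nodes, "obj", 2, b"v2")   # only nodes 1 and 2 store v2; quorum is met
nodes[0].up, nodes[1].up, nodes[2].up = True, False, False
# Only the stale node is reachable, so both reads return the old data.
print(ok, get(nodes, "obj"), get(nodes, "obj", newest=True))  # True b'v1' b'v1'
```

In this model the reader gets the older version with no error, because nothing reachable knows that a newer copy exists. X-Newest only helps while at least one node holding the newer copy is reachable.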
On Fri, Jan 20, 2012 at 11:21 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
> Hi,
>
> So if an object update has not yet been replicated on all nodes, and all
> nodes that have been updated are offline, what will happen? Will swift
> recognize this and give me an error, or will it silently return the
> older version?
>
> Thanks,
> Nikolaus
>
>
> On 01/20/2012 02:14 PM, Stephen Broeker wrote:
> > If a node is down, then it is ignored.
> > That is the whole point about 3 replicas.
> >
> > On Fri, Jan 20, 2012 at 10:43 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
> >
> > Hi,
> >
> > What happens if one of the nodes is down? Especially if that node holds
> > the newest copy?
> >
> > Thanks,
> > Nikolaus
> >
> > On 01/20/2012 12:33 PM, Stephen Broeker wrote:
> > > The X-Newest header can be used by a GET Operation to ensure that all of the
> > > Storage Nodes (3 by default) are queried for the latest copy of the Object.
> > > The COPY Object operation already has this functionality.
> > >
> > > On Fri, Jan 20, 2012 at 9:12 AM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > No one able to further clarify this?
> > >
> > > Does swift offer read-after-create consistency like
> > > non-us-standard S3? What are the precise syntax and semantics of
> > > the X-Newest header?
> > >
> > > Best,
> > > Nikolaus
> > >
> > >
> > > On 01/18/2012 10:15 AM, Nikolaus Rath wrote:
> > > > Michael Barton <mike-launchpad@xxxxxxxxxxxxxxxx> writes:
> > > >> On Tue, Jan 17, 2012 at 4:55 PM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
> > > >>> Amazon S3 and Google Storage make very explicit (non-)consistency
> > > >>> guarantees for stored objects. I'm looking for similar documentation
> > > >>> about OpenStack's Swift, but haven't had much success.
> > > >>
> > > >> I don't think there's any documentation on this, but it would probably
> > > >> be good to write up. Consistency in Swift is very similar to S3.
> > > >> That is, there aren't many non-eventual consistency guarantees.
> > > >>
> > > >> Listing updates can happen asynchronously (especially under load), and
> > > >> older versions of files can show up in requests (deletes are just a
> > > >> new "deleted" version of the file).
> > > >
> > > > Ah, ok. Thanks a lot for stating this so explicitly. There seems to be a
> > > > lot of confusion about this, now I can at least point people to
> > > > something.
> > > >
> > > >> Swift can generally be relied on for read-after-write consistency,
> > > >> like S3's regions other than the US Standard region. The reason
> > > >> S3 in US Standard doesn't have this guarantee is that it's more
> > > >> geographically widespread - something Swift isn't good at yet. I can
> > > >> imagine we'll have the same limitation when we get there.
> > > >
> > > > Do you mean read-after-create consistency? Because below you say about
> > > > read-after-write:
> > > >
> > > >>> - If I receive a (non-error) response to a PUT request, am I guaranteed
> > > >>> that the object will be immediately included in all object listings in
> > > >>> every possible situation?
> > > >>
> > > >> Nope.
> > > >
> > > > ..so is there such a guarantee for PUTs of *new* objects (like S3 non
> > > > us-classic), or does "can generally be relied on" just mean that the
> > > > chances for new puts are better?
> > > >
> > > >> Also like S3, Swift can't make any strong guarantees about
> > > >> read-after-update or read-after-delete consistency. We do have an
> > > >> "X-Newest" header that can be added to GETs and HEADs to make the
> > > >> proxy query a quorum of backend servers and return the newest available
> > > >> version, which greatly improves these, at the cost of latency.
> > > >
> > > > That sounds very interesting. Could you give some more details on what
> > > > exactly is guaranteed when using this header? What happens if the server
> > > > having the newest copy is down?
> > > >
> > > >>> - If the swift server loses an object, will the object name still be
> > > >>> returned in object listings? Will attempts to retrieve it result in 404
> > > >>> errors (as if it never existed) or a different error?
> > > >>
> > > >> It will show up in listings, but give a 404 when you attempt to
> > > >> retrieve it. I'm not sure how we can improve that with Swift's
> > > >> general model, but feel free to make suggestions.
> > > >
> > > > From an application programmer's point of view, it would be very helpful
> > > > if lost objects could be distinguished from non-existing objects by a
> > > > different HTTP error. Trying to access a non-existing object may
> > > > indicate a bug in the application, so it would be nice to know when it
> > > > happens.
> > > >
> > > > Also, it would be very helpful if there was a way to list all lost
> > > > objects without having to issue HEAD requests for every stored object.
> > > > Could this information be added to the XML and JSON output of container
> > > > listings? Then an application would have the chance to periodically
> > > > check for lost data, rather than having to handle all lost objects at
> > > > the instant they're required.
> > > >
> > > >
> > > > I am working on a swift backend for S3QL
> > > > (http://code.google.com/p/s3ql/), a program that exposes online cloud
> > > > storage as a local UNIX file system. To prevent data corruption, there
> > > > are two requirements that I'm currently struggling to provide with the
> > > > swift backend:
> > > >
> > > > - There needs to be a way to reliably check if one object (holding the
> > > > file system metadata) is the newest version.
> > > >
> > > > The S3 backend does this by requiring storage in the non us-classic
> > > > regions and using list-after-create consistency with a marker object
> > > > that has a "generation number" of the metadata embedded in its
> > > > name.
> > > >
> > > > I'm not yet sure if this would work with swift as well (the google
> > > > storage backend just relies on the strong read-after-write
> > > > consistency).
> > > >
> > > > - The file system checker needs a way to identify lost objects.
> > > >
> > > > Here the S3 backend just relies on the durability guarantee that
> > > > effectively no object will ever be lost.
> > > >
> > > > Again, I'm not sure how to implement this for swift.
> > > >
> > > >
> > > > Any suggestions?
> > > >
> > > >
> > > >
> > > > Best,
> > > >
> > > > -Nikolaus
> > > >
> > >
> > >
> > > -Nikolaus
> > >
> > > --
> > > »Time flies like an arrow, fruit flies like a Banana.«
> > >
> > > PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
> > >
> > > _______________________________________________
> > > Mailing list: https://launchpad.net/~openstack
> > > Post to : openstack@xxxxxxxxxxxxxxxxxxx
> > > Unsubscribe : https://launchpad.net/~openstack
> > > More help : https://help.launchpad.net/ListHelp
> > >
> > >
> >
> >
> > -Nikolaus
> >
> >
> >
>
>
> -Nikolaus
>
>