
syncany-team team mailing list archive

Re: Database reconciliation

 

Hi Gregor,

On 01/12/2013 22:50, Gregor Trefs wrote:
> From my point of view, your comments are valid and raise a legitimate
> point of criticism. I also agree with your resolution technique for
> merging two conflicting database versions. However, I am not
> sure about the means (i.e. user involvement) of reaching such a
> reconciled database version.

I was unclear about this point, see my answer to Philipp (next email ;-).

> Under the assumption that concurrent actions by clients A and B resulted
> in two causally independent database versions A1 and B1, there still
> exists a common ancestor database revision R0. Further, let us assume
> a function diff that maps a pair of database revisions to the set of
> differing multichunk hashes. If diff(A1,R0) and diff(B1,R0) are
> disjoint, then client C is able to resolve the conflict by itself by
> merging both changes diff(A1,R0) and diff(B1,R0) into a new common
> revision R1.

Absolutely, that's what I'm counting on for most concurrent updates.
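
To make the disjointness test concrete, here is a minimal sketch. It assumes a database revision can be modelled as a plain set of multichunk hashes, which is a simplification for illustration and not how syncany stores its database; the names `diff` and `try_merge` are hypothetical.

```python
from typing import Optional, Set


def diff(revision: Set[str], ancestor: Set[str]) -> Set[str]:
    """Multichunk hashes that differ between a revision and its ancestor.

    Assumption: a revision is just a set of hashes, so the difference is
    the symmetric difference (hashes added or removed).
    """
    return revision ^ ancestor


def try_merge(a1: Set[str], b1: Set[str], r0: Set[str]) -> Optional[Set[str]]:
    """Merge A1 and B1 into R1 if their changes against R0 are disjoint.

    Returns None when the change sets overlap, i.e. when the conflict
    cannot be resolved automatically under this rule.
    """
    changes_a = diff(a1, r0)
    changes_b = diff(b1, r0)
    if changes_a & changes_b:
        return None
    # Applying a change set toggles the membership of each hash in it.
    return r0 ^ changes_a ^ changes_b
```

With R0 = {m1, m2}, A adding m3 and B removing m2, the two change sets are {m3} and {m2}, so the merge yields {m1, m3}. If both clients touch m2, the sketch gives up and leaves the decision to some other tie-breaking rule.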

> After that C can apply its own changes if diff(A1,R0) is disjoint from
> diff(A1,R0) union diff(B1,R0).

I'm lost here.

> That is just a different technique to break the tie. Human input
> should be required if and only if not enough information is
> given or can be inferred to resolve a conflict (convenience ;)).

Right. This is actually what is done at the file level in syncany.

> Further, deleting conflicting database revisions would lead clients to
> inconsistent assumptions about the remote data. For
> example, client A's local database reflects the state of A1. The
> information about the deletion of this revision is not delivered to
> A. Thus, A assumes the remote data to have/had a state which
> practically never existed. I think conflicting revisions should be
> seen as some kind of branches and should remain on the remote side.
> Thus, if A asks for updates, it receives the information that R1 and
> C1 are causal successors of A1. Do you already have an idea for
> conflict resolution of binary data?

As pointed out by Philipp in his answer, as a loser A knows it has to
reverse its changes (hence the dirty database concept), at least in the
current scheme. In my proposal, A will need to detect the conflicts in
the history which seems to be difficult without keeping the branches. So
deleting the branches might be the responsibility of the last conflict
generator. Thanks for pointing that out!

> In my applications I have only ever used optimistic locking and never
> dived into this topic beyond that. But the approach you describe seems very
> interesting. The only question which comes to my mind is what to
> do in case the winner never finishes due to, for example, a
> crash?

Well, that's the main issue and, as far as I understand the scientific
literature, we are stuck (because we don't have the primitives needed to
do that properly, such as read-update-write). Technically, the winner
scheme I'm proposing is related to optimistic locking in the sense that
you don't wait until you hold the lock before proceeding to upload the
files. But in the end, prior to committing the changes, someone holds the
lock. Failure to release it will block the system. So I would resort to a
wait-and-retry technique, combined with a kill command. I think
monitoring the progress of the winner is doable (it could put in the
lock file the list of the files it plans to upload, for instance). Then
blocked parties can query the storage and, if needed, steal the lock. In
any case, this is going to be fragile. Any other ideas?
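
A rough sketch of the wait-and-retry idea with lock stealing follows. Everything in it is hypothetical: `Storage` is an in-memory stand-in for the remote storage, `Lock` plays the role of the lock file listing the planned uploads, and `acquire` is the blocked party's loop. On a real backend without atomic primitives the steal itself is racy, which is exactly the fragility mentioned above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Lock:
    owner: str
    planned_files: List[str]  # files the winner announced it will upload


@dataclass
class Storage:
    """In-memory stand-in for the remote storage (hypothetical)."""
    lock: Optional[Lock] = None
    files: Set[str] = field(default_factory=set)


def progress(storage: Storage) -> int:
    """Number of planned files the current lock owner has uploaded."""
    if storage.lock is None:
        return 0
    return sum(1 for f in storage.lock.planned_files if f in storage.files)


def acquire(
    storage: Storage,
    client: str,
    planned_files: List[str],
    max_stalled_polls: int = 3,
    on_wait: Callable[[], None] = lambda: None,
) -> bool:
    """Wait for the lock, stealing it if the owner stops making progress.

    Returns True if the lock had to be stolen, False if it was free or
    was released normally. `on_wait` runs between polls; real code would
    sleep there.
    """
    stalled = 0
    last_seen = -1
    stolen = False
    while storage.lock is not None:
        current = progress(storage)
        if current > last_seen:
            # The winner uploaded something since the last poll.
            stalled = 0
            last_seen = current
        else:
            stalled += 1
        if stalled >= max_stalled_polls:
            # The "kill": the winner is presumed crashed.
            storage.lock = None
            stolen = True
            break
        on_wait()
    # Not atomic on a real backend: two blocked parties could both get here.
    storage.lock = Lock(client, list(planned_files))
    return stolen
```

Because progress is bounded by the number of planned files, a winner that never releases the lock eventually counts as stalled, so the loop terminates either on release or on a steal.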


Thanks for your input,

Best regards,

Fabrice

