
syncany-team team mailing list archive

Re: Suggestion for Syncany

 

Seems like an excellent idea! The real question is how much redundancy are
you willing to trade for recovery possibilities (this could even be
configurable). There are a lot of "brittle" storage backends that, when
enhanced by some parity, could produce a relatively solid backend store.

The real complication with such systems is always the failover scenarios.
How do we handle a "degraded" storage array? At that point there is no
redundancy; does the system need to work quickly to rebuild the redundancy
within the remaining storage backends? Do we alert the user to the state of
their array and leave it at that?

I really like the idea and would love to help, but I don't want to take on
more than we can handle. It might be useful to create a blueprint for this
and start creating milestones within the trunk series for the things that
are the priority. For now, it's getting syncing working. Later, it might be
parity. After that, who knows? But at least it would give all the developers
on the project some direction as to what the focus of development is at
that moment.

Great idea David!

Thanks,
Stefan

On Fri, Jun 3, 2011 at 3:44 PM, Philipp Heckel <philipp.heckel@xxxxxxxxx> wrote:

> Hello,
>
> I apologize for my late e-mail but I had a lot going on in the last few
> days. Since I think this might be of interest to other developers, I CC'd
> this mail to the mailing list.
>
> Up until now, I was not aware that parity files exist, but I have now read
> the Wikipedia article and even tried out the par2 Linux tool. I think this is a
> brilliant idea!! If I understand it correctly, this outshines the
> RAID1-like idea by far!
>
> If I have 3 different storages and I'd like to store 10 GB, I would create
> parity files for these 10 GB with ~34% redundancy (--> 10 GB + 3.4 GB of parity
> files). I would then split the 10 GB into three 3.33 GB parts and upload them
> to the three storages (together with one third of the parity files), i.e. each
> repository would get 3.33 + 3.4/3 = 4.46 GB of data. If one of the
> storages now fails, I can still restore 100% of the data. Did I get that right?
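As a quick sanity check on the arithmetic above (a sketch, assuming par2-style recovery blocks that can each repair one missing block, and parity split evenly across the stores): losing a store also takes a share of the parity with it, so the surviving parity must still cover the lost data share.

```python
def min_redundancy(n_stores: int, lost_stores: int = 1) -> float:
    """Parity fraction (relative to the data size) needed so that losing
    `lost_stores` of `n_stores` evenly filled stores is still recoverable,
    assuming data and parity are both split evenly across stores and each
    parity block can repair one missing block (par2-style)."""
    surviving = n_stores - lost_stores
    # Surviving parity must cover the lost data:
    # (surviving/n) * r * D >= (lost/n) * D   =>   r >= lost / surviving
    return lost_stores / surviving

# Three stores, one allowed to fail:
print(min_redundancy(3))   # 0.5 -> 50% parity, a bit more than one third
print(min_redundancy(4))   # one third for four stores
```

If the parity files were instead replicated to every store rather than split, ~34% would indeed suffice for three stores; the difference is only where the recovery blocks live.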
>
> If this is correct, I believe it has huge potential and should be added
> to the blueprint list for future ideas. We would still have to figure out
> how to implement this exactly at the chunk level...
>
> Any ideas on this by the rest of you?
>
> Cheers,
> Philipp
>
>
> On Wed, Jun 1, 2011 at 6:00 PM, David TAILLANDIER <
> david.taillandier@xxxxxxxxxxxxxxx> wrote:
>
>>
>> Dear Mr Heckel,
>>
>> A very famous French Linux website posted some news about your nice
>> software:
>> http://linuxfr.org/news/syncany-une-alternative-libre-%C3%A0-dropbox-avec-bien-plus-de-fonctionnalit%C3%A9s
>> Trolls aside, Syncany seems to be very welcome.
>>
>> I dare to suggest something I think could be a good idea to improve
>> Syncany. I am not posting it on the mailing list because I don't know if
>> that is welcome. Feel free to redirect my email to the mailing list, or
>> delete it, or whatever.
>>
>> For external storage, most users need some kind of protection against
>> hardware failure, storage provider bankruptcy, etc. This is also true for
>> internal storage, although there the sources of problems are essentially
>> hardware.
>>
>> A common protection is to replicate data onto several disks or computers,
>> as with RAID 1 or RAID 5, RAID 6, etc. With big bandwidth and inexpensive
>> disks, this is okay.
>> But over the Internet, we have neither such big bandwidth nor such
>> inexpensive storage space.
>>
>> What about using some adjustable "parity" algorithm, as with the .PAR
>> files used on newsgroups?
>> The principle is simple: if you want to be able to lose 25% of your
>> storage space without losing data, you add 25% of parity data to the files.
>>
>> Say I have 1000 files, which add up to 1 GB.
>> Say I don't want to lose any data even if 25% of the total storage
>> vanishes.
>> Say I have 4 different storage providers (or more, but not fewer for 25%).
>> So I have to convert those 1000 files into .PAR with 25% overhead, then
>> cut them into 4 (because of the 4 storage providers), then upload one
>> quarter to each provider. Tedious.
>> Less tedious would be to tar those files before generating the .PAR, but
>> then it is less handy to add/remove/etc. a single file.
>> Even less tedious is to have software which does it.
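The principle can be sketched with the simplest possible parity code, a single XOR share (a toy stand-in for .PAR; real par2 uses Reed-Solomon coding, but the recovery idea is the same):

```python
# Toy single-parity scheme: 3 data shares + 1 XOR parity share spread
# over 4 providers; any single provider may vanish and we still recover.
def make_shares(data, k=3):
    size = -(-len(data) // k)                  # ceil(len/k)
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(k)]
    parity = bytes(a ^ b ^ c for a, b, c in zip(*chunks))
    return chunks + [parity]                   # 4 shares, one per provider

def recover(shares):
    missing = shares.index(None)               # which provider disappeared
    present = [s for s in shares if s is not None]
    # XOR of the three surviving shares rebuilds the missing one,
    # because parity = c0 ^ c1 ^ c2.
    shares[missing] = bytes(x ^ y ^ z for x, y, z in zip(*present))
    return shares

shares = make_shares(b"hello parity world!!")
shares[1] = None                               # one provider goes bankrupt
restored = recover(shares)
print(b"".join(restored[:3]).rstrip(b"\0"))    # b'hello parity world!!'
```

Stripping the zero-padding at the end is a toy shortcut; a real implementation would store the original length alongside the shares.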
>>
>> With the right algorithm (easy to say, not to do), having RAID 1 is just
>> adjusting the acceptable loss to 50%. Having RAID 5 is adjusting it to
>> 33.333%. And having RAID 0 is 0%.
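The analogy above comes down to a single ratio, tolerated failures over total units (a sketch; real RAID levels differ in layout, but the loss fraction is the comparable number):

```python
def acceptable_loss(total_units, tolerated_failures):
    """Fraction of the storage that may vanish without losing data."""
    return tolerated_failures / total_units

print(acceptable_loss(2, 1))   # RAID 1, two mirrored disks -> 0.5
print(acceptable_loss(3, 1))   # RAID 5, three disks        -> 0.3333...
print(acceptable_loss(4, 0))   # RAID 0, no redundancy      -> 0.0
```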
>>
>> In my previous job I used this kind of stuff to save 70+ virtual disks
>> from 65 different locations (3.5 TB, with 7+5+14 days of history, so
>> "virtually" 91 TB). Each location saved its data into the other locations.
>> I used a home-made script for that, and torrents to dispatch the data.
>> Very efficient.
>> The accepted data loss was adjusted to 10%, so we could lose up to 7
>> locations. The cost was only a 10% growth in file size.
>> This is an extreme case, just an example I know very well.
>>
>> I think this could be some sort of killer feature.
>> Or not.
>>
>> If you are interested in this feature, I offer my not-so-helpful hand.
>> I'm not a real programmer, just a sysadmin and a user.
>>
>>
>
>
>
> --
> Mailing list: https://launchpad.net/~syncany-team
> Post to     : syncany-team@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~syncany-team
> More help   : https://help.launchpad.net/ListHelp
>
>
