
syncany-team team mailing list archive

Re: Suggestion for Syncany



I apologize for my late e-mail, but I had a lot going on in the last few
days. Since I think this might be of interest to other developers, I have CC'd
this mail to the mailing list.

Until now, I was not aware that parity files exist, but I have now read the
Wikipedia article and even tried out the par2 Linux tool. I think this is a
brilliant idea! If I understand it correctly, this outshines the
RAID-1-like idea by far!

If I have 3 different storages and want to store 10 GB, I would create
parity files for these 10 GB with ~34% redundancy (i.e. 10 GB of data plus
3.4 GB of parity files). I would then split the 10 GB into 3.33 GB parts and
upload them to the three storages (together with one third of the parity
files), i.e. each repository would get 3.33 + 3.4/3 ≈ 4.46 GB of data. If one
of the storages fails, I can still restore 100% of the data. Did I get that right?
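Just to make the numbers concrete, here is a rough Python sketch of the
calculation above (nothing Syncany-specific; the function name is mine):

    # Rough sketch of the storage math from the example above. Assumes the
    # parity files are spread evenly across the storages, like the data itself.
    def per_storage_size(data_gb, redundancy, num_storages):
        """Return (parity size, upload size per storage) in GB."""
        parity_gb = data_gb * redundancy
        upload_per_storage_gb = (data_gb + parity_gb) / num_storages
        return parity_gb, upload_per_storage_gb

    parity, per_storage = per_storage_size(10.0, 0.34, 3)
    print("parity: %.2f GB, per storage: %.2f GB" % (parity, per_storage))
    # -> parity: 3.40 GB, per storage: 4.47 GB
    # (the 4.46 above comes from rounding 10/3 to 3.33 before adding)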

If this is correct, I believe it has huge potential and should be added to
the blueprint list for future ideas. It would still have to be figured out how
exactly to implement this on the chunk level...
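Just as a thought experiment at the chunk level (much simpler than what par2
really does, which is Reed-Solomon based): one could group chunks and store a
parity chunk per group, so that any single missing chunk of a group can be
rebuilt. A minimal, purely illustrative Python sketch, assuming equally sized
chunks:

    # NOT how par2 works internally; just the simplest possible parity
    # (XOR over a group of equally sized chunks, RAID-5 style). A real
    # Reed-Solomon scheme could recover several missing chunks per group.
    def xor_parity(chunks):
        parity = bytearray(len(chunks[0]))
        for chunk in chunks:
            for i, byte in enumerate(chunk):
                parity[i] ^= byte
        return bytes(parity)

    def recover_missing(remaining_chunks, parity):
        # XOR of all remaining chunks and the parity gives back the missing one.
        return xor_parity(remaining_chunks + [parity])

    group = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_parity(group)
    assert recover_missing([group[0], group[2]], parity) == group[1]

The open questions would be how to pick the group size, where to store the
parity chunks, and how to keep them up to date when chunks change.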

Any ideas on this by the rest of you?


On Wed, Jun 1, 2011 at 6:00 PM, David TAILLANDIER <
david.taillandier@xxxxxxxxxxxxxxx> wrote:

> Dear Mr Heckel,
> A very famous French Linux website posted some news about your nice
> software:
> http://linuxfr.org/news/syncany-une-alternative-libre-%C3%A0-dropbox-avec-bien-plus-de-fonctionnalit%C3%A9s
> Trolls aside, Syncany seems to be very welcome.
> I dare to suggest something that I think could be a good way to improve
> Syncany. I am not posting it on the mailing list because I don't know if that is
> welcome. Feel free to redirect my e-mail to the mailing list, destroy it,
> or whatever.
> For external storage, most users need some kind of protection against hardware
> failure, storage provider bankruptcy, etc. This is also true for internal
> storage, although there the sources of problems are essentially hardware.
> A common protection is to replicate data onto several disks or computers,
> as with RAID 1, RAID 5, RAID 6, etc. With big bandwidth and inexpensive disks,
> this is okay.
> But over the Internet, we don't have such big bandwidth, nor such inexpensive
> storage space.
> What about using some adjustable "parity" algorithm, as with the .PAR files
> used on newsgroups?
> The principle is simple: if you want to be able to lose 25% of your storage
> space without losing data, you need to add 25% of parity data to the files.
> Say I have 1000 files, which add up to 1 GB.
> Say I don't want to lose any data even if 25% of the total storage
> vanishes.
> Say I have 4 different storage providers (or more, but not fewer for 25%).
> Then I have to convert those 1000 files into .PAR files with 25% overhead, cut
> them into 4 parts (because of the 4 storage providers), and upload one quarter
> to each provider. Tedious.
> Less tedious would be to tar those files before generating the .PAR files, but
> then it is less handy to add/remove/etc. a single file.
> Even less tedious is to have software which does it.
> With the right algorithm (easy to say, not to do), having RAID 1 is just
> adjusting the acceptable loss to 50%. Having RAID 5 is adjusting it to 33.333%,
> and having RAID 0 is 0%.
> In my previous job I used this kind of stuff to back up 70+ virtual disks from
> 65 different locations (3.5 TB, with 7+5+14 days of history, so "virtually" 91 TB).
> Each location saved its data into the other locations. I used a home-made script
> for that, and torrents to dispatch the data. Very efficient.
> The accepted data loss was adjusted to 10%, so we could lose up to 7
> locations. The cost was only a 10% growth in file size.
> This is an extreme case, just an example I know very well.
> I think this could be some sort of killer feature.
> Or not.
> If you are interested in this feature, I offer my not-so-helpful hand.
> I'm not a real programmer, just a sysadmin and a user.
