
syncany-team team mailing list archive

Re: Suggestion for Syncany


Hello,

I apologize for my late reply, but I had a lot going on over the last few
days. Since I think this might be of interest to other developers, I have CC'd
this mail to the mailing list.

Until now, I was not aware that parity files exist, but I have since read the
Wikipedia article and even tried out the par2 Linux tool. I think this is a
brilliant idea!! If I understand it correctly, it outshines the RAID-1-like
idea by far!

If I have 3 different storages and I'd like to store 10 GB, I would create
parity files for these 10 GB with ~34% redundancy (--> 10 GB + 3.4 GB of parity
files). I would then split the 10 GB into three 3.33 GB parts and upload them to
the three storages (together with one third of the parity files), i.e. each
repository would get 3.33 + 3.4/3 ≈ 4.46 GB of data. If one of the storages
then fails, I can still restore 100% of the data. Did I get that right?
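
To make the arithmetic concrete, here is a quick back-of-the-envelope sketch in
Java (plain arithmetic only, nothing Syncany-specific; the ~34% figure is simply
the one assumed above):

public class ParitySplitExample {

    public static void main(String[] args) {
        double dataGb = 10.0;      // total payload to store
        double redundancy = 0.34;  // ~34% parity, the figure assumed above
        int storages = 3;          // number of independent storages

        double parityGb = dataGb * redundancy;          // ~3.4 GB of parity files
        double dataPerStorage = dataGb / storages;      // ~3.33 GB payload per storage
        double parityPerStorage = parityGb / storages;  // ~1.13 GB parity per storage

        System.out.printf("parity total: %.2f GB%n", parityGb);
        // ≈ 4.46 GB per storage as in the mail above, modulo rounding
        System.out.printf("per storage:  %.2f GB data + %.2f GB parity = %.2f GB%n",
                dataPerStorage, parityPerStorage, dataPerStorage + parityPerStorage);
    }
}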

If this is correct, I believe it has huge potential and should be added to the
blueprint list for future ideas. It would still have to be figured out how to
implement this exactly at the chunk level...
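
Not a full answer, but as a thought experiment: the simplest chunk-level variant
would be RAID-5-style XOR parity, i.e. one parity chunk per group of data chunks,
which survives the loss of any single chunk in a group. Real PAR2 uses Reed-Solomon
codes to make the redundancy level adjustable, so the following is only a minimal
sketch, and all class and method names are made up for illustration (not Syncany API):

import java.util.List;

// Minimal sketch: one XOR parity chunk per group of data chunks (RAID-5 style).
// All chunks are assumed to be padded to the same size. This survives the loss
// of only one chunk per group; a Reed-Solomon erasure code would be needed to
// make the redundancy level adjustable as described above.
public class XorParitySketch {

    // XOR of all data chunks; storing this alongside the group lets us rebuild
    // any single missing chunk later.
    public static byte[] buildParityChunk(List<byte[]> dataChunks, int chunkSize) {
        byte[] parity = new byte[chunkSize];
        for (byte[] chunk : dataChunks) {
            for (int i = 0; i < chunk.length; i++) {
                parity[i] ^= chunk[i];
            }
        }
        return parity;
    }

    // Rebuild the one missing chunk from the parity chunk and the surviving chunks.
    public static byte[] recoverMissingChunk(List<byte[]> survivingChunks, byte[] parityChunk) {
        byte[] missing = parityChunk.clone();
        for (byte[] chunk : survivingChunks) {
            for (int i = 0; i < chunk.length; i++) {
                missing[i] ^= chunk[i];
            }
        }
        return missing;
    }
}

With something like this per group of chunks, the number of parity chunks per group
would be the knob that controls how much of a repository may be lost, similar to the
25%/34% figures above, but that really needs a proper erasure code rather than plain XOR.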

Any ideas on this from the rest of you?

Cheers,
Philipp


On Wed, Jun 1, 2011 at 6:00 PM, David TAILLANDIER <
david.taillandier@xxxxxxxxxxxxxxx> wrote:

>
> Dear Mr Heckel,
>
> A very famous French Linux website posted some news about your nice
> software:
> http://linuxfr.org/news/syncany-une-alternative-libre-%C3%A0-dropbox-avec-bien-plus-de-fonctionnalit%C3%A9s
> Trolls aside, Syncany seems to be very welcome.
>
> I would like to suggest something that I think could be a good idea to improve
> Syncany. I am not posting it on the mailing list because I don't know if that is
> welcome. Feel free to redirect my email to the mailing list, or destroy it,
> or whatever.
>
> For external storage, most users need some kind of protection against hardware
> failure, storage provider bankruptcy, etc. This is also true for internal
> storage, although there the problems are essentially hardware-related.
>
> A common protection is to replicate data onto several disks or computers,
> as in RAID 1 or RAID 5, RAID 6, etc. With high bandwidth and inexpensive disks,
> this is fine.
> But over the Internet, bandwidth is not that high, nor is storage space
> that inexpensive.
>
> What about using some adjustable "parity" algorithm, as with the .PAR files used
> on newsgroups?
> The principle is simple: if you want to be able to lose 25% of your storage space
> without losing data, you need to add 25% of parity data to the files.
>
> Say I have 1000 files, which add up to 1 GB.
> Say I don't want to lose any data even if 25% of the total storage
> vanishes.
> Say I have 4 different storage providers (or more, but not fewer for 25%).
> So I have to convert those 1000 files into .PAR files with 25% overhead. Then cut
> them into 4 parts (because there are 4 storage providers). Then upload one quarter
> to each provider. Tedious.
> Less tedious would be to tar those files before generating the .PAR files, but
> then it is less handy to add/remove/change a single file.
> Even less tedious is to have software which does it.
>
> With the right algorithm (easy to say, hard to do), having RAID 1 is just
> adjusting the acceptable loss to 50%. Having RAID 5 is adjusting it to 33.333%.
> And having RAID 0 is 0%.
>
> In my previous job I used this kind of thing to back up 70+ virtual disks from
> 65 different locations (3.5 TB, with 7+5+14 days of history, so "virtually" 91 TB).
> Each location saved its data to the other locations. I used a home-made script
> for that, and torrents to dispatch the data. Very efficient.
> The accepted data loss was set to 10%, so we could lose up to 7
> locations. The cost was only a 10% growth in file size.
> This is an extreme case, just an example I know very well.
>
> I think this could be some sort of killer feature.
> Or not.
>
> If you are interested in this feature, I offer my not-so-helpful hand.
> I'm not a real programmer, just a sysadmin and a user.
>
>
