duplicity-team team mailing list archive
Re: All Merged
(I think you meant to CC the list? Adding CC to response.)
> As to the archive-dir, it was a nice optimization, but it's been a real
> support nightmare. I don't ever want to do it that way again. We need
> a directory for persistence when a backup fails, and for any persistent
> data, such as keeping configuration of named backups, etc.
Yes; something like:
~/.duplicity/<backup_name>/cache - cache files, removable at any time
~/.duplicity/<backup_name>/config - backup profile configs, etc
~/.duplicity/<backup_name>/checkpoints - checkpoint info
Or similar.
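The layout above could be sketched like this; the directory names come from the proposal, but the helper itself is purely hypothetical and not part of duplicity:

```python
# Hypothetical helper building the proposed per-backup layout:
# ~/.duplicity/<backup_name>/{cache,config,checkpoints}
import os

def profile_dirs(backup_name, root=None):
    """Return (and create) the cache/config/checkpoints dirs for a backup."""
    root = root or os.path.expanduser("~/.duplicity")
    dirs = {}
    for kind in ("cache", "config", "checkpoints"):
        path = os.path.join(root, backup_name, kind)
        os.makedirs(path, exist_ok=True)
        dirs[kind] = path
    return dirs
```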
> You should just be able to use the backend.py ParsedURL and take it from
> pu.netloc.
I ended up augmenting the backend module with an is_backend_url()
alongside get_backend(), to avoid instantiating backends just for the
test. It's slightly ugly due to the duplicate ParsedUrl creation.
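Roughly the idea, as a standalone sketch; the scheme registry and stdlib urlparse here stand in for duplicity's actual backend table and ParsedUrl, so treat the details as assumptions:

```python
# Sketch of an is_backend_url() check that avoids instantiating a backend.
from urllib.parse import urlparse

# Illustrative scheme-to-backend registry (the real module keeps its own).
_backends = {"file": object, "ftp": object, "scp": object, "s3": object}

def is_backend_url(url_string):
    """Return True if url_string has a scheme some backend handles,
    without constructing the backend object itself."""
    pu = urlparse(url_string)  # stands in for duplicity's ParsedUrl
    return pu.scheme in _backends
```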
> > The reason I ask is that I realized that most people, even though in
> > reality it's not such a great idea, do backups on live file systems,
> > particularly (for obvious reasons) on platforms where file system
> > snapshots are not trivially attainable. If the results of accidentally
> > doing a restart on a live file system are much worse than a regular
> > pause, it can be considered a bit dangerous to enable
> > checkpoint/restart at all except when explicitly enabled by the user.
>
> Hmmm, will have to think about that one. I tend to think of backups
> only when the filesystem is quiescent, but that's just me.
I do too, but in practice I know for a fact that most people simply
don't. Backing up a live file system is standard practice,
unfortunately. I've had to work pretty hard to convince people it's a
bad idea, even among "techies".
(Case in point, witness the recent "oops we didn't realize a live
tarball of a mysql data directory wasn't safe" disaster(s) of public
site(s)...)
> Part of the problem is that active file systems, especially in Linux,
> don't always tell you that the file underneath has changed, so if you
> backup on an active system, you take your chances.
What rdiff-backup does, if I remember correctly, is compare mtime
before and after the file is read, and roll back if the file
changed. So, barring strange mtime behavior, you always get either a
consistent copy of that particular file (of course NOT of a directory
tree) or no file at all (in that increment).
But it's definitely a designed-for case that gets logged.
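The check-and-rollback idea described above can be sketched as follows; this is not rdiff-backup's actual code, and the function name is illustrative:

```python
# Sketch: copy a file only if its mtime is unchanged after the read,
# otherwise roll back, so the destination never gets a torn copy.
import os
import shutil
import tempfile

def copy_if_stable(src, dst):
    """Copy src to dst; return False (leaving dst untouched) if src's
    mtime changed while we were reading it."""
    before = os.stat(src).st_mtime
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dst) or ".")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        after = os.stat(src).st_mtime
        if before != after:       # file changed under us: roll back
            os.unlink(tmp)
            return False
        os.replace(tmp, dst)      # atomic commit of the consistent copy
        return True
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

As noted above, this only gives per-file consistency; a directory tree copied this way can still mix files from different points in time.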
> I cannot imagine
> restart causing any problems for the filesystem, but if you're running
> on cron and not checking the logs, then a failure on Monday may lead to
> a restart on Tuesday.
I mostly meant it in the sense of what the end results are internally
in duplicity; for example, whether resumption after modification could
cause duplicity to emit a broken/corrupt backup because the
already-written tar data is out of sync with the would-be position in
the tar stream (again, sorry, I haven't looked at how restart really
works exactly).
> Assuming that completes, you're mostly protected,
> but part of your backup will be from Monday and part from Tuesday.
That part is fine and expected; you can't really demand anything else
if you do this on live file systems...
--
/ Peter Schuller
PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@xxxxxxxxxxxx>'
Key retrieval: Send an E-Mail to getpgpkey@xxxxxxxxx
E-Mail: peter.schuller@xxxxxxxxxxxx Web: http://www.scode.org