← Back to team overview

duplicity-team team mailing list archive

Re: [Question #248111]: Duplicity Adds the same file again after full backup

 

Question #248111 on Duplicity changed:
https://answers.launchpad.net/duplicity/+question/248111

    Status: Answered => Open

Rodrigo Alvarez is still having a problem:
Hi Edso,

Thanks for your reply.  Indeed the behavior you describe is what I
expect, so I took your advice and ran with -v9 in an attempt to
reproduce the anomaly with greater visibility.   Overall, it seems that
although on the 1st pass duplicity states that it added a set of files
(~53000), during the second pass it compares them against "None", as if
it had not really added them.   Details of what I did and what I
observed follow.

I clear the backend of all duplicity volumes and signature files

1st Pass
I start a new backup of my ~500GB photo library.  Duplicity recognizes there are no previous backups and starts adding all files. Duplicity runs for about ~14 hrs and finishes with the following statistics: 
--------------[ Backup Statistics ]--------------
StartTime 1399264740.37 (Sun May  4 21:39:00 2014)
EndTime 1399315243.63 (Mon May  5 11:40:43 2014)
ElapsedTime 50503.26 (14 hours 1 minute 43.26 seconds)
SourceFiles 146166
SourceFileSize 534825699131 (498 GB)
NewFiles 146166
NewFileSize 534825699131 (498 GB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 146166
RawDeltaSize 534812567355 (498 GB)
TotalDestinationSizeChange 517724175942 (482 GB)
Errors 0
-------------------------------------------------

2nd Pass
I rerun duplicity, now expecting no incremental data to be added because the source files have not changed, yet in the end duplicity returns the following statistics indicating a new ~183 GB. I see this as a set of incremental volumes on the backend. 

--------------[ Backup Statistics ]--------------
StartTime 1399362663.27 (Tue May  6 00:51:03 2014)
EndTime 1399378166.41 (Tue May  6 05:09:26 2014)
ElapsedTime 15503.14 (4 hours 18 minutes 23.14 seconds)
SourceFiles 146166
SourceFileSize 534825699131 (498 GB)
NewFiles 53631
NewFileSize 196565375491 (183 GB)
DeletedFiles 0
ChangedFiles 4
ChangedFileSize 20463616 (19.5 MB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 53635
RawDeltaSize 196559616928 (183 GB)
TotalDestinationSizeChange 189125153141 (176 GB)
Errors 0

3rd pass
I rerun duplicity one 3rd time, and this time I see the behavior I expected on the second pass, which is not to add any incremental. 
--------------[ Backup Statistics ]--------------
StartTime 1399381492.80 (Tue May  6 06:04:52 2014)
EndTime 1399381602.38 (Tue May  6 06:06:42 2014)
ElapsedTime 109.58 (1 minute 49.58 seconds)
SourceFiles 146166
SourceFileSize 534825699131 (498 GB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
ChangedFiles 4
ChangedFileSize 20463616 (19.5 MB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 4
RawDeltaSize 72022 (70.3 KB)
TotalDestinationSizeChange 4978 (4.86 KB)
Errors 0

Probe 2nd pass logs to see what was added
Now I analyze the log files (written with -v9) to see which files were added on  the 2nd pass. 
$ pass1log=duplicity_home_to_primary_bakup_2014_05_04-213859.log
$ pass2log=duplicity_home_to_primary_bakup_2014_05_06-005101.log
$ pass3log=duplicity_home_to_primary_bakup_2014_05_06-060450.log
$ cat  ${pass2log} | grep "^A " |  head
A Masters/Rolls/2010/10/_MG_0246.CR2
A Masters/Rolls/2010/10/_MG_0247.CR2
A Masters/Rolls/2010/10/_MG_0248.CR2
A Masters/Rolls/2010/10/_MG_0249.CR2
… and another some 53631 more files…

Probe 1st pass logs to see if re-added files where there
I probe the 1st pass log, focusing on the first file that was added on the second pass (even though it was there on the 1st pass).  On the first pass I see that this file was Selected, Compared to None, Got Delta,  Added and that the volume was written. 
$ fileOfInterest="Masters/Rolls/2010/10/_MG_0246.CR2"
$ grep -A 400 "${fileOfInterest}" ${pass1log}
Selecting /home/Photos/Masters/Rolls/2010/10/_MG_0246.CR2
Comparing ('Masters', 'Rolls', '2010', '10', '_MG_0246.CR2') and None
Getting delta of (('Masters', 'Rolls', '2010', '10', '_MG_0246.CR2') /home/Photos/Masters/Rolls/2010/10/_MG_0246.CR2 reg) and None
A Masters/Rolls/2010/10/_MG_0246.CR2
…. A few lines lower I see the volume was sent to be written.
AsyncScheduler: scheduling task for asynchronous execution
AsyncScheduler: active workers = 1
Processed volume 1254
Writing /media/primary_backup/raid_bkup/Photos/duplicity-full.20140505T043900Z.vol1254.difftar.gpg
Registering (mktemp) temporary file /home/Backups/.duplicity_tmp/duplicity-ATyk1O-tempdir/mktemp-wHBzYl-1256
… and a few lines later I see the write succeeded. 
Deleting /home/Backups/.duplicity_tmp/duplicity-ATyk1O-tempdir/mktemp-cK6Ccl-1255
Forgetting temporary file /home/Backups/.duplicity_tmp/duplicity-ATyk1O-tempdir/mktemp-cK6Ccl-1255
AsyncScheduler: task execution done (success: True)
AsyncScheduler: active workers = 0

Probe 2nd pass logs to see why files where re-added
I probe the 2nd pass log the same way and I see that the same file, was Selected, Compared to NONE <- This doesn't make sense! Added, etc.    It seems that for some reason, duplicity thinks this file was NOT added on the 1st pass. 
$ grep -A 5 "${fileOfInterest}" ${pass2log}
Selecting /home/Photos/Masters/Rolls/2010/10/_MG_0246.CR2
Comparing ('Masters', 'Rolls', '2010', '10', '_MG_0246.CR2') and None
Getting delta of (('Masters', 'Rolls', '2010', '10', '_MG_0246.CR2') /home/Photos/Masters/Rolls/2010/10/_MG_0246.CR2 reg) and None
A Masters/Rolls/2010/10/_MG_0246.CR2

Probe 3rd pass log to see why files were not re-added (as expected). 
Finally, I probe the 3rd pass log for the same file and I see that the file was Selected and Compared to its previous version. 
grep -A 5 "${fileOfInterest}" ${pass3log}
Selecting /home/Photos/Masters/Rolls/2010/10/_MG_0246.CR2
Comparing ('Masters', 'Rolls', '2010', '10', '_MG_0246.CR2') and ('Masters', 'Rolls', '2010', '10', '_MG_0246.CR2')

It appears that for some reason duplicity did not add this file (or the
other +53000) thousand files to its record on the first pass (although
it did add data for this.)    Can you think of any reason why this would
be the case?   I'm running duplicity as root, could this be permission
issues?

As you will note, it is somewhat unsatisfying that the 2nd backup
consumes an unnecessary ~183GB of space.   I've uploaded the full logs
for the three runs to
http://sira.duckdns.org/temp//duplicity_logs_photos_bakup_2014_05.tar.gz
in case you want to probe them deeper.

Thank you for your consideration!

-- 
You received this question notification because you are a member of
duplicity-team, which is an answer contact for Duplicity.