duplicity-team team mailing list archive

Thread
Date
Re: [Merge] lp:~mgorse/duplicity/0.8-series into lp:duplicity

To: mp+359864@xxxxxxxxxxxxxxxxxx
From: Aaron Whitehouse <lists@xxxxxxxxxxxxxxxxxx>
Date: Tue, 04 Dec 2018 12:35:33 -0000
In-reply-to: <154351826038.6630.12598226748734772334.launchpad@ackee.canonical.com>
Reply-to: mp+359864@xxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx
Many thanks for this Mike. This looks like great work and has been on my todo list for far too long.

I had a quick flick over the first 5k lines of the diff on here and made a couple of comments. 

==

@Ken: 

I cannot actually play with the code now, so would like a bit of time to look into the testing side etc before merge. Mainly I just want to check the testing is all working as it should and the tests are passing.

Assuming the testing is working and passing, there are quite a lot of backends etc that do not have enough automated tests to catch regressions in the Python 3 changes, so I think we should do some decent alpha/beta testing within our dev group before release (on both Python 2 and 3). As we have not yet done a 0.8 series release, I am happy for it to go into the 0.8 trunk and do the testing on that branch -- or we can do it in this/a separate branch if you want to do a 0.8 release before then. 

Diff comments:

> 
> === modified file 'duplicity/__init__.py'
> --- duplicity/__init__.py	2018-10-10 20:25:04 +0000
> +++ duplicity/__init__.py	2018-11-29 19:12:48 +0000
> @@ -19,5 +19,9 @@
>  # along with duplicity; if not, write to the Free Software Foundation,
>  # Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>  
> +import sys
>  import gettext
> -gettext.install(u'duplicity', unicode=True, names=[u'ngettext'])
> +if sys.version_info.major >= 3:
> +    gettext.install(u'duplicity', names=[u'ngettext'])
> +else:
> +    gettext.install(u'duplicity', names=[u'ngettext'])

Is it a mistake that both of these are the same (or am I going blind)?

> 
> === modified file 'duplicity/diffdir.py'
> --- duplicity/diffdir.py	2018-10-16 20:56:54 +0000
> +++ duplicity/diffdir.py	2018-11-29 19:12:48 +0000
> @@ -126,8 +135,11 @@
>          Callback activated when FileWithSignature read to end
>          """
>          ti.size = len(sig_string)
> -        ti.name = b"signature/" + b"/".join(index)
> -        sigTarFile.addfile(ti, cStringIO.StringIO(sig_string))
> +        if sys.version_info.major >= 3:

Flagging this for myself more than anyone else, as I do not know diffdir well and need to look at it a bit more when I have a minute. I cannot immediately see why there is so much encoding/manipulation etc of ti.name conditional on Python version. Normally in the code paths are path objects and uc_name or name etc are used (with 'surrogateescape' etc).

> +            ti.name = u"signature/" + util.fsdecode(b"/".join(index))
> +        else:
> +            ti.name = b"signature/" + b"/".join(index)
> +        sigTarFile.addfile(ti, io.BytesIO(sig_string))
>  
>      if new_path.isreg() and sig_path and sig_path.isreg() and sig_path.difftype == u"signature":
>          delta_path.difftype = u"diff"
> @@ -481,7 +502,7 @@
>          Make tarblock out of tarinfo and file data
>          """
>          tarinfo.size = len(file_data)
> -        headers = tarinfo.tobuf(errors=u'replace')
> +        headers = tarinfo.tobuf(errors=u'replace', encoding=globals.fsencoding)

Any reason this cannot use util.fsencode?

>          blocks, remainder = divmod(tarinfo.size, tarfile.BLOCKSIZE)  # @UnusedVariable
>          if remainder > 0:
>              filler_data = b"\0" * (tarfile.BLOCKSIZE - remainder)
> 
> === modified file 'duplicity/patchdir.py'
> --- duplicity/patchdir.py	2018-09-24 17:02:42 +0000
> +++ duplicity/patchdir.py	2018-11-29 19:12:48 +0000
> @@ -149,9 +154,11 @@
>  
>  def get_index_from_tarinfo(tarinfo):
>      u"""Return (index, difftype, multivol) pair from tarinfo object"""
> -    for prefix in [b"snapshot/", b"diff/", b"deleted/",
> -                   b"multivol_diff/", b"multivol_snapshot/"]:
> +    for prefix in [u"snapshot/", u"diff/", u"deleted/",
> +                   u"multivol_diff/", u"multivol_snapshot/"]:
>          tiname = util.get_tarinfo_name(tarinfo)
> +        if sys.version_info.major == 2 and isinstance(prefix, unicode):
> +            prefix = prefix.encode()

Why we are wanting to work with UTF-8 internally here rather than unicode?

>          if tiname.startswith(prefix):
>              name = tiname[len(prefix):]  # strip prefix
>              if prefix.startswith(u"multivol"):
> 
> === modified file 'duplicity/selection.py'
> --- duplicity/selection.py	2018-10-07 12:01:41 +0000
> +++ duplicity/selection.py	2018-11-29 19:12:48 +0000
> @@ -241,10 +248,10 @@
>          filelists_index = 0
>          try:
>              for opt, arg in argtuples:
> -                assert isinstance(opt, unicode), u"option " + opt.decode(sys.getfilesystemencoding(), u"ignore") + \
> -                                                 u" is not unicode"
> -                assert isinstance(arg, unicode), u"option " + arg.decode(sys.getfilesystemencoding(), u"ignore") + \
> -                                                 u" is not unicode"
> +                assert isinstance(opt, str), u"option " + opt.decode(sys.getfilesystemencoding(), u"ignore") + \

This should probably use util.fsdecode -- not your change, but we should fix.

> +                                             u" is not unicode"
> +                assert isinstance(arg, str), u"option " + arg.decode(sys.getfilesystemencoding(), u"ignore") + \
> +                                             u" is not unicode"
>  
>                  if opt == u"--exclude":
>                      self.add_selection_func(self.glob_get_sf(arg, 0))


-- 
https://code.launchpad.net/~mgorse/duplicity/0.8-series/+merge/359864
Your team duplicity-team is requested to review the proposed merge of lp:~mgorse/duplicity/0.8-series into lp:duplicity.
References

[Merge] lp:~mgorse/duplicity/0.8-series into lp:duplicity
From: Mgorse, 2018-11-29