← Back to team overview

duplicity-team team mailing list archive

[Merge] lp:~aaron-whitehouse/duplicity/Bug_1624725_files_within_folder_slash into lp:duplicity/0.7-series

 

Aaron Whitehouse has proposed merging lp:~aaron-whitehouse/duplicity/Bug_1624725_files_within_folder_slash into lp:duplicity/0.7-series.

Commit message:
Fixed Bug #1624725, so that an include glob ending in "/" now includes folder contents (for globs with and without special characters). This preserves the behaviour that an expression ending in "/" only matches a folder, but now the contents of any matching folder is included.

Requested reviews:
  duplicity-team (duplicity-team)
Related bugs:
  Bug #1624725 in Duplicity: "Including a folder ending in "/" does not include folder contents"
  https://bugs.launchpad.net/duplicity/+bug/1624725

For more details, see:
https://code.launchpad.net/~aaron-whitehouse/duplicity/Bug_1624725_files_within_folder_slash/+merge/311357

Fixed Bug #1624725, so that an include glob ending in "/" now includes folder contents (for globs with and without special characters). This preserves the behaviour that an expression ending in "/" only matches a folder, but now the contents of any matching folder is included.

Note that this is to be merged into 0.7-series, as this fixes a regression in recent 0.7 releases. The change (likely along with all other changes that have been made in 0.7 recently) could then be merged into 0.8. Once merged into the 0.8 series, I plan to converge the functions dealing with expressions the do and don't have globbing characters so that we only have one code base to check/maintain.


-- 
Your team duplicity-team is requested to review the proposed merge of lp:~aaron-whitehouse/duplicity/Bug_1624725_files_within_folder_slash into lp:duplicity/0.7-series.
=== modified file 'duplicity/globmatch.py'
--- duplicity/globmatch.py	2016-08-12 20:50:29 +0000
+++ duplicity/globmatch.py	2016-11-20 22:26:00 +0000
@@ -60,11 +60,10 @@
     2 - if the folder should be scanned for any included/excluded files
     None - if the selection function has nothing to say about the file
     """
-    match_only_dirs = False
+    glob_ends_w_slash = False
 
-    # ToDo: Test behaviour of "/" on its own - should always match
     if glob_str != "/" and glob_str[-1] == "/":
-        match_only_dirs = True
+        glob_ends_w_slash = True
         # Remove trailing / from directory name (unless that is the entire
         # string)
         glob_str = glob_str[:-1]
@@ -76,21 +75,51 @@
     re_comp = lambda r: re.compile(r, re.S | flags)
 
     # matches what glob matches and any files in directory
+    # Resulting regular expression is:
+    # ^ string must be at the beginning of path
+    # string translated into regex
+    # ($|/) nothing must follow except for the end of the string, newline or /
+    # Note that the "/" at the end of the regex means that it will match
+    # if the glob matches a parent folders of path
     glob_comp_re = re_comp("^%s($|/)" % glob_to_regex(glob_str))
 
+    if glob_ends_w_slash:
+        # Creates a version of glob_comp_re that does not match folder contents
+        # This can be used later to check that an exact match is actually a
+        # folder, rather than a file.
+        glob_comp_re_exact = re_comp("^%s($)" % glob_to_regex(glob_str))
+
     if glob_str.find("**") != -1:
+        # glob_str has a ** in it
         glob_str = glob_str[:glob_str.find("**") + 2]  # truncate after **
 
+    # Below regex is translates to:
+    # ^ string must be at the beginning of path
+    # the regexs corresponding to the parent directories of glob_str
+    # $ nothing must follow except for the end of the string or newline
     scan_comp_re = re_comp("^(%s)$" %
                            "|".join(_glob_get_prefix_regexs(glob_str)))
 
     def test_fn(path):
+        if glob_comp_re.match(path.name):
+            # Path matches glob, or is contained within a matching folder
+            if not glob_ends_w_slash:
+                return include
+            else:
+                # Glob ended with a /, so we need to check any exact match was
+                # a folder
+                if glob_comp_re_exact.match(path.name):
+                    # Not an included file/folder, so must be a folder to match
+                    if path.isdir():
+                        # Is a directory, so all is well
+                        return include
+                    else:
+                        # Exact match and not a folder
+                        return None
+                else:
+                    # An included file/folder, so normal approach is fine
+                    return include
 
-        if match_only_dirs and not path.isdir():
-            # If the glob ended with a /, only match directories
-            return None
-        elif glob_comp_re.match(path.name):
-            return include
         elif include == 1 and scan_comp_re.match(path.name):
             return 2
         else:

=== modified file 'duplicity/selection.py'
--- duplicity/selection.py	2016-09-04 22:25:37 +0000
+++ duplicity/selection.py	2016-11-20 22:26:00 +0000
@@ -487,6 +487,9 @@
 
         """
         # Internal. Used by glob_get_sf and unit tests.
+        # ToDo: Make all globbing/non-globbing use same code path
+        # This distinction has bitten us too many times with bugs in one or
+        # the other.
         match_only_dirs = False
 
         if filename != "/" and filename[-1] == "/":
@@ -506,8 +509,10 @@
         # Internal. Used by glob_get_filename_sf.
 
         def include_sel_func(path):
-            if match_only_dirs and not path.isdir():
-                # If the glob ended with a /, only match directories
+            if len(tuple) == len(path.index) and match_only_dirs and not path.isdir():
+                # If we are assessing the actual directory (rather than the
+                # contents of the directory) and the glob ended with a /,
+                # only match directories
                 return None
             elif (path.index == tuple[:len(path.index)] or
                     path.index[:len(tuple)] == tuple):

=== modified file 'testing/functional/test_selection.py'
--- testing/functional/test_selection.py	2016-09-04 22:25:37 +0000
+++ testing/functional/test_selection.py	2016-11-20 22:26:00 +0000
@@ -19,7 +19,6 @@
 # Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 
 import os
-import os.path
 import unittest
 
 from . import FunctionalTestCase
@@ -1006,5 +1005,81 @@
                                     ['1sub1sub1_file.txt']])
 
 
+class TestFolderIncludesFiles(IncludeExcludeFunctionalTest):
+    """ This tests that including a folder includes the files within it"""
+    # https://bugs.launchpad.net/duplicity/+bug/1624725
+
+    def test_includes_files(self):
+        """This tests that including a folder includes the files within it"""
+        self.backup("full", "testfiles/select2/1/1sub1",
+                    options=["--include", "testfiles/select2/1/1sub1/1sub1sub1",
+                             "--exclude", "**"])
+        self.restore()
+        restore_dir = 'testfiles/restore_out'
+        restored = self.directory_tree_to_list_of_lists(restore_dir)
+        self.assertEqual(restored, [['1sub1sub1'],
+                                    ['1sub1sub1_file.txt']])
+
+    def test_includes_files_trailing_slash(self):
+        """This tests that including a folder includes the files within it"""
+        self.backup("full", "testfiles/select2/1/1sub1",
+                    options=["--include", "testfiles/select2/1/1sub1/1sub1sub1/",
+                             "--exclude", "**"])
+        self.restore()
+        restore_dir = 'testfiles/restore_out'
+        restored = self.directory_tree_to_list_of_lists(restore_dir)
+        self.assertEqual(restored, [['1sub1sub1'],
+                                    ['1sub1sub1_file.txt']])
+
+    def test_includes_files_trailing_slash_globbing_chars(self):
+        """Tests folder includes with globbing char and /"""
+        self.backup("full", "testfiles/select2/1/1sub1",
+                    options=["--include", "testfiles/s?lect2/1/1sub1/1sub1sub1/",
+                             "--exclude", "**"])
+        self.restore()
+        restore_dir = 'testfiles/restore_out'
+        restored = self.directory_tree_to_list_of_lists(restore_dir)
+        self.assertEqual(restored, [['1sub1sub1'],
+                                    ['1sub1sub1_file.txt']])
+
+    def test_excludes_files_no_trailing_slash(self):
+        """This tests that excluding a folder excludes the files within it"""
+        self.backup("full", "testfiles/select2/1/1sub1",
+                    options=["--exclude", "testfiles/select2/1/1sub1/1sub1sub1",
+                             "--exclude", "testfiles/select2/1/1sub1/1sub1sub2",
+                             "--exclude", "testfiles/select2/1/1sub1/1sub1sub3",
+                             "--include", "testfiles/select2/1/1sub1/1sub1**",
+                             "--exclude", "testfiles/select2/1/1sub1/irrelevant.txt"])
+        self.restore()
+        restore_dir = 'testfiles/restore_out'
+        restored = self.directory_tree_to_list_of_lists(restore_dir)
+        self.assertEqual(restored, [])
+
+    def test_excludes_files_trailing_slash(self):
+        """Excluding a folder excludes the files within it, if ends with /"""
+        self.backup("full", "testfiles/select2/1/1sub1",
+                    options=["--exclude", "testfiles/select2/1/1sub1/1sub1sub1/",
+                             "--exclude", "testfiles/select2/1/1sub1/1sub1sub2/",
+                             "--exclude", "testfiles/select2/1/1sub1/1sub1sub3/",
+                             "--include", "testfiles/select2/1/1sub1/1sub1**",
+                             "--exclude", "testfiles/select2/1/1sub1/irrelevant.txt"])
+        self.restore()
+        restore_dir = 'testfiles/restore_out'
+        restored = self.directory_tree_to_list_of_lists(restore_dir)
+        self.assertEqual(restored, [])
+
+    def test_excludes_files_trailing_slash_globbing_chars(self):
+        """Tests folder excludes with globbing char and /"""
+        self.backup("full", "testfiles/select2/1/1sub1",
+                    options=["--exclude", "testfiles/sel?ct2/1/1sub1/1sub1sub1/",
+                             "--exclude", "testfiles/sel[e,f]ct2/1/1sub1/1sub1sub2/",
+                             "--exclude", "testfiles/sel*t2/1/1sub1/1sub1sub3/",
+                             "--include", "testfiles/select2/1/1sub1/1sub1**",
+                             "--exclude", "testfiles/select2/1/1sub1/irrelevant.txt"])
+        self.restore()
+        restore_dir = 'testfiles/restore_out'
+        restored = self.directory_tree_to_list_of_lists(restore_dir)
+        self.assertEqual(restored, [])
+
 if __name__ == "__main__":
     unittest.main()

=== modified file 'testing/unit/test_globmatch.py'
--- testing/unit/test_globmatch.py	2016-03-05 10:47:02 +0000
+++ testing/unit/test_globmatch.py	2016-11-20 22:26:00 +0000
@@ -20,9 +20,12 @@
 # along with duplicity; if not, write to the Free Software Foundation,
 # Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 
-import unittest
+import sys
 from duplicity.globmatch import *
+from duplicity.path import *
 from . import UnitTestCase
+from mock import patch
+import unittest
 
 
 class MatchingTest(UnitTestCase):
@@ -38,4 +41,214 @@
         assert r == "\\/usr\\/[^/]*\\/bin\\/", r
         assert glob_to_regex("[a.b/c]") == "[a.b/c]"
         r = glob_to_regex("[a*b-c]e[!]]")
-        assert r == "[a*b-c]e[^]]", r
\ No newline at end of file
+        assert r == "[a*b-c]e[^]]", r
+
+
+class TestDoubleAsteriskOnIncludesExcludes(UnitTestCase):
+    """Test ** on includes and exclude patterns"""
+
+    def test_double_asterisk_include(self):
+        """Test a few globbing patterns, including **"""
+        self.assertEqual(
+            path_matches_glob_fn("**", 1)(Path("foo.txt")), 1)
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("**", 1)(Path("folder")), 1)
+
+    def test_double_asterisk_extension_include(self):
+        """Test **.py"""
+        self.assertEqual(
+            path_matches_glob_fn("**.py", 1)(Path("what/ever.py")), 1)
+        self.assertEqual(
+            path_matches_glob_fn("**.py", 1)(Path("what/ever.py/foo")), 1)
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("**.py", 1)(Path("foo")), 2)
+            self.assertEqual(
+                path_matches_glob_fn("**.py", 1)(Path("usr/local/bin")), 2)
+            self.assertEqual(
+                path_matches_glob_fn("**.py", 1)(Path("/usr/local/bin")), 2)
+
+
+class TestTrailingSlash(UnitTestCase):
+    """Test glob matching where the glob has a trailing slash"""
+
+    def test_trailing_slash_matches_only_dirs(self):
+        """Test matching where glob includes a trailing slash"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("fold*/", 1)(Path("folder")), 1)
+
+            # Test the file named "folder" is not included if it is not a dir
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("fold*/", 1)(Path("folder")), None)
+
+            # Test the file named "folder" is not included if it is not a dir
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("folder/", 1)(Path("folder")), None)
+
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("fo*/", 1)(Path("foo.txt")), None)
+
+    def test_included_files_are_matched_no_slash(self):
+        """Test that files within an included folder are matched"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("fold*", 1)(Path("folder/file.txt")), 1)
+
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("fold*", 1)(Path("folder/2/file.txt")), 1)
+
+    def test_included_files_are_matched_no_slash_2(self):
+        """Test that files within an included folder are matched"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("folder", 1)(Path("folder/file.txt")), 1)
+
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("folder/2", 1)(Path("folder/2/file.txt")), 1)
+
+    def test_included_files_are_matched_slash(self):
+        """Test that files within an included folder are matched with /"""
+        # Bug #1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("folder/", 1)(Path("folder/file.txt")), 1)
+
+    def test_included_files_are_matched_slash_2(self):
+        """Test that files within an included folder are matched with /"""
+        # Bug #1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("testfiles/select2/1/1sub1/1sub1sub1/", 1)
+                (Path("testfiles/select2/1/1sub1/1sub1sub1/1sub1sub1_file.txt")
+                 ), 1)
+
+    def test_included_files_are_matched_slash_2_parents(self):
+        """Test that duplicity will scan parent of glob/"""
+        # Bug #1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("testfiles/select2/1/1sub1/1sub1sub1/", 1)
+                (Path("testfiles/select2/1/1sub1/1sub1sub1")
+                 ), 1)
+            self.assertEqual(
+                path_matches_glob_fn("testfiles/select2/1/1sub1/1sub1sub1/", 1)
+                (Path("testfiles/select2/1/1sub1")
+                 ), 2)
+
+    def test_included_files_are_matched_slash_wildcard(self):
+        """Test that files within an included folder are matched with /"""
+        # Bug #1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = False
+            self.assertEqual(
+                path_matches_glob_fn("fold*/", 1)(Path("folder/file.txt")), 1)
+    #
+    # def test_slash_matches_everything(self):
+    #     """Test / matches everything"""
+    #     # ToDo: Not relevant at this stage, as "/" would not go through
+    #     # globmatch because it has no special characters, but it should be
+    #     # made to work
+    #     with patch('duplicity.path.Path.isdir') as mock_isdir:
+    #         mock_isdir.return_value = True
+    #         self.assertEqual(
+    #             path_matches_glob_fn("/",
+    #                                  1)(Path("/tmp/testfiles/select/1/2")), 1)
+    #         self.assertEqual(
+    #             path_matches_glob_fn("/",
+    #                                  1)(Path("/test/random/path")), 1)
+    #         self.assertEqual(
+    #             path_matches_glob_fn("/",
+    #                                  1)(Path("/")), 1)
+    #         self.assertEqual(
+    #             path_matches_glob_fn("/",
+    #                                  1)(Path("/var/log")), 1)
+    #         self.assertEqual(
+    #             path_matches_glob_fn("/",
+    #                                  1)(Path("/var/log/log.txt")), 1)
+
+    def test_slash_star_scans_folder(self):
+        """Test that folder/* scans folder/"""
+        # This behaviour is a bit ambiguous - either include or scan could be
+        # argued as most appropriate here, but only an empty folder is at stake
+        # so long as test_slash_star_includes_folder_contents passes.
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("folder/*", 1)(Path("folder")), 2)
+
+    def test_slash_star_includes_folder_contents(self):
+        """Test that folder/* includes folder contents"""
+        self.assertEqual(path_matches_glob_fn("folder/*", 1)
+                         (Path("folder/file.txt")), 1)
+        self.assertEqual(path_matches_glob_fn("folder/*", 1)
+                         (Path("folder/other_file.log")), 1)
+
+    def test_slash_star_star_includes_folder(self):
+        """Test that folder/** includes folder/"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+
+    def test_simple_trailing_slash_match(self):
+        """Test that a normal folder string ending in / matches that path"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            test_path = "testfiles/select/1/2/1"
+            self.assertEqual(
+                path_matches_glob_fn(test_path, 1)(Path(test_path)), 1)
+
+    def test_double_asterisk_string_slash(self):
+        """Test string starting with ** and ending in /"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("**/1/2/",
+                                     1)(Path("testfiles/select/1/2")), 1)
+
+    def test_string_double_asterisk_string_slash(self):
+        """Test string ** string /"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("testfiles**/2/",
+                                     1)(Path("testfiles/select/1/2")), 1)
+
+
+class TestDoubleAsterisk(UnitTestCase):
+    """Test glob matching where the glob finishes with a **"""
+
+    def test_simple_trailing_slash_match(self):
+        """Test that a folder string ending in /** matches that path"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("/test/folder/**", 1)(
+                    Path("/test/foo")), None)
+
+    def test_simple_trailing_slash_match_2(self):
+        """Test folder string ending in */**"""
+        with patch('duplicity.path.Path.isdir') as mock_isdir:
+            mock_isdir.return_value = True
+            self.assertEqual(
+                path_matches_glob_fn("fold*/**", 1)(
+                    Path("folder")), 2)
\ No newline at end of file

=== modified file 'testing/unit/test_selection.py'
--- testing/unit/test_selection.py	2016-03-05 10:14:57 +0000
+++ testing/unit/test_selection.py	2016-11-20 22:26:00 +0000
@@ -83,8 +83,6 @@
 
         with patch('duplicity.path.ROPath.isdir') as mock_isdir:
             mock_isdir.return_value = True
-            # Can't pass the return_value as an argument to patch, as build
-            # system's mock is too old to support it.
 
             assert sf2(self.makeext("usr")) is None
             assert sf2(self.makeext("usr/local")) is None
@@ -666,6 +664,36 @@
                         ("--exclude", "**t/1/3")],
                        [(), ('2',), ('2', '1')])
 
+    def test_includes_files(self):
+        """Unit test the functional test test_includes_files"""
+        # Test for Bug 1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        self.root = Path("testfiles/select2/1/1sub1")
+        self.ParseTest([("--include", "testfiles/select2/1/1sub1/1sub1sub1"),
+                        ("--exclude", "**")],
+                       [(), ('1sub1sub1',), ('1sub1sub1',
+                                             '1sub1sub1_file.txt')])
+
+    def test_includes_files_trailing_slash(self):
+        """Unit test the functional test test_includes_files_trailing_slash"""
+        # Test for Bug 1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        self.root = Path("testfiles/select2/1/1sub1")
+        self.ParseTest([("--include", "testfiles/select2/1/1sub1/1sub1sub1/"),
+                        ("--exclude", "**")],
+                       [(), ('1sub1sub1',), ('1sub1sub1',
+                                             '1sub1sub1_file.txt')])
+
+    def test_includes_files_trailing_slash_globbing_chars(self):
+        """Unit test functional test_includes_files_trailing_slash_globbing_chars"""
+        # Test for Bug 1624725
+        # https://bugs.launchpad.net/duplicity/+bug/1624725
+        self.root = Path("testfiles/select2/1/1sub1")
+        self.ParseTest([("--include", "testfiles/s?lect2/1/1sub1/1sub1sub1/"),
+                        ("--exclude", "**")],
+                       [(), ('1sub1sub1',), ('1sub1sub1',
+                                             '1sub1sub1_file.txt')])
+
     def test_glob(self):
         """Test globbing expression"""
         self.ParseTest([("--exclude", "**[3-5]"),


Follow ups