duplicity-team team mailing list archive

[Merge] lp:~harningt/duplicity/multibackend-mirror into lp:duplicity

 

Thomas Harning has proposed merging lp:~harningt/duplicity/multibackend-mirror into lp:duplicity.

Requested reviews:
  duplicity-team (duplicity-team)
Related bugs:
  Bug #1474994 in Duplicity: "Multi backend should offer mirror option"
  https://bugs.launchpad.net/duplicity/+bug/1474994

For more details, see:
https://code.launchpad.net/~harningt/duplicity/multibackend-mirror/+merge/286090

This changeset extends multibackend with a "mirror" mode alongside its existing "stripe" mode, so that it can serve as a redundancy tool as well as a space-expansion tool. To do this without changing the configuration format much, I used the URL query string, which would otherwise go unused for file URLs, to specify behavior that applies to all entries in the configuration file.
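
As an illustration of the query-string convention described above, here is a minimal standalone sketch. It is not the code in the patch: it uses Python 3's urllib.parse (the patch itself uses the Python 2 urlparse module), and the names parse_multi_url, ALLOWED and DEFAULTS are hypothetical, chosen only for this example.

```python
from urllib.parse import urlparse, parse_qs

# Allowed values per parameter, as documented in the man page changes.
ALLOWED = {
    "mode": frozenset(["stripe", "mirror"]),
    "onfail": frozenset(["continue", "abort"]),
}
# Defaults applied when a parameter is absent.
DEFAULTS = {"mode": "stripe", "onfail": "continue"}


def parse_multi_url(url):
    """Return (config_path, settings) for a multi:// URL (hypothetical helper)."""
    parts = urlparse(url)
    settings = dict(DEFAULTS)
    if parts.query:
        # strict_parsing=True makes malformed fields raise ValueError.
        for name, values in parse_qs(parts.query, strict_parsing=True).items():
            if name not in ALLOWED:
                raise ValueError("unknown parameter %s" % name)
            if len(values) != 1:
                raise ValueError("more than one value for %s" % name)
            if values[0] not in ALLOWED[name]:
                raise ValueError("illegal value for %s: %s" % (name, values[0]))
            settings[name] = values[0]
    return parts.path, settings


if __name__ == "__main__":
    print(parse_multi_url("multi:///path/to/config.json?mode=mirror&onfail=abort"))
```

A URL without a query string falls back to the defaults (stripe / continue), which is how the existing behavior is preserved in this sketch.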

Testing should verify that multibackend behaves exactly as it did before the change when no parameters are given, and should cover each of the (stripe / mirror) x (continue / abort) combinations.
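
The 2 x 2 matrix could be exercised along the following lines. This is only a sketch under stated assumptions: FakeStore and put_file are hypothetical stand-ins that model the documented write semantics in simplified form, not duplicity's backend classes or the patched _put method.

```python
import itertools


class FakeStore(object):
    """Hypothetical in-memory backend that can be told to fail every write."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.files = []

    def put(self, filename):
        if self.fail:
            raise IOError("%s is unavailable" % self.name)
        self.files.append(filename)


def put_file(stores, filename, mode, onfail, cursor=0):
    """Simplified model of a write; returns the next stripe cursor."""
    count = len(stores)
    # Mirror mode always starts at store 0; stripe starts at the cursor.
    start = 0 if mode == "mirror" else cursor
    passed = False
    for offset in range(count):
        index = (start + offset) % count
        try:
            stores[index].put(filename)
            passed = True
            if mode == "stripe":
                return (index + 1) % count  # one copy is enough
        except IOError:
            if onfail == "abort":
                raise  # any write failure is terminal
    if not passed:
        raise IOError("tried all backing stores and none succeeded")
    return 0


def run_matrix():
    """Write one file with the first store failing, for every combination."""
    results = {}
    for mode, onfail in itertools.product(("stripe", "mirror"),
                                          ("continue", "abort")):
        stores = [FakeStore("A", fail=True), FakeStore("B")]
        try:
            put_file(stores, "vol1", mode, onfail)
            results[(mode, onfail)] = [s.name for s in stores
                                       if "vol1" in s.files]
        except IOError:
            results[(mode, onfail)] = "aborted"
    return results


if __name__ == "__main__":
    for combo, outcome in sorted(run_matrix().items()):
        print(combo, outcome)
```

With the first store failing, both onfail=continue cases end up with the file on store B only, and both onfail=abort cases stop at the first failure. A real test would run the same matrix against MultiBackend itself.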

I have tested the mode=mirror and onfail=abort settings locally on 2 machines since around July 2015, where they replaced some shell scripts I used to run after each backup to synchronize a local backup with many backends.
-- 
Your team duplicity-team is requested to review the proposed merge of lp:~harningt/duplicity/multibackend-mirror into lp:duplicity.
=== modified file 'bin/duplicity.1'
--- bin/duplicity.1	2016-02-05 09:58:57 +0000
+++ bin/duplicity.1	2016-02-15 17:52:54 +0000
@@ -1644,13 +1644,61 @@
 more than one backend store (e.g., you can store across a google drive
 account and a onedrive account to get effectively the combined storage
 available in both).  The URL path specifies a JSON formated config
-file containing a list of the backends it will use. Multibackend then
-round-robins across the given backends.  Each element of the list must
-have a "url" element, and may also contain an optional "description"
-and an optional "env" list of environment variables used to configure
-that backend.
-.PP
-For example:
+file containing a list of the backends it will use. The URL may also
+specify "query" parameters to configure overall behavior.
+Each element of the list must have a "url" element, and may also contain
+an optional "description" and an optional "env" list of environment
+variables used to configure that backend.
+
+.SS Query Parameters
+
+Query parameters come after the file URL in standard HTTP format
+for example:
+
+.nf
+.RS
+multi:///path/to/config.json?mode=mirror&onfail=abort
+multi:///path/to/config.json?mode=stripe&onfail=continue
+multi:///path/to/config.json?onfail=abort&mode=stripe
+multi:///path/to/config.json?onfail=abort
+.RE
+
+Order does not matter; however, unrecognized parameters are considered
+an error.
+
+.TP
+.BI "mode=" stripe
+
+This mode (the default) performs round-robin access to the list of
+backends. In this mode, all backends must be reliable as a loss of one
+means a loss of one of the archive files.
+
+.TP
+.BI "mode=" mirror
+
+This mode accesses backends as a RAID1-store, storing every file in
+every backend and reading files from the first-successful backend.
+A loss of any backend should result in no failure. Note that backends
+added later will only get new files and may require a manual sync
+with one of the other operating ones.
+
+.TP
+.BI "onfail=" continue
+
+This setting (the default) performs all write operations on a
+best-effort basis. Any failure results in the next backend being tried.
+Failure is reported only when all backends fail a given operation, with
+the error result from the last failure.
+
+.TP
+.BI "onfail=" abort
+
+This setting considers any backend write failure as a terminating
+condition and reports the error.
+Data reading and listing operations are independent of this and
+will try with the next backend on failure.
+
+.SS JSON File Example
 .nf
 .RS
 [

=== modified file 'duplicity/backends/multibackend.py'
--- duplicity/backends/multibackend.py	2015-12-04 11:34:25 +0000
+++ duplicity/backends/multibackend.py	2016-02-15 17:52:54 +0000
@@ -1,6 +1,9 @@
 # -*- Mode:Python; indent-tabs-mode:nil; tab-width:4 -*-
 #
 # Copyright 2015 Steve Tynor <steve.tynor@xxxxxxxxx>
+# Copyright 2016 Thomas Harning Jr <harningt@xxxxxxxxx>
+#                  - mirror/stripe modes
+#                  - write error modes
 #
 # This file is part of duplicity.
 #
@@ -24,6 +27,7 @@
 import os.path
 import string
 import urllib
+import urlparse
 import json
 
 import duplicity.backend
@@ -37,13 +41,68 @@
     # the stores we are managing
     __stores = []
 
-    # when we write, we "stripe" via a simple round-robin across
+    # Set of known query parameters
+    __knownQueryParameters = frozenset([
+        'mode',
+        'onfail',
+        ])
+
+    # the mode of operation to follow
+    # can be one of 'stripe' or 'mirror' currently
+    __mode = 'stripe'
+    __mode_allowedSet = frozenset([
+        'mirror',
+        'stripe',
+    ])
+
+    # the write error handling logic
+    # can be one of the following:
+    # * continue - default, on failure continues to next source
+    # * abort - stop all further operations
+    __onfail_mode = 'continue'
+    __onfail_mode_allowedSet = frozenset([
+        'abort',
+        'continue',
+    ])
+
+    # when we write in stripe mode, we "stripe" via a simple round-robin across
     # remote stores.  It's hard to get too much more sophisticated
     # since we can't rely on the backend to give us any useful meta
     # data (e.g. sizes of files, capacity of the store (quotas)) to do
     # a better job of balancing load across stores.
     __write_cursor = 0
 
+    @staticmethod
+    def get_query_params(parsed_url):
+        # Reparse so the query string is available
+        reparsed_url = urlparse.urlparse(parsed_url.geturl())
+        if len(reparsed_url.query) == 0:
+            return dict()
+        try:
+            queryMultiDict = urlparse.parse_qs(reparsed_url.query, strict_parsing = True)
+        except ValueError as e:
+            log.Log(_("MultiBackend: Could not parse query string %s: %s ")
+                    % (reparsed_url.query, e),
+                    log.ERROR)
+            raise BackendException('Could not parse query string')
+        queryDict = dict()
+        # Convert the multi-dict to a single dictionary
+        # while checking to make sure that no unrecognized values are found
+        for name, valueList in queryMultiDict.items():
+            if len(valueList) != 1:
+                log.Log(_("MultiBackend: Invalid query string %s: more than one value for %s")
+                        % (reparsed_url.query, name),
+                        log.ERROR)
+                raise BackendException('Invalid query string')
+            if name not in MultiBackend.__knownQueryParameters:
+                log.Log(_("MultiBackend: Invalid query string %s: unknown parameter %s")
+                        % (reparsed_url.query, name),
+                        log.ERROR)
+                raise BackendException('Invalid query string')
+
+            queryDict[name] = valueList[0]
+        return queryDict
+
     def __init__(self, parsed_url):
         duplicity.backend.Backend.__init__(self, parsed_url)
 
@@ -77,10 +136,32 @@
         #  }
         # ]
 
+        queryParams = MultiBackend.get_query_params(parsed_url)
+
+        if 'mode' in queryParams:
+            self.__mode = queryParams['mode']
+
+        if 'onfail' in queryParams:
+            self.__onfail_mode = queryParams['onfail']
+
+        if not self.__mode in MultiBackend.__mode_allowedSet:
+            log.Log(_("MultiBackend: illegal value for %s: %s")
+                    % ('mode', self.__mode), log.ERROR)
+            raise BackendException("MultiBackend: invalid mode value")
+
+        if not self.__onfail_mode in MultiBackend.__onfail_mode_allowedSet:
+            log.Log(_("MultiBackend: illegal value for %s: %s")
+                    % ('onfail', self.__onfail_mode), log.ERROR)
+            raise BackendException("MultiBackend: invalid onfail value")
+
         try:
             with open(parsed_url.path) as f:
                 configs = json.load(f)
         except IOError as e:
+            log.Log(_("MultiBackend: Url %s")
+                    % (parsed_url.geturl()),
+                    log.ERROR)
+
             log.Log(_("MultiBackend: Could not load config file %s: %s ")
                     % (parsed_url.path, e),
                     log.ERROR)
@@ -88,6 +169,8 @@
 
         for config in configs:
             url = config['url']
+            # Fix advised in bug #1471795
+            url = url.encode('utf-8')
             log.Log(_("MultiBackend: use store %s")
                     % (url),
                     log.INFO)
@@ -106,6 +189,12 @@
             #         log.INFO)
 
     def _put(self, source_path, remote_filename):
+        # Store an indication of whether any of these passed
+        passed = False
+        # Mirror mode always starts at zero
+        if self.__mode == 'mirror':
+            self.__write_cursor = 0
+
         first = self.__write_cursor
         while True:
             store = self.__stores[self.__write_cursor]
@@ -117,15 +206,29 @@
                         % (self.__write_cursor, store.backend.parsed_url.url_string),
                         log.DEBUG)
                 store.put(source_path, remote_filename)
+                passed = True
                 self.__write_cursor = next
-                break
+                # No matter what, if we loop around, break this loop
+                if next == 0:
+                    break
+                # If in stripe mode, don't continue to the next
+                if self.__mode == 'stripe':
+                    break
             except Exception as e:
                 log.Log(_("MultiBackend: failed to write to store #%s (%s), try #%s, Exception: %s")
                         % (self.__write_cursor, store.backend.parsed_url.url_string, next, e),
                         log.INFO)
                 self.__write_cursor = next
 
-                if (self.__write_cursor == first):
+                # If we consider write failure as abort, abort
+                if self.__onfail_mode == 'abort':
+                    log.Log(_("MultiBackend: failed to write %s. Aborting process.")
+                            % (source_path),
+                            log.ERROR)
+                    raise BackendException("failed to write")
+
+                # If we've looped around, and none of them passed, fail
+                if (self.__write_cursor == first) and not passed:
                     log.Log(_("MultiBackend: failed to write %s. Tried all backing stores and none succeeded")
                             % (source_path),
                             log.ERROR)
@@ -159,14 +262,16 @@
                     % (s.backend.parsed_url.url_string, l),
                     log.DEBUG)
             lists.append(s.list())
-        # combine the lists into a single flat list:
-        result = [item for sublist in lists for item in sublist]
+        # combine the lists into a single flat list w/o duplicates via set:
+        result = list({ item for sublist in lists for item in sublist })
         log.Log(_("MultiBackend: combined list: %s")
                 % (result),
                 log.DEBUG)
         return result
 
     def _delete(self, filename):
+        # Store an indication on whether any passed
+        passed = False
         # since the backend operations will be retried, we can't
         # simply try to get from the store, if not found, move to the
         # next store (since each failure will be retried n times
@@ -177,13 +282,17 @@
             list = s.list()
             if filename in list:
                 s._do_delete(filename)
-                return
+                passed = True
+                # In stripe mode, only one item will have the file
+                if self.__mode == 'stripe':
+                    return
             log.Log(_("MultiBackend: failed to delete %s from %s")
                     % (filename, s.backend.parsed_url.url_string),
                     log.INFO)
-        log.Log(_("MultiBackend: failed to delete %s. Tried all backing stores and none succeeded")
-                % (filename),
-                log.ERROR)
-#        raise BackendException("failed to delete")
+        if not passed:
+            log.Log(_("MultiBackend: failed to delete %s. Tried all backing stores and none succeeded")
+                    % (filename),
+                    log.ERROR)
+#           raise BackendException("failed to delete")
 
 duplicity.backend.register_backend('multi', MultiBackend)