← Back to team overview

launchpad-reviewers team mailing list archive

Re: [Merge] lp:~stub/launchpad/swift-librarian into lp:launchpad

 

Review: Needs Fixing code

478	+ container = swift.SWIFT_CONTAINER_PREFIX + str(container_num)
479	+ names = sorted(
480	+ swift_connection.get_container(
481	+ container, full_listing=True)[1],
482	+ key=lambda x: int(x['name']))

This loop will terminate cleanly at the first missing block of 1 million files, but we have gaps of 16 million due to cross-volume symlink dances.

728	+ # Walk the Librarian on disk file store, searching for matching
729	+ # files that may need to be copied into Swift. We need to follow
730	+ # symlinks as they are being used span disk partitions.
731	+ for dirpath, dirnames, filenames in os.walk(fs_root, followlinks=True):

Is it worth doing the walk rather than just checking whether each file exists by brute force?

749	+ # Skip files which have been modified recently, as they
750	+ # may be uploads still in progress.
751	+ if os.path.getmtime(fs_path) > time.time() - (60 * 60):

librariangc uses 24 hours. Might be worth being consistent.

776	+ except swiftclient.ClientException as x:
777	+ if x.http_status != 404:
778	+ raise

A 404 isn't a separate exception class?

783	+ swift_connection.head_object(container, obj_name)
784	+ log.debug(
785	+ "{0} already exists in Swift({1}, {2})".format(
786	+ lfc, container, obj_name))

Is it worth comparing the hash, or at least the size?

793	+ swift_connection.put_object(
794	+ container, obj_name,
795	+ open(fs_path, 'rb'), os.path.getsize(fs_path))

I'd really like to check that Swift agrees with the hash that we have in the DB. The chance of some kind of undetected corruption during the conversion of over 14TiB of data sounds non-negligible.

813	+ max_objects_per_container = 1000000
814	+
815	+ container_num = lfc_id // max_objects_per_container
816	+
817	+ return (SWIFT_CONTAINER_PREFIX + str(container_num), str(lfc_id))

All current storage is hex based, but I guess moving to decimal isn't terrible.

887	+class ConnectionPool:
888	+ MAX_POOL_SIZE = 10

This seems like it should be a fairly common pattern. There's no common implementation used by other swiftclient consumers?

915	+ def _new_connection(self):
916	+ return swiftclient.Connection(
917	+ authurl=os.environ.get('OS_AUTH_URL', None),
918	+ user=os.environ.get('OS_USERNAME', None),
919	+ key=os.environ.get('OS_PASSWORD', None),
920	+ tenant_name=os.environ.get('OS_TENANT_NAME', None),
921	+ auth_version='2.0',
922	+ )

Should these settings be read from a config file?

1595	+ swift_enabled = getFeatureFlag('librarian.swift.enabled') or False
1596	+ if swift_enabled:

We'd normally just say "if getFeatureFlag('librarian.swift.enabled'):".

1614	+ @write_transaction
1615	+ def _getFileAlias_swift(self, aliasID, token, path):

This is identical to _getFileAlias, except that it also returns filesize.

1631	+ @defer.inlineCallbacks
1632	+ def _cb_getFileAlias_swift(
1633	+ self,
1634	+ (dbcontentID, dbfilename, mimetype, date_created, size,
1635	+ restricted),
1636	+ filename, request
1637	+ ):

Most of this method is suspiciously identical to _cb_getFileAlias.

1675	+class File_swift(static.File):

That's no class name.

1678	+ def __init__(
1679	+ self, contentType, encoding, modification_time, stream, size):

This is almost identical to File.__init__.

1713	+ def makeProducer(self, request, fileForReading):
1714	+ # Unfortunately, by overriding the static.File's more
1715	+ # complex makeProducer method we lose HTTP range support.

This seems reasonable. I assume FileProducer will just ignore the Range bit of the request and return the whole thing.

1730	+class FileProducer(static.NoRangeStaticProducer):
1731	+ @defer.inlineCallbacks
1732	+ def resumeProducing(self):

resumeProducing's callers don't expect a deferred AFAICT. Are you sure this actually does anything?

1748	=== added file 'lib/lp/services/twistedsupport/features.py'
1749	--- lib/lp/services/twistedsupport/features.py 1970-01-01 00:00:00 +0000
1750	+++ lib/lp/services/twistedsupport/features.py 2013-09-23 13:55:36 +0000

Did you consider implementing this as a LoopingCall?
-- 
https://code.launchpad.net/~stub/launchpad/swift-librarian/+merge/179142
Your team Launchpad code reviewers is subscribed to branch lp:launchpad.


Follow ups

References