← Back to team overview

launchpad-reviewers team mailing list archive

Re: [Merge] lp:~stub/launchpad/swift-librarian into lp:launchpad

 

Review: Needs Fixing code

> > 783     + swift_connection.head_object(container, obj_name)
> > 784     + log.debug(
> > 785     + "{0} already exists in Swift({1}, {2})".format(
> > 786     + lfc, container, obj_name))
> >
> > Is it worth comparing the hash, or at least the size?
> 
> I can compare the size for non-large objects (this branch doesn't
> support large objects, but the next update will). The mock swift
> doesn't give me a hash in the headers, and I would need to calculate
> the hash on the local file myself since this method doesn't have a
> database connection around. I've updated to abort if there is a size
> mismatch for <5GB sized files. Even this is probably unnecessary,
> given the next bit of this review.
> 
> 
> > 793     + swift_connection.put_object(
> > 794     + container, obj_name,
> > 795     + open(fs_path, 'rb'), os.path.getsize(fs_path))
> >
> > I'd really like to check that Swift agrees with the hash that we have in the
> DB. The chance of some kind of undetected corruption during the conversion of
> over 14TiB of data sounds non-negligible.
> 
> I've looked into this, and found that the way we should do this is pto
> compare the etag returned from Swift with the md5 hash. <aside>This is
> all stuff the Python library should be doing :-/</aside>
> 
> This will become more complicated with large files (the returned hash
> is the hash of all the segment hashes concatenated), but that is the
> next branch. I'm using the hash of the file on disk, rather than
> whatever is stored in the db, as at the moment this code does not have
> a database connection.

I wouldn't use an assert statement here; they're skipped with python -O, and at least some bits of production run without them. Crashing here is the right thing to do, but I'd raise an AssertionError explicitly.

However, I've just realised that crashing has amusing consequences. The check in the try block only verifies the size, which will probably match. So a rerun after a crash will think the file is uploaded even though the previous run determined it was corrupt. Hopefully we'll never run into the crash, but if we do we'll need to recover it manually.

> > 1730    +class FileProducer(static.NoRangeStaticProducer):
> > 1731    + @defer.inlineCallbacks
> > 1732    + def resumeProducing(self):
> >
> > resumeProducing's callers don't expect a deferred AFAICT. Are you sure this
> actually does anything?
> 
> It doesn't return a deferred - the call site of an inlineCallbacks
> decorated method ends up with what is sent from returnValue (in this
> case, None because there isn't one). The decorator means we can use
> yield to avoid blocking calls. And test_librarian_serves_from_swift
> fails if I remove it :)

inlineCallbacks isn't *that* magical. A method (well, generator) wrapped in it returns a Deferred, and the Deferred has a return value of what you send to returnValue. I don't see a resumeProducing callsite that expects to handle a Deferred; they all just discard its return value.
-- 
https://code.launchpad.net/~stub/launchpad/swift-librarian/+merge/179142
Your team Launchpad code reviewers is subscribed to branch lp:launchpad.


References