← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] ~cjwatson/launchpad:py3-sqlbase-hash into launchpad:master

 

Colin Watson has proposed merging ~cjwatson/launchpad:py3-sqlbase-hash into launchpad:master.

Commit message:
Define SQLBase.__hash__

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/394941

On Python 2, SQLBase objects were hashable without this, but they broke the contract for __hash__, because objects from different stores with the same class and id would compare equal but have different hash values.

Python 3 made it more difficult to make this kind of mistake, by requiring that objects that define __eq__ must also define __hash__ in order to be hashable, so now we do so.  However, the case of newly-created objects where we haven't yet got a sequence value back from the database requires some care.  We can no longer work around this using object identity as was done by SQLBase.__eq__, since that would lead to the object's hash value changing once its id is known.  Instead, we flush the store if we don't yet know the object's id; this is perhaps slightly surprising in a special method, but it seems the least bad approach.

To minimise confusion, also rework SQLBase.__eq__ to use the same flush-the-store-if-necessary approach rather than checking object identity.

This should fix a large number of tests on Python 3, which were typically failing with something like "TypeError: unhashable type: 'Person'".  I'd thought that it might only be possible to do this by converting everything to Storm, but recently worked out how to fix it directly.  Having worked this out, it may also be possible to add similar __eq__ and __hash__ methods to StormBase in a later branch.
-- 
Your team Launchpad code reviewers is requested to review the proposed merge of ~cjwatson/launchpad:py3-sqlbase-hash into launchpad:master.
diff --git a/lib/lp/services/database/sqlbase.py b/lib/lp/services/database/sqlbase.py
index 57fd27c..3942ba7 100644
--- a/lib/lp/services/database/sqlbase.py
+++ b/lib/lp/services/database/sqlbase.py
@@ -233,27 +233,51 @@ class SQLBase(storm.sqlobject.SQLObjectBase):
     def __eq__(self, other):
         """Equality operator.
 
-        Objects compare equal if:
-            - They are the same instance, or
-            - They have the same class and id, and the id is not None.
-
-        These rules allows objects retrieved from different stores to
-        compare equal. The 'is' comparison is to support newly created
-        objects that don't yet have an id (and by definition only exist
-        in the Master store).
+        Objects compare equal if they have the same class and id, and the id
+        is not None.
+
+        This rule allows objects retrieved from different stores to compare
+        equal.  Newly-created objects may not yet have an id; in such cases
+        we flush the store so that we can find out their id.
         """
         naked_self = removeSecurityProxy(self)
         naked_other = removeSecurityProxy(other)
-        return (
-            (naked_self is naked_other)
-            or (naked_self.__class__ == naked_other.__class__
-                and naked_self.id is not None
-                and naked_self.id == naked_other.id))
+        if naked_self.__class__ != naked_other.__class__:
+            return False
+        try:
+            self_id = naked_self.id
+        except KeyError:
+            self.syncUpdate()
+            self_id = naked_self.id
+        if self_id is None:
+            return False
+        try:
+            other_id = naked_other.id
+        except KeyError:
+            other.syncUpdate()
+            other_id = naked_other.id
+        return self_id == other_id
 
     def __ne__(self, other):
         """Inverse of __eq__."""
         return not (self == other)
 
+    def __hash__(self):
+        """Hash operator.
+
+        We must define __hash__ since we define __eq__ (Python 3 requires
+        this), but we need to take care to preserve the invariant that
+        objects that compare equal have the same hash value.  Newly-created
+        objects may not yet have an id; in such cases we flush the store so
+        that we can find out their id.
+        """
+        try:
+            id = self.id
+        except KeyError:
+            self.syncUpdate()
+            id = self.id
+        return hash((self.__class__, id))
+
     def __storm_invalidated__(self):
         """Flush cached properties."""
         # XXX: RobertCollins 2010-08-16 bug=622648: Note this is not directly
diff --git a/lib/lp/services/database/tests/test_bulk.py b/lib/lp/services/database/tests/test_bulk.py
index 76b0471..1c93654 100644
--- a/lib/lp/services/database/tests/test_bulk.py
+++ b/lib/lp/services/database/tests/test_bulk.py
@@ -135,19 +135,23 @@ class TestLoaders(TestCaseWithFactory):
         # Commit so the database object is available in both master
         # and slave stores.
         transaction.commit()
-        db_objects = set(
-            (IMasterStore(db_object).get(db_object_type, db_object.id),
-             ISlaveStore(db_object).get(db_object_type, db_object.id)))
+        # Use a list, since objects corresponding to the same DB row from
+        # different stores compare equal.
+        db_objects = [
+            IMasterStore(db_object).get(db_object_type, db_object.id),
+            ISlaveStore(db_object).get(db_object_type, db_object.id),
+            ]
+        db_object_ids = {id(obj) for obj in db_objects}
         db_queries = list(bulk.gen_reload_queries(db_objects))
         self.assertEqual(2, len(db_queries))
-        db_objects_loaded = set()
+        db_object_ids_loaded = set()
         for db_query in db_queries:
-            objects = set(db_query)
+            object_ids = {id(obj) for obj in db_query}
             # None of these objects should have been loaded before.
             self.assertEqual(
-                set(), objects.intersection(db_objects_loaded))
-            db_objects_loaded.update(objects)
-        self.assertEqual(db_objects, db_objects_loaded)
+                set(), object_ids.intersection(db_object_ids_loaded))
+            db_object_ids_loaded.update(object_ids)
+        self.assertEqual(db_object_ids, db_object_ids_loaded)
 
     def test_gen_reload_queries_with_non_Storm_objects(self):
         # gen_reload_queries() does not like non-Storm objects.