← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~wgrant/launchpad/bug-1075767 into lp:launchpad

 

William Grant has proposed merging lp:~wgrant/launchpad/bug-1075767 into lp:launchpad.

Commit message:
Optimise PersonSet person visibility checks, and simplify person vocab queries.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #1075767 in Launchpad itself: "timeout PersonSet:CollectionResource:#people:findTeam"
  https://bugs.launchpad.net/launchpad/+bug/1075767

For more details, see:
https://code.launchpad.net/~wgrant/launchpad/bug-1075767/+merge/136572

This branch fixes and cleans various bits and pieces around person search, particularly with private teams.

I went in to fix bug #1075767 by changing PersonSet's search access checks to use Person.id IN ([... subquery for user's participated teams]), which is hashable and much faster than the existing TeamParticipation join. This was a resounding success. But then I opted to see if the benefits would extend to ValidPersonOrTeamVocabulary, and so the trouble started.

ValidPersonOrTeamVocabulary and its derivatives used an alternate technique to avoid the same problem: find public and private objects in separate queries, and union them together. This ends up pretty hideous, and slightly slower than the much cleaner and shorter technique I used in PersonSet. So I went about reworking the vocabs to use the new method, and even share the privacy filter code with PersonSet.

In doing this I discovered that PersonSet searches didn't respect the admins-and-commercial-admins-take-all rule, so I implemented that in what is now the single implementation of person privacy querying. (Commercial) admins should now be able to find private teams on /people even if they're not a member.

With the vocabs' hideousness reduced, the branch ended up around -150.
-- 
https://code.launchpad.net/~wgrant/launchpad/bug-1075767/+merge/136572
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wgrant/launchpad/bug-1075767 into lp:launchpad.
=== modified file 'lib/lp/registry/model/person.py'
--- lib/lp/registry/model/person.py	2012-11-26 08:33:03 +0000
+++ lib/lp/registry/model/person.py	2012-11-28 06:06:23 +0000
@@ -6,6 +6,7 @@
 __metaclass__ = type
 __all__ = [
     'AlreadyConvertedException',
+    'get_person_visibility_terms',
     'get_recipients',
     'generate_nick',
     'IrcID',
@@ -394,6 +395,32 @@
     return value
 
 
+def get_person_visibility_terms(user):
+    """Generate the query needed for person privacy filtering."""
+    public_filter = (Person.visibility == PersonVisibility.PUBLIC)
+
+    # Anonymous users can only see public people.
+    if user is None:
+        return public_filter
+
+    # Admins and commercial admins can see everyone.
+    roles = IPersonRoles(user)
+    if roles.in_admin or roles.in_commercial_admin:
+        return True
+
+    # Otherwise only public people and private teams of which the user
+    # is a member are visible.
+    return Or(
+        public_filter,
+        And(
+            Person.id.is_in(
+                Select(
+                    TeamParticipation.teamID, tables=[TeamParticipation],
+                    where=(TeamParticipation.person == user))),
+            Person.teamowner != None,
+            Person.visibility != PersonVisibility.PUBLIC))
+
+
 _person_sort_re = re.compile("(?:[^\w\s]|[\d_])", re.U)
 
 
@@ -3535,42 +3562,11 @@
         """See `IPersonSet`."""
         return getUtility(ILaunchpadStatisticSet).value('teams_count')
 
-    def _teamPrivacyQuery(self):
-        """Generate the query needed for privacy filtering.
-
-        If the visibility is not PUBLIC ensure the logged in user is a member
-        of the team.
-        """
-        logged_in_user = getUtility(ILaunchBag).user
-        if logged_in_user is not None:
-            private_query = SQL("""
-                TeamParticipation.person = ?
-                AND Person.teamowner IS NOT NULL
-                AND Person.visibility != ?
-                """, (logged_in_user.id, PersonVisibility.PUBLIC.value))
-        else:
-            private_query = None
-
-        base_query = SQL("Person.visibility = ?",
-                         (PersonVisibility.PUBLIC.value, ),
-                         tables=['Person'])
-
-        if private_query is None:
-            query = base_query
-        else:
-            query = Or(base_query, private_query)
-
-        return query
-
     def _teamEmailQuery(self, text):
         """Product the query for team email addresses."""
-        privacy_query = self._teamPrivacyQuery()
-        # XXX: BradCrittenden 2009-06-08 bug=244768:  Use Not(Bar.foo == None)
-        # instead of Bar.foo != None.
         team_email_query = And(
-            privacy_query,
-            TeamParticipation.team == Person.id,
-            Not(Person.teamowner == None),
+            get_person_visibility_terms(getUtility(ILaunchBag).user),
+            Person.teamowner != None,
             Person.merged == None,
             EmailAddress.person == Person.id,
             EmailAddress.email.lower().startswith(ensure_unicode(text)))
@@ -3578,13 +3574,9 @@
 
     def _teamNameQuery(self, text):
         """Produce the query for team names."""
-        privacy_query = self._teamPrivacyQuery()
-        # XXX: BradCrittenden 2009-06-08 bug=244768:  Use Not(Bar.foo == None)
-        # instead of Bar.foo != None.
         team_name_query = And(
-            privacy_query,
-            TeamParticipation.team == Person.id,
-            Not(Person.teamowner == None),
+            get_person_visibility_terms(getUtility(ILaunchBag).user),
+            Person.teamowner != None,
             Person.merged == None,
             SQL("Person.fti @@ ftq(?)", (text, )))
         return team_name_query

=== modified file 'lib/lp/registry/vocabularies.py'
--- lib/lp/registry/vocabularies.py	2012-11-24 17:49:30 +0000
+++ lib/lp/registry/vocabularies.py	2012-11-28 06:06:23 +0000
@@ -133,7 +133,6 @@
     )
 from lp.registry.interfaces.productseries import IProductSeries
 from lp.registry.interfaces.projectgroup import IProjectGroup
-from lp.registry.interfaces.role import IPersonRoles
 from lp.registry.interfaces.sourcepackage import ISourcePackage
 from lp.registry.model.distribution import Distribution
 from lp.registry.model.distroseries import DistroSeries
@@ -144,6 +143,7 @@
 from lp.registry.model.mailinglist import MailingList
 from lp.registry.model.milestone import Milestone
 from lp.registry.model.person import (
+    get_person_visibility_terms,
     IrcID,
     Person,
     )
@@ -567,37 +567,8 @@
             return None
         return '%s = %d' % (karma_context_column, context.id)
 
-    def _privateTeamQueryAndTables(self):
-        """Return query tables for private teams.
-
-        The teams are based on membership by the user.
-        Returns a tuple of (query, tables).
-        """
-        tables = []
-        logged_in_user = getUtility(ILaunchBag).user
-        if logged_in_user is not None:
-            roles = IPersonRoles(logged_in_user)
-            if roles.in_admin or roles.in_commercial_admin:
-                # If the user is a LP admin or commercial admin we allow
-                # all private teams to be visible.
-                private_query = AND(
-                    Not(Person.teamowner == None),
-                    Person.visibility == PersonVisibility.PRIVATE)
-            else:
-                private_query = AND(
-                    TeamParticipation.person == logged_in_user.id,
-                    Not(Person.teamowner == None),
-                    Person.visibility == PersonVisibility.PRIVATE)
-                tables = [Join(TeamParticipation,
-                               TeamParticipation.teamID == Person.id)]
-        else:
-            private_query = False
-        return (private_query, tables)
-
     def _doSearch(self, text="", vocab_filter=None):
         """Return the people/teams whose fti or email address match :text:"""
-
-        private_query, private_tables = self._privateTeamQueryAndTables()
         extra_clauses = [self.extra_clause]
         if vocab_filter:
             extra_clauses.extend(vocab_filter.filter_terms)
@@ -610,33 +581,17 @@
                 Join(self.cache_table_name,
                      SQL("%s.id = Person.id" % self.cache_table_name)),
                 ]
-            tables.extend(private_tables)
             tables.extend(self.extra_tables)
             result = self.store.using(*tables).find(
                 Person,
-                And(
-                    Or(Person.visibility == PersonVisibility.PUBLIC,
-                       private_query,
-                       ),
-                    Person.merged == None,
-                    *extra_clauses
-                    )
-                )
-            result.config(distinct=True)
+                get_person_visibility_terms(getUtility(ILaunchBag).user),
+                Person.merged == None,
+                *extra_clauses)
             result.order_by(Person.displayname, Person.name)
         else:
             # Do a full search based on the text given.
 
-            # The queries are broken up into several steps for efficiency.
-            # The public person and team searches do not need to join with the
-            # TeamParticipation table, which is very expensive.  The search
-            # for private teams does need that table but the number of private
-            # teams is very small so the cost is not great. However, if the
-            # person is a logged in administrator, we don't need to join to
-            # the TeamParticipation table and can construct a more efficient
-            # query (since in this case we are searching all private teams).
-
-            # Create a query that will match public persons and teams that
+            # Create a query that will match persons and teams that
             # have the search text in the fti, at the start of their email
             # address, as their full IRC nickname, or at the start of their
             # displayname.
@@ -651,7 +606,7 @@
             # This is the SQL that will give us the IDs of the people we want
             # in the result.
             matching_person_sql = SQL("""
-                SELECT id, MAX(rank) AS rank, false as is_private_team
+                SELECT id, MAX(rank) AS rank
                 FROM (
                     SELECT Person.id,
                     (case
@@ -677,84 +632,39 @@
                         AND LOWER(EmailAddress.email) LIKE lower(?) || '%%'
                         AND status IN (?, ?)
                 ) AS person_match
-                GROUP BY id, is_private_team
+                GROUP BY id
             """, (text, text, text, text, text, text, text, text, text,
                   EmailAddressStatus.VALIDATED.value,
                   EmailAddressStatus.PREFERRED.value))
 
-            # Do we need to search for private teams.
-            if private_tables:
-                private_tables = [Person] + private_tables
-                private_ranking_sql = SQL("""
-                    (case
-                        when person.name=lower(?) then 100
-                        when person.name like lower(?) || '%%' then 0.6
-                        when lower(person.displayname) like lower(?)
-                            || '%%' then 0.5
-                        else rank(fti, ftq(?))
-                    end) as rank
-                """, (text, text, text, text))
-
-                # Searching for private teams that match can be easier since
-                # we are only interested in teams.  Teams can have email
-                # addresses but we're electing to ignore them here.
-                private_result_select = Select(
-                    tables=private_tables,
-                    columns=(Person.id, private_ranking_sql,
-                                SQL("true as is_private_team")),
-                    where=And(
-                        SQL("""
-                            Person.name LIKE lower(?) || '%%'
-                            OR lower(Person.displayname) LIKE lower(?) || '%%'
-                            OR Person.fti @@ ftq(?)
-                            """, [text, text, text]),
-                        private_query))
-                matching_person_sql = Union(matching_person_sql,
-                          private_result_select, all=True)
-
-            # The tables for public persons and teams that match the text.
-            public_tables = [
+            # The tables for persons and teams that match the text.
+            tables = [
                 SQL("MatchingPerson"),
                 Person,
                 LeftJoin(EmailAddress, EmailAddress.person == Person.id),
                 LeftJoin(Account, Account.id == Person.accountID),
                 ]
-            public_tables.extend(self.extra_tables)
-
-            # If private_tables is empty, we are searching for all private
-            # teams. We can simply append the private query component to the
-            # public query. Otherwise, for efficiency as stated earlier, we
-            # need to do a separate query to join to the TeamParticipation
-            # table.
-            private_teams_query = private_query
-            if private_tables:
-                private_teams_query = SQL("is_private_team")
-
+            tables.extend(self.extra_tables)
+
+            # Find all the matching unmerged persons and teams. Persons
+            # must additionally have an active account and a preferred
+            # email address.
             # We just select the required ids since we will use
             # IPersonSet.getPrecachedPersonsFromIDs to load the results
             matching_with = With("MatchingPerson", matching_person_sql)
             result = self.store.with_(
-                matching_with).using(*public_tables).find(
+                matching_with).using(*tables).find(
                 Person,
-                And(
-                    SQL("Person.id = MatchingPerson.id"),
-                    Or(
+                SQL("Person.id = MatchingPerson.id"),
+                Person.merged == None,
+                Or(
+                    And(
                         Account.status == AccountStatus.ACTIVE,
-                        Person.teamowner != None),
-                    Or(
-                        And(  # A public person or team
-                            Person.visibility == PersonVisibility.PUBLIC,
-                            Person.merged == None,
-                            Or(  # A valid person-or-team is either a team...
-                                # Note: 'Not' due to Bug 244768.
-                                Not(Person.teamowner == None),
-                                # Or a person who has preferred email address.
-                                EmailAddress.status ==
-                                    EmailAddressStatus.PREFERRED)),
-                        # Or a private team
-                        private_teams_query),
-                    *extra_clauses),
-                )
+                        EmailAddress.status ==
+                            EmailAddressStatus.PREFERRED),
+                    Person.teamowner != None),
+                get_person_visibility_terms(getUtility(ILaunchBag).user),
+                *extra_clauses)
             # Better ranked matches go first.
             if self._karma_context_constraint:
                 rank_order = SQL("""
@@ -772,13 +682,10 @@
         # We will be displaying the person's irc nick(s) and emails in the
         # description so we need to bulk load them for performance, otherwise
         # we get one query per person per attribute.
-        def pre_iter_hook(rows):
-            persons = set(obj for obj in rows)
-            # The emails.
-            emails = bulk.load_referencing(
-                EmailAddress, persons, ['personID'])
-            email_by_person = dict((email.personID, email)
-                for email in emails
+        def pre_iter_hook(persons):
+            emails = bulk.load_referencing(EmailAddress, persons, ['personID'])
+            email_by_person = dict(
+                (email.personID, email) for email in emails
                 if email.status == EmailAddressStatus.PREFERRED)
 
             for person in persons:
@@ -786,9 +693,7 @@
                 cache.preferredemail = email_by_person.get(person.id, None)
                 cache.ircnicknames = []
 
-            # The irc nicks.
-            nicks = bulk.load_referencing(IrcID, persons, ['personID'])
-            for nick in nicks:
+            for nick in bulk.load_referencing(IrcID, persons, ['personID']):
                 get_property_cache(nick.person).ircnicknames.append(nick)
 
         return DecoratedResultSet(result, pre_iter_hook=pre_iter_hook)
@@ -822,60 +727,10 @@
     # Because the base class does almost everything we need, we just need to
     # restrict the search results to those Persons who have a non-NULL
     # teamowner, i.e. a valid team.
-    extra_clause = Not(Person.teamowner == None)
+    extra_clause = Person.teamowner != None
     # Search with empty string returns all teams.
     allow_null_search = True
 
-    def _doSearch(self, text="", vocab_filter=None):
-        """Return the teams whose fti, IRC, or email address match :text:"""
-
-        private_query, private_tables = self._privateTeamQueryAndTables()
-        base_query = And(
-            Or(
-                Person.visibility == PersonVisibility.PUBLIC,
-                private_query,
-                ),
-            Person.merged == None
-            )
-
-        tables = [Person] + private_tables + list(self.extra_tables)
-
-        if not text:
-            query = And(base_query,
-                        Person.merged == None,
-                        self.extra_clause)
-            result = self.store.using(*tables).find(Person, query)
-        else:
-            name_match_query = SQL("""
-                Person.name LIKE lower(?) || '%%'
-                OR lower(Person.displayname) LIKE lower(?) || '%%'
-                OR Person.fti @@ ftq(?)
-                """, [text, text, text]),
-
-            email_storm_query = self.store.find(
-                EmailAddress.personID,
-                EmailAddress.status.is_in(VALID_EMAIL_STATUSES),
-                EmailAddress.email.lower().startswith(text))
-            email_subquery = Alias(email_storm_query._get_select(),
-                                   'EmailAddress')
-            tables += [
-                LeftJoin(email_subquery, EmailAddress.person == Person.id),
-                ]
-
-            result = self.store.using(*tables).find(
-                Person,
-                And(base_query,
-                    self.extra_clause,
-                    Or(name_match_query,
-                       EmailAddress.person != None)))
-
-        # To get the correct results we need to do distinct first, then order
-        # by, then limit.
-        result.config(distinct=True)
-        result.order_by(Person.displayname, Person.name)
-        result.config(limit=self.LIMIT)
-        return result
-
     def supportedFilters(self):
         return []
 
@@ -885,7 +740,7 @@
     displayname = 'Select a Person'
     # The extra_clause for a valid person is that it not be a team, so
     # teamowner IS NULL.
-    extra_clause = 'Person.teamowner IS NULL'
+    extra_clause = Person.teamowner == None
     # Search with empty string returns all valid people.
     allow_null_search = True
     # Cache table to use for checking validity.


Follow ups