
mahara-contributors team mailing list archive

[Bug 1635441] Re: Profile artefacts sometimes out of sync with "usr" table fields

Another possibility, as a kind of bandage over the problem, would be to
set up a cron job that periodically finds and fixes out-of-sync records.
But we'd have to decide which is canonical: the usr table, or the
artefact table?
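
As a rough illustration of what such a cron task might do, here is a minimal sketch in Python with sqlite3, not Mahara code. The trimmed-down schema, the assumption that a profile artefact keeps its value in artefact.title, and the helper names (make_demo_db, find_out_of_sync, fix_from_usr) are all hypothetical. It treats the usr table as canonical purely for the sake of the example, and it ignores email, which works differently.

```python
import sqlite3

# The four cached profile fields named in the bug description.
CACHED_FIELDS = ("firstname", "lastname", "preferredname", "studentid")


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE usr (
            id INTEGER PRIMARY KEY,
            firstname TEXT, lastname TEXT, preferredname TEXT, studentid TEXT
        );
        CREATE TABLE artefact (
            id INTEGER PRIMARY KEY,
            owner INTEGER, artefacttype TEXT, title TEXT
        );
        INSERT INTO usr VALUES (1, 'Ada', 'Lovelace', NULL, NULL);
        INSERT INTO artefact (owner, artefacttype, title)
        VALUES (1, 'firstname', 'Ada'), (1, 'lastname', 'Byron');
        """
    )
    return conn


def find_out_of_sync(conn):
    """Return (user id, artefact id, field, usr value, artefact value) per mismatch."""
    mismatches = []
    for field in CACHED_FIELDS:
        # `field` comes from the fixed tuple above, so interpolating it is safe.
        rows = conn.execute(
            f"""
            SELECT u.id, a.id, u.{field}, a.title
              FROM usr u
              LEFT JOIN artefact a
                ON a.owner = u.id AND a.artefacttype = ?
             WHERE COALESCE(u.{field}, '') <> COALESCE(a.title, '')
            """,
            (field,),
        ).fetchall()
        for user_id, artefact_id, usr_value, artefact_value in rows:
            mismatches.append((user_id, artefact_id, field, usr_value, artefact_value))
    return mismatches


def fix_from_usr(conn, mismatches):
    """Repair mismatches, assuming the usr table is the canonical copy."""
    for user_id, artefact_id, field, usr_value, _ in mismatches:
        if artefact_id is None:
            # No artefact exists yet for this field, so create one.
            conn.execute(
                "INSERT INTO artefact (owner, artefacttype, title) VALUES (?, ?, ?)",
                (user_id, field, usr_value or ""),
            )
        else:
            conn.execute(
                "UPDATE artefact SET title = ? WHERE id = ?",
                (usr_value or "", artefact_id),
            )
    conn.commit()
```

Calling find_out_of_sync() on its own would also work as a read-only report, which might be a safer first step until we've agreed on which table is canonical.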

** Changed in: mahara
       Status: New => Confirmed

** Description changed:

  There are five profile fields, that are replicated into columns in the
  usr table, and artefacts in the artefact table. On the PHP side of
  things four of them use the "ArtefactTypeCachedProfileField" class:
  
  1. firstname
  2. lastname
  3. preferredname
  4. studentid
  
  The fifth one is the "email" field, which is a little bit different
  because a user can have multiple email addresses. Their "primary" email
- address (the one we actually send emails to) is stored in the usr.email
- field. We also have an additional table,
+ address (the one we actually send emails to) is replicated in the
+ usr.email field. We also have an additional table,
  "artefact_internal_profile_email", which has additional data about each
  email address.
  
  The problem is, in existing databases some of this data is out of sync.
  Sometimes we find bugs that have caused this (for instance Bug 1630753
  and Bug 1630764). Other times, we can't replicate the issue at all.
  
  We could resolve this issue by normalizing the data. That is, keep it in
  only *one* place in the database, either the usr table or the artefact
  table. It would probably be easier to "de-artefact" them. These fields
  are all frequently accessed by, for instance, the display_name()
  function. Removing them from the "usr" table would require us to run
  database joins against the artefact table, which is often huge. In
  comparison, removing them as artefacts would just complicate the
  Elasticsearch plugin and the User Profile block a little bit.

** Description changed:

  There are five profile fields, that are replicated into columns in the
  usr table, and artefacts in the artefact table. On the PHP side of
  things four of them use the "ArtefactTypeCachedProfileField" class:
  
  1. firstname
  2. lastname
  3. preferredname
  4. studentid
  
  The fifth one is the "email" field, which is a little bit different
- because a user can have multiple email addresses. Their "primary" email
- address (the one we actually send emails to) is replicated in the
- usr.email field. We also have an additional table,
- "artefact_internal_profile_email", which has additional data about each
- email address.
+ because a user can have multiple email addresses. Each of these is
+ stored as an artefact, and their "primary" email address (the one we
+ actually send emails to) is also replicated in the usr.email field. We
+ also have an additional table, "artefact_internal_profile_email", which
+ has additional data about each email address.
  
  The problem is, in existing databases some of this data is out of sync.
  Sometimes we find bugs that have caused this (for instance Bug 1630753
  and Bug 1630764). Other times, we can't replicate the issue at all.
  
  We could resolve this issue by normalizing the data. That is, keep it in
  only *one* place in the database, either the usr table or the artefact
  table. It would probably be easier to "de-artefact" them. These fields
  are all frequently accessed by, for instance, the display_name()
  function. Removing them from the "usr" table would require us to run
  database joins against the artefact table, which is often huge. In
  comparison, removing them as artefacts would just complicate the
  Elasticsearch plugin and the User Profile block a little bit.

** Description changed:

  There are five profile fields, that are replicated into columns in the
  usr table, and artefacts in the artefact table. On the PHP side of
  things four of them use the "ArtefactTypeCachedProfileField" class:
  
  1. firstname
  2. lastname
  3. preferredname
  4. studentid
  
  The fifth one is the "email" field, which is a little bit different
  because a user can have multiple email addresses. Each of these is
  stored as an artefact, and their "primary" email address (the one we
  actually send emails to) is also replicated in the usr.email field. We
  also have an additional table, "artefact_internal_profile_email", which
  has additional data about each email address.
  
  The problem is, in existing databases some of this data is out of sync.
  Sometimes we find bugs that have caused this (for instance Bug 1630753
  and Bug 1630764). Other times, we can't replicate the issue at all.
  
  We could resolve this issue by normalizing the data. That is, keep it in
  only *one* place in the database, either the usr table or the artefact
- table. It would probably be easier to "de-artefact" them. These fields
- are all frequently accessed by, for instance, the display_name()
- function. Removing them from the "usr" table would require us to run
- database joins against the artefact table, which is often huge. In
- comparison, removing them as artefacts would just complicate the
- Elasticsearch plugin and the User Profile block a little bit.
+ table. For performance reasons, it would probably be better to "de-
+ artefact" them, because these fields are all frequently accessed by, for
+ instance, the display_name() function. Removing them from the "usr"
+ table would require us to run database joins against the artefact table,
+ which is often huge and could slow things down. In comparison, removing
+ them as artefacts would just complicate the Elasticsearch plugin and the
+ User Profile block a little bit.

-- 
You received this bug notification because you are a member of Mahara
Contributors, which is subscribed to Mahara.
Matching subscriptions: Subscription for all Mahara Contributors -- please ask on #mahara-dev or mahara.org forum before editing or unsubscribing it!
https://bugs.launchpad.net/bugs/1635441

Title:
  Profile artefacts sometimes out of sync with "usr" table fields

Status in Mahara:
  Confirmed

Bug description:
  There are five profile fields that are replicated both into columns in
  the usr table and into artefacts in the artefact table. On the PHP side
  of things, four of them use the "ArtefactTypeCachedProfileField" class:

  1. firstname
  2. lastname
  3. preferredname
  4. studentid

  The fifth one is the "email" field, which is a little bit different
  because a user can have multiple email addresses. Each address is
  stored as an artefact, and the user's "primary" email address (the one
  we actually send emails to) is also replicated in the usr.email field.
  We also have an additional table, "artefact_internal_profile_email",
  which holds additional data about each email address.

  The problem is, in existing databases some of this data is out of
  sync. Sometimes we find bugs that have caused this (for instance Bug
  1630753 and Bug 1630764). Other times, we can't replicate the issue at
  all.

  We could resolve this issue by normalizing the data. That is, keep it
  in only *one* place in the database, either the usr table or the
  artefact table. For performance reasons, it would probably be better
  to "de-artefact" them, because these fields are all frequently
  accessed by, for instance, the display_name() function. Removing them
  from the "usr" table would require us to run database joins against
  the artefact table, which is often huge and could slow things down. In
  comparison, removing them as artefacts would just complicate the
  Elasticsearch plugin and the User Profile block a little bit.
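
To make the performance argument concrete, here is a minimal sketch in Python with sqlite3 comparing the two lookup shapes. It is not Mahara code: the simplified schema, the assumption that a profile artefact stores its value in artefact.title, and the function names are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE usr (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
        CREATE TABLE artefact (
            id INTEGER PRIMARY KEY,
            owner INTEGER, artefacttype TEXT, title TEXT
        );
        INSERT INTO usr VALUES (1, 'Ada', 'Lovelace');
        INSERT INTO artefact (owner, artefacttype, title)
        VALUES (1, 'firstname', 'Ada'), (1, 'lastname', 'Lovelace');
        """
    )
    return conn


def names_from_usr(conn, user_id):
    """Fields kept on usr: a single primary-key lookup, no join."""
    return conn.execute(
        "SELECT firstname, lastname FROM usr WHERE id = ?", (user_id,)
    ).fetchone()


def names_from_artefacts(conn, user_id):
    """Fields kept only as artefacts: one join against artefact per field."""
    return conn.execute(
        """
        SELECT f.title, l.title
          FROM usr u
          LEFT JOIN artefact f
            ON f.owner = u.id AND f.artefacttype = 'firstname'
          LEFT JOIN artefact l
            ON l.owner = u.id AND l.artefacttype = 'lastname'
         WHERE u.id = ?
        """,
        (user_id,),
    ).fetchone()
```

Both functions return the same pair of names for the demo user, but the second one touches the artefact table once per field, which is the cost that something called as often as display_name() would have to pay.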


