
yahoo-eng-team team mailing list archive

[Bug 1824445] Re: nova-manage cellv2 discover_hosts traces when run in parallel

 

Reviewed:  https://review.opendev.org/651947
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5c544c7e2a7e266d69a9d0f0bf3ee8a0c636202b
Submitter: Zuul
Branch:    master

commit 5c544c7e2a7e266d69a9d0f0bf3ee8a0c636202b
Author: melanie witt <melwittt@xxxxxxxxx>
Date:   Fri Apr 12 00:32:01 2019 +0000

    Warn for duplicate host mappings during discover_hosts
    
    When the 'nova-manage cell_v2 discover_hosts' command is run in
    parallel during a deployment, the runs attempt to map the same compute
    or service hosts simultaneously, resulting in tracebacks:
    
      "DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate
      entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\")
      [SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id,
      host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s,
      %(host)s)'] [parameters: {'host': u'compute-0.localdomain',
      'cell_id': 5, 'created_at': datetime.datetime(2019, 4, 10, 15, 20,
      50, 527925), 'updated_at': None}]
    
    This adds more information to the command help and adds a warning
    message when duplicate host mappings are detected with guidance about
    how to run the command. The command will return 2 if a duplicate host
    mapping is encountered and the documentation is updated to explain
    this.
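
For deployment tooling that wraps the command, the return code described above can be checked explicitly. The following is a minimal sketch, not part of the nova change: the function name, the result labels, and the invocation `nova-manage cell_v2 discover_hosts` without further options are assumptions for illustration. Only the meaning of return code 2 comes from the commit message.

```python
import subprocess
import sys

# Exit status the commit message says is returned on a duplicate host mapping.
DUPLICATE_HOST_MAPPING = 2

# Assumed invocation; adjust to however the deployment calls nova-manage.
DISCOVER_CMD = ["nova-manage", "cell_v2", "discover_hosts"]


def run_discover_hosts(cmd=DISCOVER_CMD):
    """Run the discovery command and classify its exit status.

    Returns "ok", "duplicate", or "failed" (hypothetical labels).
    """
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    if result.returncode == 0:
        return "ok"
    if result.returncode == DUPLICATE_HOST_MAPPING:
        # Another invocation most likely mapped the same host concurrently.
        # Do not retry in a loop; let the other run finish instead.
        return "duplicate"
    return "failed"


if __name__ == "__main__":
    outcome = run_discover_hosts()
    print("discover_hosts outcome: %s" % outcome)
    sys.exit(0 if outcome == "ok" else 1)
```

Treating "duplicate" as a signal to serialize the runs, rather than as something to retry, matches the guidance in the warning message.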
    
    This also adds a warning to the scheduler periodic task to recommend
    enabling the periodic task on only one scheduler to prevent collisions.
    
    We choose to warn and stop instead of ignoring DBDuplicateEntry because
    there could potentially be a large number of parallel tasks competing
    to insert duplicate records where only one can succeed. If we ignore
    and continue to the next record, the large number of tasks will
    repeatedly collide in a tight loop until all get through the entire
    list of compute hosts that are being mapped. So we instead stop the
    colliding task and emit a message.
    
    Closes-Bug: #1824445
    
    Change-Id: Ia7718ce099294e94309103feb9cc2397ff8f5188


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1824445

Title:
  nova-manage cellv2 discover_hosts traces when run in parallel

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Saw this issue downstream [1] and found a couple of similar issues
  [2][3] where deployments were running the 'nova-manage cell_v2
  discover_hosts' command in parallel and experiencing tracebacks like
  this:

  "DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate
  entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\")
  [SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id,
  host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s, %(host)s)']
  [parameters: {'host': u'compute-0.localdomain', 'cell_id': 5,
  'created_at': datetime.datetime(2019, 4, 10, 15, 20, 50, 527925),
  'updated_at': None}] (Background on this error at:
  http://sqlalche.me/e/gkpj)",

  After some discussion on IRC today [4], we concluded it would be best
  to address the situation with improved command help and warnings when
  duplicate host mappings are encountered.

  While we could try-except to ignore DBDuplicateEntry, this is not a
  situation we want to hide from users as it means they are likely
  hammering their database with parallel updates that are mostly not
  going to succeed. So we should stop and warn instead.
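
The stop-and-warn behaviour can be illustrated with a small standalone sketch. This is not nova's code: it uses the standard library sqlite3 module and sqlite3.IntegrityError as a stand-in for the MySQL backend and DBDuplicateEntry, and the names map_hosts and make_db are hypothetical. The table and constraint names mirror the traceback, and the return value 2 is the one described in the fix.

```python
import sqlite3

DUPLICATE_EXIT_CODE = 2  # value described in the fix for this bug


def make_db():
    """Create an in-memory table with a unique constraint on host."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE host_mappings ("
        "id INTEGER PRIMARY KEY, "
        "cell_id INTEGER NOT NULL, "
        "host TEXT NOT NULL, "
        "CONSTRAINT uniq_host_mappings0host UNIQUE (host))"
    )
    return conn


def map_hosts(conn, hosts, cell_id):
    """Insert host mappings, stopping at the first duplicate.

    Returns 0 if every host was mapped, or 2 after printing a warning
    if a host was already mapped (for example by a concurrent run).
    """
    for host in hosts:
        try:
            with conn:  # commit on success, roll back on error
                conn.execute(
                    "INSERT INTO host_mappings (cell_id, host) "
                    "VALUES (?, ?)",
                    (cell_id, host),
                )
        except sqlite3.IntegrityError:
            # Stop instead of continuing: a competing run is probably
            # working through the same list of hosts.
            print("WARNING: host %r is already mapped; stopping. "
                  "Run host discovery from a single place." % host)
            return DUPLICATE_EXIT_CODE
    return 0


if __name__ == "__main__":
    conn = make_db()
    print(map_hosts(conn, ["compute-0.localdomain"], 5))
    print(map_hosts(conn, ["compute-0.localdomain",
                           "compute-1.localdomain"], 5))
```

In this sketch the second call stops at the first collision and never attempts compute-1.localdomain, which is the trade-off argued for above: the colliding run gives up early rather than racing the other run through the rest of the list.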

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1698630
  [2] https://bugs.launchpad.net/openstack-ansible/+bug/1752540
  [3] https://github.com/bloomberg/chef-bcpc/issues/1378
  [4] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-04-11.log.html#t2019-04-11T23:04:47

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1824445/+subscriptions

