
yahoo-eng-team team mailing list archive

[Bug 1784074] [NEW] Instances end up with no cell assigned in instance_mappings

 

Public bug reported:

There have been situations where, due to an unrelated issue such as an
RPC or DB problem, the nova_api instance_mappings table ends up with
instances whose cell_id is NULL. This causes annoying and weird
behaviour, such as undeletable instances.

This seems to happen only while those external infrastructure
components are having issues.  I have come up with the following script,
which loops over all unmapped instances, checks which cell database
contains each one, and prints a MySQL query to run to fix the mapping.

This would be nice to have as a nova-manage cell_v2 command to help any
other users who run into this.  Unfortunately I'm a bit short on time,
so I haven't been able to nova-ify it, but here it is:

========================================================================
#!/usr/bin/env python3
"""Print UPDATE statements for instance_mappings rows with a NULL cell_id."""

from urllib.parse import unquote, urlparse

import pymysql


# Connect to the API database (replace the placeholder host and password)
api_conn = pymysql.connect(host='xxxx', port=3306, user='nova_api', passwd='xxxxxxx', db='nova_api')
api_cur = api_conn.cursor()


def _get_conn(db):
    """Return a cursor on a cell database, given its database_connection URL."""
    parsed_url = urlparse(db)
    conn = pymysql.connect(host=parsed_url.hostname,
                           port=parsed_url.port or 3306,  # honour a non-default port
                           user=parsed_url.username,
                           passwd=unquote(parsed_url.password or ''),
                           db=parsed_url.path[1:])  # strip the leading '/'
    return conn.cursor()


# Get list of all cells. instance_mappings.cell_id references cell_mappings.id
# (the integer primary key), not the cell UUID, so fetch the id as well.
api_cur.execute("SELECT id, uuid, name, database_connection FROM cell_mappings")
CELLS = [{'id': cell_id, 'uuid': uuid, 'name': name, 'db': _get_conn(db)}
         for cell_id, uuid, name, db in api_cur.fetchall()]

# Get list of all unmapped instances
api_cur.execute("SELECT instance_uuid FROM instance_mappings WHERE cell_id IS NULL")
print("Number of unmapped instances: %s" % api_cur.rowcount)

# Go over all unmapped instances
for (instance_uuid,) in api_cur.fetchall():
    instance_cell = None

    # Check which cell contains this instance
    for cell in CELLS:
        cell['db'].execute("SELECT id FROM instances WHERE uuid = %s", (instance_uuid,))

        if cell['db'].rowcount != 0:
            instance_cell = cell
            break

    # Print the statement that maps the instance to the correct cell
    if instance_cell:
        print("UPDATE instance_mappings SET cell_id = %d WHERE instance_uuid = '%s';"
              % (instance_cell['id'], instance_uuid))
        continue

    # If we reach this point, it's not in any cell?!
    print("%s: not found in any cell" % instance_uuid)
========================================================================
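
The two pieces of logic in the script (splitting a cell's
database_connection URL, and deciding which cell each unmapped instance
belongs to) can be tried without a live database. The following is only
a sketch in Python 3: parse_cell_db_url, plan_fixes and the
instance_in_cell callback are hypothetical names, not part of nova, and
the sketch assumes that instance_mappings.cell_id holds the integer
cell_mappings.id rather than the cell UUID.

```python
from urllib.parse import unquote, urlparse


def parse_cell_db_url(url):
    """Split a SQLAlchemy-style URL into keyword arguments for pymysql.connect()."""
    parsed = urlparse(url)
    return {
        'host': parsed.hostname,
        'port': parsed.port or 3306,           # fall back to the MySQL default
        'user': unquote(parsed.username or ''),
        'passwd': unquote(parsed.password or ''),
        'db': parsed.path.lstrip('/'),         # the query string is not part of path
    }


def plan_fixes(unmapped_uuids, cells, instance_in_cell):
    """Return (fixes, missing) for a list of unmapped instance UUIDs.

    cells is a list of dicts with at least an 'id' key, and
    instance_in_cell(cell, uuid) is a hypothetical callback that reports
    whether that cell's database has a row for the instance.
    fixes is a list of (cell_id, instance_uuid) pairs; missing lists the
    UUIDs that were not found in any cell.
    """
    fixes, missing = [], []
    for uuid in unmapped_uuids:
        # Take the first cell that claims the instance, if any
        cell = next((c for c in cells if instance_in_cell(c, uuid)), None)
        if cell is None:
            missing.append(uuid)
        else:
            fixes.append((cell['id'], uuid))
    return fixes, missing
```

In real use the callback would run the "SELECT id FROM instances WHERE
uuid = %s" query against the cell's cursor, and each (cell_id,
instance_uuid) pair would become one UPDATE on instance_mappings.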

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1784074

Title:
  Instances end up with no cell assigned in instance_mappings

Status in OpenStack Compute (nova):
  New



