yahoo-eng-team team mailing list archive

[Bug 1342257] [NEW] Nova cells can die unexpectedly on boot due to db failure

 

Public bug reported:

We have seen a crash in the cells booting process with the following
traceback:

2014-07-15 01:00:07.688 3070 CRITICAL nova [req-badc12a2-4ad9-4209-bcd4-f2429e134820 None] DBError: (1030, 'Got error 28 from storage engine')
2014-07-15 01:00:07.688 3070 TRACE nova Traceback (most recent call last):
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/current/nova/bin/nova-cells", line 13, in <module>
2014-07-15 01:00:07.688 3070 TRACE nova     sys.exit(main())
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/cmd/cells.py", line 45, in main
2014-07-15 01:00:07.688 3070 TRACE nova     manager=CONF.cells.manager)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/service.py", line 275, in create
2014-07-15 01:00:07.688 3070 TRACE nova     db_allowed=db_allowed)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/service.py", line 148, in __init__
2014-07-15 01:00:07.688 3070 TRACE nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/cells/manager.py", line 90, in __init__
2014-07-15 01:00:07.688 3070 TRACE nova     self.state_manager = cell_state_manager()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/cells/state.py", line 161, in __new__
2014-07-15 01:00:07.688 3070 TRACE nova     return CellStateManagerDB(cell_state_cls)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/cells/state.py", line 174, in __init__
2014-07-15 01:00:07.688 3070 TRACE nova     self._cell_data_sync(force=True)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/openstack/common/lockutils.py", line 325, in inner
2014-07-15 01:00:07.688 3070 TRACE nova     return f(*args, **kwargs)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/cells/state.py", line 436, in _cell_data_sync
2014-07-15 01:00:07.688 3070 TRACE nova     db_cells = self.db.cell_get_all(ctxt)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/api.py", line 1599, in cell_get_all
2014-07-15 01:00:07.688 3070 TRACE nova     return IMPL.cell_get_all(context)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/api.py", line 93, in __getattr__
2014-07-15 01:00:07.688 3070 TRACE nova     return getattr(self._db_api, key)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/api.py", line 85, in _db_api
2014-07-15 01:00:07.688 3070 TRACE nova     backend_mapping=_BACKEND_MAPPING)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/openstack/common/db/api.py", line 128, in __init__
2014-07-15 01:00:07.688 3070 TRACE nova     self._load_backend()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/openstack/common/db/api.py", line 143, in _load_backend
2014-07-15 01:00:07.688 3070 TRACE nova     self._backend = backend_mod.get_backend()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/api.py", line 42, in get_backend
2014-07-15 01:00:07.688 3070 TRACE nova     return API()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/api.py", line 75, in __init__
2014-07-15 01:00:07.688 3070 TRACE nova     self._launch_monitor()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/api.py", line 89, in _launch_monitor
2014-07-15 01:00:07.688 3070 TRACE nova     self._check_schema()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/api.py", line 56, in inner
2014-07-15 01:00:07.688 3070 TRACE nova     result = f(*args, **kwargs)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/api.py", line 81, in _check_schema
2014-07-15 01:00:07.688 3070 TRACE nova     schema = conn.get_schema()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/connection.py", line 186, in get_schema
2014-07-15 01:00:07.688 3070 TRACE nova     tables = self._get_tables()
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/connection.py", line 169, in _get_tables
2014-07-15 01:00:07.688 3070 TRACE nova     columns = self._get_columns(table_name)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/connection.py", line 159, in _get_columns
2014-07-15 01:00:07.688 3070 TRACE nova     cursor = self.execute('DESCRIBE %s' % name)
2014-07-15 01:00:07.688 3070 TRACE nova   File "/opt/rackstack/863.0/nova/lib/python2.6/site-packages/nova/db/mysqldb/connection.py", line 79, in inner
2014-07-15 01:00:07.688 3070 TRACE nova     raise db_exc.DBError(e)
2014-07-15 01:00:07.688 3070 TRACE nova DBError: (1030, 'Got error 28 from storage engine')
2014-07-15 01:00:07.688 3070 TRACE nova
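
For context on the error itself: MySQL error 1030 reports a failure code passed up from the storage engine, and that inner number is commonly an operating-system errno value. The short sketch below decodes the 28 from the log line under that assumption; the message parsing is purely illustrative and is not part of nova. On Linux, errno 28 is ENOSPC (no space left on device), which would point at a full disk or tmpdir on the database host.

```python
import errno
import os

# Fields copied from the log line:
#   DBError: (1030, 'Got error 28 from storage engine')
mysql_code, message = (1030, "Got error 28 from storage engine")

# Assumption: the number embedded in the message is an OS errno value.
# The message has the fixed shape "Got error <N> from storage engine".
engine_errno = int(message.split()[2])

# Look up the symbolic name and the platform's description of that errno.
name = errno.errorcode.get(engine_errno, "unknown")
print("MySQL %d wraps errno %d (%s): %s"
      % (mysql_code, engine_errno, name, os.strerror(engine_errno)))
```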

Since this is a DB issue, it seems the process should at the very least
retry rather than exit.
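
A minimal sketch of what such a retry could look like, assuming a bounded number of attempts with exponential backoff. The names here (`call_with_retry`, the local `DBError` class, and the parameters) are hypothetical stand-ins for illustration; this is not nova's actual code or the fix that was applied.

```python
import time


class DBError(Exception):
    """Local stand-in for the DBError seen in the traceback (hypothetical)."""


def call_with_retry(func, attempts=5, initial_delay=1.0, backoff=2.0,
                    sleep=time.sleep):
    """Call func() and retry on DBError, waiting longer after each failure.

    The last DBError is re-raised once `attempts` calls have failed, so a
    persistent outage still surfaces to the caller instead of looping forever.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except DBError:
            if attempt == attempts:
                raise  # out of retries; let the caller decide what to do
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # grow the wait: 1s, 2s, 4s, ...
```

In this sketch the initial database call made while the cells state manager starts up would be wrapped, for example `call_with_retry(lambda: db.cell_get_all(ctxt))`, where `db` and `ctxt` are whatever the caller already has. Bounding the attempts is a deliberate choice: a transient failure gets a chance to clear, while a condition that does not resolve on its own still ends in a visible error.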

** Affects: nova
     Importance: Medium
     Assignee: Christopher Lefelhocz (christopher-lefelhoc)
         Status: New


** Tags: cells

** Changed in: nova
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1342257

Title:
  Nova cells can die unexpectedly on boot due to db failure

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1342257/+subscriptions

