yahoo-eng-team team mailing list archive
Message #78893
[Bug 1824445] Re: nova-manage cellv2 discover_hosts traces when run in parallel
Reviewed: https://review.opendev.org/651947
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5c544c7e2a7e266d69a9d0f0bf3ee8a0c636202b
Submitter: Zuul
Branch: master
commit 5c544c7e2a7e266d69a9d0f0bf3ee8a0c636202b
Author: melanie witt <melwittt@xxxxxxxxx>
Date: Fri Apr 12 00:32:01 2019 +0000
Warn for duplicate host mappings during discover_hosts
When the 'nova-manage cellv2 discover_hosts' command is run in parallel
during a deployment, multiple processes attempt to map the same compute
or service hosts simultaneously, resulting in tracebacks:
"DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate
entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\")
[SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id,
host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s,
%(host)s)'] [parameters: {'host': u'compute-0.localdomain',
'cell_id': 5, 'created_at': datetime.datetime(2019, 4, 10, 15, 20,
50, 527925), 'updated_at': None}]
This adds more information to the command help and adds a warning
message when duplicate host mappings are detected with guidance about
how to run the command. The command will return 2 if a duplicate host
mapping is encountered and the documentation is updated to explain
this.
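For deployment tooling that wraps the command, the new return code can be checked explicitly. The following is a minimal sketch, not part of the change itself: the helper names (classify_discover_hosts, run_discover_hosts) and the result labels are hypothetical, the only return code taken from the commit message is 2, and the subcommand is spelled 'cell_v2' as it appears in the nova-manage CLI.

```python
import subprocess

# Return code documented in the commit message for a duplicate host mapping.
DUPLICATE_HOST_MAPPING = 2


def classify_discover_hosts(returncode):
    """Map a discover_hosts return code to a coarse, hypothetical label."""
    if returncode == 0:
        return "ok"
    if returncode == DUPLICATE_HOST_MAPPING:
        # Another discover_hosts run is probably mapping the same hosts.
        return "duplicate"
    return "other"


def run_discover_hosts(runner=subprocess.run):
    """Run the command once and classify its result.

    `runner` is injectable so the logic can be exercised without nova.
    """
    result = runner(["nova-manage", "cell_v2", "discover_hosts"])
    return classify_discover_hosts(result.returncode)
```

A wrapper like this would typically treat "duplicate" as a signal to serialize the deployment step rather than to retry immediately.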
This also adds a warning to the scheduler periodic task to recommend
enabling the periodic on only one scheduler to prevent collisions.
We choose to warn and stop instead of ignoring DBDuplicateEntry because
there could potentially be a large number of parallel tasks competing
to insert duplicate records where only one can succeed. If we ignore
and continue to the next record, the large number of tasks will
repeatedly collide in a tight loop until all get through the entire
list of compute hosts that are being mapped. So we instead stop the
colliding task and emit a message.
Closes-Bug: #1824445
Change-Id: Ia7718ce099294e94309103feb9cc2397ff8f5188
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1824445
Title:
nova-manage cellv2 discover_hosts traces when run in parallel
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Saw this issue downstream [1] and found a couple of similar issues
[2][3] where deployments were running the 'nova-manage cellv2
discover_hosts' command in parallel and experiencing tracebacks like
this:
"DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate
entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\")
[SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id,
host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s, %(host)s)']
[parameters: {'host': u'compute-0.localdomain', 'cell_id': 5,
'created_at': datetime.datetime(2019, 4, 10, 15, 20, 50, 527925),
'updated_at': None}] (Background on this error at:
http://sqlalche.me/e/gkpj)",
After some discussion on IRC today [4], we concluded it would be best
to address the situation with improved command help and warnings when
duplicate host mappings are encountered.
While we could try-except to ignore DBDuplicateEntry, this is not a
situation we want to hide from users as it means they are likely
hammering their database with parallel updates that are mostly not
going to succeed. So we should stop and warn instead.
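The stop-and-warn behavior can be illustrated with a small stand-in. This sketch is not nova's code: it uses the standard-library sqlite3 module and sqlite3.IntegrityError in place of the real database layer and DBDuplicateEntry, and the function names (make_db, map_hosts) and the reduced table schema are assumptions made for illustration only.

```python
import sqlite3


def make_db():
    """Create an in-memory table with a unique constraint on host."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE host_mappings ("
        "cell_id INTEGER, host TEXT, "
        "CONSTRAINT uniq_host_mappings0host UNIQUE (host))"
    )
    return conn


def map_hosts(conn, cell_id, hosts):
    """Insert host mappings; on the first duplicate, warn and stop.

    Returns 0 if every host was mapped, 2 if a duplicate was hit.
    """
    for host in hosts:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO host_mappings (cell_id, host) VALUES (?, ?)",
                    (cell_id, host),
                )
        except sqlite3.IntegrityError:
            # Do not continue to the next host: a concurrent run is likely
            # working through the same list and would collide again.
            print(
                "WARNING: duplicate host mapping for %r; "
                "is discover_hosts running in parallel?" % host
            )
            return 2
    return 0
```

Stopping at the first collision leaves the remaining hosts to whichever run is already making progress, instead of having every parallel task retry each insert in turn.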
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1698630
[2] https://bugs.launchpad.net/openstack-ansible/+bug/1752540
[3] https://github.com/bloomberg/chef-bcpc/issues/1378
[4] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-04-11.log.html#t2019-04-11T23:04:47
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1824445/+subscriptions