yahoo-eng-team team mailing list archive
Message #91663
[Bug 1986003] Re: Exception in concurrent port binding activation
Reviewed: https://review.opendev.org/c/openstack/neutron/+/853281
Committed: https://opendev.org/openstack/neutron/commit/5b4ed5b117f2f418d598af20744f571db581e2ae
Submitter: "Zuul (22348)"
Branch: master
commit 5b4ed5b117f2f418d598af20744f571db581e2ae
Author: Bodo Petermann <b.petermann@xxxxxxxxxxxx>
Date: Tue Aug 16 14:14:14 2022 +0200
Fix concurrent port binding activate
Fix an issue with concurrent requests to activate a port binding.
If there are two activate requests in parallel, one might set the
binding on the new host to active and the other request may
not find the previously INACTIVE row anymore in
_commit_port_binding and initializing the driver_context.PortContext
crashed.
Closes-Bug: #1986003
Change-Id: I047e33062bc38f36848e0149c6e670cb5828c8e3
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1986003
Title:
Exception in concurrent port binding activation
Status in neutron:
Fix Released
Bug description:
Occasionally VM live-migrations fail in post-migration because the request to activate the port binding on the new host fails with a 500 Internal Server Error.
It appears that nova-compute might try two requests in parallel. One of them succeeds, the other one returns the error.
Neutron version: yoga, 20.1.0
How to reproduce:
- create a port for a compute instance, with a binding to host host1
- create an additional port binding for host2, i.e. POST /v2.0/ports/{port_id}/bindings
- that will create the new binding with status=INACTIVE
- activate the port binding with 2 requests in parallel (2 times PUT /v2.0/ports/{port_id}/bindings/host2/activate)
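The reproduction steps above could be scripted roughly as follows. This is only a sketch using the Python standard library: the base URL, the token, and the helper names (activate_url, put_activate, activate_concurrently) are assumptions made for illustration, and only the PUT path comes from the report. A barrier releases both threads at the same moment so the requests overlap as closely as possible.

```python
import threading
import urllib.error
import urllib.request


def activate_url(base_url, port_id, host):
    """Build the activate URL from the report; base_url is a placeholder."""
    return f"{base_url.rstrip('/')}/v2.0/ports/{port_id}/bindings/{host}/activate"


def put_activate(url, token):
    """Send one PUT and return the HTTP status code (error statuses included)."""
    req = urllib.request.Request(
        url, method="PUT", headers={"X-Auth-Token": token}
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code


def activate_concurrently(send, n=2):
    """Call send() from n threads at once and return the sorted results."""
    barrier = threading.Barrier(n)
    results = [None] * n

    def worker(index):
        barrier.wait()  # hold every thread until all n are ready
        results[index] = send()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)


# Example use against a real deployment (all values are placeholders):
# url = activate_url("http://neutron.example:9696", "<port_id>", "host2")
# print(activate_concurrently(lambda: put_activate(url, "<token>")))
```

On an affected server the returned list would be expected to contain 200 and 500; with the fix, 200 and 409.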
Actual result:
- one PUT request returns 200
- other PUT request returns 500
In the neutron-server log, the failed request records an exception: "sqlalchemy.orm.exc.UnmappedInstanceError: Class 'builtins.NoneType' is not mapped."
See https://paste.opendev.org/show/bFICeriQTlkmVwYQ5nzo/
Expected result:
- one PUT request returns 200
- other PUT request returns 409 (port binding already active)
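The expected behaviour can be illustrated with a toy in-memory model. This is not the neutron code and not the committed fix; BindingStore, BindingConflict and handle_activate are hypothetical names. The point is only that the check for the INACTIVE row and the status change happen as one step, so a request that arrives after the binding is already active gets a conflict instead of operating on a missing row.

```python
import threading


class BindingConflict(Exception):
    """Hypothetical error that a server would map to HTTP 409."""


class BindingStore:
    """Toy stand-in for the port binding table, keyed by (port_id, host)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def add(self, port_id, host, status):
        with self._lock:
            self._status[(port_id, host)] = status

    def status(self, port_id, host):
        with self._lock:
            return self._status.get((port_id, host))

    def activate(self, port_id, host):
        with self._lock:
            current = self._status.get((port_id, host))
            if current is None:
                raise LookupError("no such binding")
            if current != "INACTIVE":
                # A concurrent request already activated this binding.
                raise BindingConflict("port binding already active")
            # Deactivate the port's other bindings, then activate this one.
            for key in self._status:
                if key[0] == port_id:
                    self._status[key] = "INACTIVE"
            self._status[(port_id, host)] = "ACTIVE"


def handle_activate(store, port_id, host):
    """Translate the outcome into the status codes named in the report."""
    try:
        store.activate(port_id, host)
        return 200
    except BindingConflict:
        return 409
    except LookupError:
        return 404
```

With a port bound ACTIVE on host1 and INACTIVE on host2, the first call to handle_activate for host2 returns 200 and a second call returns 409.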
Background:
Nova live-migrations may trigger such concurrent activate requests.
In preparation for the live migration, nova creates a new port binding for the destination host. When the migration completes, nova activates that binding. At least in our setup, the activation may be triggered from two places: (a) when the lifecycle event about the completed migration is handled and (b) when the migration job monitor detects that the migration has completed. If the second request fails, post-live-migration breaks, the whole migration goes into error state, and it may not finish all of its work.
Related bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2097160
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1986003/+subscriptions