yahoo-eng-team team mailing list archive

[Bug 1986003] Re: Exception in concurrent port binding activation

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/853281
Committed: https://opendev.org/openstack/neutron/commit/5b4ed5b117f2f418d598af20744f571db581e2ae
Submitter: "Zuul (22348)"
Branch:    master

commit 5b4ed5b117f2f418d598af20744f571db581e2ae
Author: Bodo Petermann <b.petermann@xxxxxxxxxxxx>
Date:   Tue Aug 16 14:14:14 2022 +0200

    Fix concurrent port binding activate
    
    Fix an issue with concurrent requests to activate a port binding.
    If there are two activate requests in parallel, one might set the
    binding on the new host to active and the other request may
    not find the previously INACTIVE row anymore in
    _commit_port_binding and initializing the driver_context.PortContext
    crashed.
    
    Closes-Bug: #1986003
    Change-Id: I047e33062bc38f36848e0149c6e670cb5828c8e3


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1986003

Title:
  Exception in concurrent port binding activation

Status in neutron:
  Fix Released

Bug description:
  Occasionally, VM live migrations fail in the post-migration phase because the request to activate the port binding on the new host fails with a 500 Internal Server Error.
  It appears that nova-compute may send two activate requests in parallel. One of them succeeds; the other returns the error.

  Neutron version: yoga, 20.1.0

  How to reproduce:

  - create a port for a compute instance, with a binding to host host1
  - create an additional port binding for host2, i.e. POST /v2.0/ports/{port_id}/bindings
  - that will create the new binding with status=INACTIVE
  - activate the port binding with 2 requests in parallel (2 times PUT /v2.0/ports/{port_id}/bindings/host2/activate)
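
  The last step can be scripted. The following is a minimal sketch, not part of the original report: base_url and token are placeholders for a reachable Neutron endpoint and a valid Keystone token, and the helper names activate_binding and race are hypothetical. A barrier releases both workers at the same moment to make the overlap more likely.

```python
import threading
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def activate_binding(base_url, token, port_id, host):
    """Send one activate request and return the HTTP status code."""
    url = f"{base_url}/v2.0/ports/{port_id}/bindings/{host}/activate"
    req = urllib.request.Request(
        url, method="PUT", headers={"X-Auth-Token": token}
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; report the code instead.
        return exc.code


def race(send, attempts=2):
    """Call send() from `attempts` threads at once; return sorted results."""
    barrier = threading.Barrier(attempts)

    def worker(_):
        barrier.wait()  # hold every worker until all of them are ready
        return send()

    with ThreadPoolExecutor(max_workers=attempts) as pool:
        return sorted(pool.map(worker, range(attempts)))
```

  Calling race(lambda: activate_binding(base_url, token, port_id, "host2")) returns the sorted status codes of both requests. The race is timing dependent, so several runs (each with a fresh INACTIVE binding) may be needed before the failure shows up.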

  Actual result:

  - one PUT request returns 200
  - the other PUT request returns 500

  In the neutron-server log, the failed request records an exception: "sqlalchemy.orm.exc.UnmappedInstanceError: Class 'builtins.NoneType' is not mapped."
  See https://paste.opendev.org/show/bFICeriQTlkmVwYQ5nzo/

  Expected result:

  - one PUT request returns 200
  - the other PUT request returns 409 (port binding already active)
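
  The expected behaviour can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Neutron's code and not the committed fix: BindingStore, BindingConflict, and activate are hypothetical names, and a lock stands in for the database transaction. The point is that a request which no longer finds an INACTIVE row reports a conflict instead of continuing with None.

```python
import threading

ACTIVE = "ACTIVE"
INACTIVE = "INACTIVE"


class BindingConflict(Exception):
    """Hypothetical error that an API layer would map to HTTP 409."""
    status_code = 409


class BindingStore:
    """In-memory stand-in for the port binding table (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}  # (port_id, host) -> ACTIVE or INACTIVE

    def add(self, port_id, host, status):
        with self._lock:
            self._status[(port_id, host)] = status

    def claim_inactive(self, port_id, host):
        """Atomically flip INACTIVE to ACTIVE; False if no INACTIVE row."""
        with self._lock:
            if self._status.get((port_id, host)) != INACTIVE:
                return False
            self._status[(port_id, host)] = ACTIVE
            return True


def activate(store, port_id, host):
    """Return 200 on success; raise BindingConflict otherwise."""
    if not store.claim_inactive(port_id, host):
        # The other request won the race (or the binding does not exist).
        raise BindingConflict(
            f"binding for port {port_id} on {host} is not INACTIVE"
        )
    return 200
```

  With one INACTIVE binding in the store, the first activate call returns 200 and a second call for the same port and host raises BindingConflict, regardless of which thread gets there first.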

  Background:

  Nova live-migrations may trigger such concurrent activate requests.
  In preparation for the live migration, nova creates a new port binding for the destination host. When the migration completes, nova activates that binding. At least in our setup, that activation may be triggered from two places: (a) when the lifecycle event about the completed migration is handled and (b) when the migration job monitor actively detects that the migration has completed. If the second request fails, post-live-migration breaks, and the whole migration goes into an error state and may not finish all its work.
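
  Once the server answers 409 for the losing request, a caller with two activation triggers can treat the operation as idempotent. The sketch below is hypothetical and is not nova's code: ensure_binding_active is an illustrative name, and send_activate is assumed to be any callable that performs the PUT and returns the HTTP status code.

```python
def ensure_binding_active(send_activate):
    """Return True if the binding is active after the call.

    A 409 means another request already activated the binding, which is
    the desired end state, so it is accepted alongside 200.
    """
    status = send_activate()
    if status in (200, 409):
        return True
    raise RuntimeError(
        f"activating port binding failed with HTTP {status}"
    )
```

  Under this assumption, whichever trigger arrives second sees 409 and proceeds, while a 500 such as the one described in this bug still surfaces as an error.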

  Related bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2097160

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1986003/+subscriptions


