
yahoo-eng-team team mailing list archive

[Bug 1610483] [NEW] Pluggable IPAM rollback mechanism is not robust

 

Public bug reported:

In looking through the retry mechanism for pluggable IPAM (e.g. [1]), I
found that it is not robust: it catches only a very narrow set of errors.
Many other errors do not result in a rollback notification to the
external IPAM system. If anything else fails during a port create and
causes the DB transaction to be rolled back, Neutron forgets the IP
allocations but the external IPAM system still remembers them, because
no notification is sent to the external system to reverse what it has
done.

There are a couple of options we could pursue. One is a decorator on the
API operation that calls rollback if anything goes wrong. The other is
to use an SQLAlchemy-level hook, after_transaction_end, to detect a DB
rollback and call IPAM rollback.
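
A minimal sketch of the decorator option, to make the idea concrete. It
is not Neutron code: rollback_ipam_on_failure, FakeIpamDriver, the
context.ipam_log attribute and create_port are all hypothetical names
chosen for illustration, and the real driver interface would differ.

```python
import functools
from types import SimpleNamespace


class FakeIpamDriver:
    """Hypothetical stand-in for an external IPAM system."""

    def __init__(self):
        self.allocated = set()

    def allocate(self, ip):
        self.allocated.add(ip)

    def deallocate(self, ip):
        self.allocated.discard(ip)


def rollback_ipam_on_failure(func):
    """Undo recorded IPAM allocations if the wrapped operation raises."""

    @functools.wraps(func)
    def wrapper(context, driver, *args, **kwargs):
        context.ipam_log = []  # allocations made during this call
        try:
            return func(context, driver, *args, **kwargs)
        except Exception:
            # Any failure, not only a narrow set of errors, triggers
            # the rollback; the original exception is re-raised.
            for ip in reversed(context.ipam_log):
                driver.deallocate(ip)
            raise
        finally:
            context.ipam_log = []

    return wrapper


@rollback_ipam_on_failure
def create_port(context, driver, ip, fail=False):
    """Hypothetical API operation that allocates one IP."""
    driver.allocate(ip)
    context.ipam_log.append(ip)  # record right after the allocation succeeds
    if fail:
        raise RuntimeError("unrelated failure later in port create")
    return ip
```

With a context such as SimpleNamespace(), a failing create_port call
leaves the fake driver with no leftover allocation, which is the
behaviour the external IPAM system currently does not get.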

In both cases, the problem is where and how to do the book-keeping. We
need to record each successful (de)allocation from the external IPAM
system immediately, in a place that is still available if a rollback
is needed. One idea is to piggy-back off of the context, in session.info
or somewhere like that. This discussion in IRC [2] might be useful.
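
A rough sketch of the book-keeping idea, under stated assumptions.
SQLAlchemy's Session.info is a plain dict and after_transaction_end is
a session event that receives (session, transaction); everything else
here is hypothetical: FakeSession and FakeIpamDriver are stand-ins, the
"ipam_ops" key is made up, and how a real hook would decide that the
transaction was rolled back is left open (it is passed in as a flag).

```python
IPAM_LOG_KEY = "ipam_ops"  # hypothetical key inside session.info


class FakeIpamDriver:
    """Hypothetical stand-in for an external IPAM system."""

    def __init__(self):
        self.allocated = set()

    def allocate(self, ip):
        self.allocated.add(ip)

    def deallocate(self, ip):
        self.allocated.discard(ip)


class FakeSession:
    """Stand-in exposing only an ``info`` dict, like Session.info."""

    def __init__(self):
        self.info = {}


def record_ipam_op(session, op, ip):
    """Record a successful external (de)allocation immediately."""
    session.info.setdefault(IPAM_LOG_KEY, []).append((op, ip))


def on_transaction_end(session, rolled_back, driver):
    """Body of a hypothetical transaction-end hook.

    The log is popped so it is handled only once. On rollback, each
    recorded operation is reversed, newest first.
    """
    ops = session.info.pop(IPAM_LOG_KEY, [])
    if not rolled_back:
        return
    for op, ip in reversed(ops):
        if op == "allocate":
            driver.deallocate(ip)
        else:  # a deallocation is undone by allocating again
            driver.allocate(ip)
```

Keeping the log on the session ties its lifetime to the DB work it
belongs to, so either a decorator or a session-level hook could read
it when a rollback happens.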

[1] https://github.com/openstack/neutron/blob/949aae6a8b92a77a06d04734bf82ed7a917057a7/neutron/db/ipam_pluggable_backend.py#L129-L136
[2] http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2016-08-03.log.html#t2016-08-03T18:08:58

** Affects: neutron
     Importance: High
         Status: Confirmed


** Tags: l3-ipam-dhcp

** Changed in: neutron
       Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => High

** Tags added: l3-ipam-dhcp

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1610483

Title:
  Pluggable IPAM rollback mechanism is not robust

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1610483/+subscriptions

