← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1623990] Re: ml2 "while True" assumes fresh data on transactions

 

Reviewed:  https://review.openstack.org/370920
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1279640da845d4f6bc6274e6706875610e179326
Submitter: Jenkins
Branch:    master

commit 1279640da845d4f6bc6274e6706875610e179326
Author: Kevin Benton <kevin@xxxxxxxxxx>
Date:   Wed Sep 14 22:35:02 2016 -0700

    Expire DB objects in ML2 infinity loops
    
    We must expire all DB objects in the session to ensure
    that we are getting the latest object state from the DB
    on each iteration of these 'while True:' loops to ensure
    we don't continue forever on stale data.
    
    Closes-Bug: #1623990
    Change-Id: I2f4ea6cce5d83c13fd37650d28a7089c5aa9a4c0


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1623990

Title:
  ml2 "while True" assumes fresh data on transactions

Status in neutron:
  Fix Released

Bug description:
  both delete_network and delete_subnet in ML2 use the following
  pattern:

  while True:
      with session.begin():
          record = get_thing_from_db()
          ... Logic to determine if out of transaction cleaning should be done
          if no_cleaning:
              context.session.delete(record)

      do_cleaning_up()


  The problem here is that it assumes it will get fresh data on each
  iteration. However, due to to the identity map pattern followed by
  sqlalchemy[1], new data will not be reflected in the 'record' var
  above on subsequent iterations if the primary key still exists in the
  DB.

  This can lead to infinite loops on delete_subnet if a concurrent
  delete request (or network update) bumps the network revision number.
  This is because the network that is in the session will always have
  the stale revision number even though the lookup methods are called on
  each iteration.

  
  This can be reproduced with the following steps:
  1. Place a pdb breakpoint here: https://github.com/openstack/neutron/blob/27928c0ddfb8d62843aa72ecf943d1f46ef30099/neutron/plugins/ml2/plugin.py#L1062
  2. create a network with two subnets (ensure the dhcp agent is running so it gets an IP on each)
  3. issue an API subnet delete call
  4. first time breakpoint hits, nothing is looked up yet, so just continue
  5. second time breakpoint hits, issue the following query in mysql: 'update standardattributes set revision_number=500000;'. this will alter the revision number in the DB but the loop won't ever get the new one.
  6. continue
  7. continue
  8. continue
  9. do this as long as you want :)


  
  1. http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#is-the-session-a-cache

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1623990/+subscriptions


References