yahoo-eng-team team mailing list archive

[Bug 1538387] Re: fdb_chg_ip_tun throwing exception because fdb_entries not in correct format

 

Reviewed:  https://review.openstack.org/272986
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8dcf39aae7a099e01bd322891526c134e87e6b1b
Submitter: Jenkins
Branch:    master

commit 8dcf39aae7a099e01bd322891526c134e87e6b1b
Author: Kevin Benton <blak111@xxxxxxxxx>
Date:   Wed Jan 27 02:17:01 2016 -0800

    Unmarshall portinfo on update_fdb_entries calls
    
    The unmarshalling function was not aware of the data
    structure used by update_fdb_entries, so it would not
    set up PortInfo named tuples in the 'before' and 'after'
    fields. This would break the fdb_chg_ip_tun function
    which expected to be able to use named attributes.
    
    This patch adjusts the unmarshalling function to be aware
    of this data structure.
    
    This has likely been broken since the change that added
    named tuples here: I7f8c93b0e12ee0179bb23dfbb3a3d814615b1c2e
    It probably went undetected for so long because the exception
    will only be observed when the updated entry does not have
    an agent IP that matches the local agent's (i.e. not single-node).
    Even in a multi-node environment, this would only trigger an
    error when the fixed_ips of a port changed so it wouldn't show
    up in a normal port wiring life-cycle.
    
    Closes-Bug: #1538387
    Change-Id: I0aacb3af9ebd160ebfb801f77b186075303c3df5


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1538387

Title:
  fdb_chg_ip_tun throwing exception because fdb_entries not in correct
  format

Status in neutron:
  Fix Released

Bug description:
  I've been trying to track down failures in the DVR multinode job.  I'm
  now tripping over this one.

  For context I've been focusing on a single change, but if you see a
  failure in the gate-tempest-dsvm-neutron-dvr-multinode-full job you'll
  probably be able to find similar info.  This is the change:

  http://logs.openstack.org/77/177777/4/check/gate-tempest-dsvm-neutron-dvr-multinode-full/5abca7b/logs/

  The screen-q-agt log shows a traceback here:

  http://logs.openstack.org/77/177777/4/check/gate-tempest-dsvm-neutron-dvr-multinode-full/5abca7b/logs/screen-q-agt.txt.gz#_2016-01-18_10_11_29_715

  <snip>
  2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/l2pop/rpc_manager/l2population_rpc.py", line 312, in fdb_chg_ip_tun
  2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher     mac_ip.mac_address,
  2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher AttributeError: 'list' object has no attribute 'mac_address'
  2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher

  The info passed to fdb_chg_ip_tun() should have a "PortInfo"
  namedtuple as data, but from the line before we can see it doesn't:

  DEBUG neutron.plugins.ml2.drivers.l2pop.rpc_manager.l2population_rpc
  [req-671e8634-c753-4002-acfd-68515dd44f29 None None]
  neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent.OVSNeutronAgent
  method fdb_chg_ip_tun called with arguments (<neutron.context.Context
  object at 0x7fa2ee6988d0>,
  <neutron.plugins.ml2.drivers.openvswitch.agent.openflow.ovs_ofctl.br_tun.DeferredOVSTunnelBridge
  object at 0x7fa2ee79abd0>, {u'c4ae0757-e3e5-419c-b2ba-4d7817964237':
  {u'10.209.192.28': {u'before': [[u'fa:16:3e:8d:2e:48',
  u'2003:0:0:1::1']]}}}, '10.208.193.94', {u'7ca0dcf2-fb63-4959-92ee-
  cc757da8d120':

  So from this it's clear that _unmarshall_fdb_entries() either hasn't
  been called, or didn't do anything.
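
  The failure can be reproduced outside neutron with a minimal sketch.
  This is not the neutron code: PortInfo here is a stand-in named tuple
  whose field names are taken from the log, and unmarshall_chg_ip() is a
  hypothetical helper that only illustrates rebuilding the tuples inside
  the 'before' and 'after' lists. The payload shape is copied from the
  agent-side arguments shown above.

```python
import collections

# Stand-in for neutron's PortInfo; field names come from the log output.
PortInfo = collections.namedtuple("PortInfo", "mac_address ip_address")

NET_ID = "c4ae0757-e3e5-419c-b2ba-4d7817964237"

# chg_ip payload as the agent received it: plain [mac, ip] lists.
chg_ip = {
    NET_ID: {
        "10.209.192.28": {
            "before": [["fa:16:3e:8d:2e:48", "2003:0:0:1::1"]],
        },
    },
}


def unmarshall_chg_ip(entries):
    """Hypothetical helper: turn [mac, ip] lists back into PortInfo tuples."""
    for agents in entries.values():
        for changes in agents.values():
            for key in ("before", "after"):
                if key in changes:
                    changes[key] = [PortInfo(*pair) for pair in changes[key]]
    return entries


# Attribute access on the raw list fails the same way the traceback does.
raw_pair = chg_ip[NET_ID]["10.209.192.28"]["before"][0]
try:
    raw_pair.mac_address
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# After the conversion, named attributes work again.
fixed = unmarshall_chg_ip(chg_ip)
first = fixed[NET_ID]["10.209.192.28"]["before"][0]
```

  With the raw payload, error_message holds the same AttributeError text
  as the traceback; after the conversion, first.mac_address is readable.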

  Looking over in screen-q-svc.log for the info before the RPC call
  finds:

  DEBUG neutron.plugins.ml2.drivers.l2pop.rpc [req-
  f32790a5-0160-47b9-89b4-763b9c23bc08 tempest-
  TestGettingAddress-2071048693 tempest-TestGettingAddress-1817548879]
  Fanout notify l2population agents at q-agent-notifier the message
  update_fdb_entries with {'chg_ip': {u'c4ae0757-e3e5-419c-b2ba-
  4d7817964237': {u'10.208.193.94': {'before':
  [PortInfo(mac_address=u'fa:16:3e:8d:2e:48',
  ip_address=u'2003:0:0:1::1')]}}}} _notification_fanout
  /opt/stack/new/neutron/neutron/plugins/ml2/drivers/l2pop/rpc.py:47

  This is the message logged right before _marshall_fdb_entries() is
  called to convert the PortInfo into [<mac>, <ip>] pairs, and the
  agent-side arguments above show that the conversion did happen.
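
  The sending side can be sketched the same way. The helper below is
  hypothetical (it is not neutron's _marshall_fdb_entries()), and the
  JSON round trip is only an assumption standing in for the RPC
  transport. It shows that once the PortInfo tuples are flattened into
  lists, the receiver gets plain lists with no field names.

```python
import collections
import json

# Stand-in for neutron's PortInfo; field names come from the log output.
PortInfo = collections.namedtuple("PortInfo", "mac_address ip_address")

NET_ID = "c4ae0757-e3e5-419c-b2ba-4d7817964237"

# Payload shape copied from the server-side log message.
message = {
    "chg_ip": {
        NET_ID: {
            "10.208.193.94": {
                "before": [PortInfo("fa:16:3e:8d:2e:48", "2003:0:0:1::1")],
            },
        },
    },
}


def marshall_chg_ip(msg):
    """Hypothetical helper: replace PortInfo tuples with [mac, ip] lists."""
    out = {"chg_ip": {}}
    for net_id, agents in msg["chg_ip"].items():
        out["chg_ip"][net_id] = {
            agent_ip: {
                key: [list(port) for port in ports]
                for key, ports in changes.items()
            }
            for agent_ip, changes in agents.items()
        }
    return out


# Assumed transport: serialize to JSON and parse it again on the far side.
wire = json.loads(json.dumps(marshall_chg_ip(message)))
received = wire["chg_ip"][NET_ID]["10.208.193.94"]["before"]
```

  Here received is a list of two-element lists, which matches the
  [[mac, ip]] shape in the agent log.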

  I'm just starting to look at this now, but maybe someone more familiar
  with l2pop has a guess at what's broken.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1538387/+subscriptions

