yahoo-eng-team team mailing list archive

[Bug 1957189] [NEW] DVR Router Update Error

 

Public bug reported:

Hi, we are getting the error below when removing a tenant network port
from a router.

 - pyroute2.netlink.exceptions.NetlinkError: (3, 'No such process')
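
For reference, the first element of the NetlinkError tuple is a standard errno value. The sketch below (standard library only, assuming a Linux host; `describe_errno` is a hypothetical helper name) decodes it. With `ip route`, ESRCH is commonly seen when the kernel cannot find the object it was asked to delete; whether that is the cause here is an assumption, not something the logs confirm.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Code 3 is the value carried by the NetlinkError in this report.
print(describe_errno(3))
```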

This situation happens with the following scenario:
  
  1. Create a subnet from a "subnet pool" with a custom CIDR prefix.
  2. Add an interface from this subnet to the router. (After adding it, both the router_interface_distributed and router_centralized_snat ports are created on the router.)
  3. When the network is no longer used, we start deleting it. (First, delete the instances.)
  4. Remove the interface added for this subnet from the router.
  5. Remove the subnet.
  6. Create a subnet from the same "subnet pool" with the same CIDR. We get the same CIDR as the subnet that was deleted before.
  7. Add an interface from this subnet to the router. (This time, only the router_interface_distributed port is created on the router.)
  8. Create an instance on the network that uses only that tenant subnet. Instance DNS queries do not work! (This is the step where we noticed that something was wrong.)
  9. Delete the created instances. (The instance create/delete steps are optional.)
  10. Try to remove the interface added for this subnet from the router. The L3 agent now raises "pyroute2.netlink.exceptions.NetlinkError" on each router update until it hits the retry limit for router updates.

  Until the Neutron L3 agent is restarted on all controller nodes:
    - We always get that error when adding an interface from that subnet to the router. When the error occurs, the port is shown as DOWN on the router page and router_centralized_snat does not exist.
    - We always get the same error when deleting the interface that we already added.

  11. Restart the Neutron L3 agents and everything is OK.
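
The API side of steps 1 to 10 can be sketched in Python roughly as follows. This is a minimal sketch, not the reporter's actual procedure: it assumes `conn` is an openstacksdk connection (for example from `openstack.connect(cloud=...)`), that the subnets are IPv4, and that the router, network ID, pool ID and CIDR are supplied by the caller. The helper names are hypothetical, and the optional instance steps (3, 8, 9) are left out.

```python
def create_and_attach(conn, router, network_id, pool_id, cidr):
    """Steps 1-2 (and 6-7): allocate a subnet from the pool and attach it."""
    subnet = conn.network.create_subnet(
        network_id=network_id,
        subnet_pool_id=pool_id,
        cidr=cidr,
        ip_version=4,  # assumption: the report does not state the IP version
    )
    conn.network.add_interface_to_router(router, subnet_id=subnet.id)
    return subnet


def detach_and_delete(conn, router, subnet):
    """Steps 4-5: detach the subnet from the router, then delete it."""
    conn.network.remove_interface_from_router(router, subnet_id=subnet.id)
    conn.network.delete_subnet(subnet)


def reproduce(conn, router, network_id, pool_id, cidr):
    """Attach, remove, re-create with the same CIDR, attach, then detach."""
    first = create_and_attach(conn, router, network_id, pool_id, cidr)
    detach_and_delete(conn, router, first)
    second = create_and_attach(conn, router, network_id, pool_id, cidr)
    # Step 10: per the report, the NetlinkError shows up in the L3 agent
    # while it processes this router update, not in the API response.
    conn.network.remove_interface_from_router(router, subnet_id=second.id)
    return second
```

After running it, the L3 agent logs on the controller nodes are where the error would be expected to appear.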


Do you have any idea about this situation? We are using OVS and DVR.

After adding a subnet to the router, why do we sometimes see only one port?
  - network:router_interface_distributed (Always exists after attaching a port to the router)
  - network:router_centralized_snat (Only sometimes exists after attaching a port to the router)
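
One way to spot the missing port is to compare the two device owners per subnet. The sketch below is illustrative only: it assumes the router's ports have been exported as a list of dicts with `device_owner` and `fixed_ips` keys, and `subnets_missing_snat` is a hypothetical helper name.

```python
DISTRIBUTED = "network:router_interface_distributed"
SNAT = "network:router_centralized_snat"


def subnets_missing_snat(ports):
    """Return subnet IDs that have a distributed interface but no SNAT port.

    `ports` is a list of dicts with 'device_owner' and 'fixed_ips' keys,
    the shape assumed here for router ports exported as JSON.
    """
    seen = {DISTRIBUTED: set(), SNAT: set()}
    for port in ports:
        owner = port.get("device_owner")
        if owner in seen:
            for ip in port.get("fixed_ips", []):
                seen[owner].add(ip["subnet_id"])
    return sorted(seen[DISTRIBUTED] - seen[SNAT])


# Made-up data: subnet-a has both ports, subnet-b only the distributed one.
example = [
    {"device_owner": DISTRIBUTED, "fixed_ips": [{"subnet_id": "subnet-a"}]},
    {"device_owner": SNAT, "fixed_ips": [{"subnet_id": "subnet-a"}]},
    {"device_owner": DISTRIBUTED, "fixed_ips": [{"subnet_id": "subnet-b"}]},
]
print(subnets_missing_snat(example))  # ['subnet-b']
```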


Environment Details:
 OpenStack Victoria cluster installed via kolla-ansible on Ubuntu 20.04.2 LTS hosts. (Kernel: 5.4.0-80-generic)
 There are 5 controller+network nodes.
 "neutron-openvswitch-agent", "neutron-l3-agent" and "neutron-server" are at version "17.2.2.dev46".
 Open vSwitch is used in DVR mode with router HA configured. (l3_ha = true)
 We are using a single centralized neutron router to connect all tenant networks to the provider network.
 We are using bgp_dragent to announce unique tenant networks.
 Tenant network type: vxlan
 External network type: vlan

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: dvr errorpyroute2.netlink.exceptions.netlinkerror router update

** Attachment added: "neutron-bug-logs.txt"
   https://bugs.launchpad.net/bugs/1957189/+attachment/5553593/+files/neutron-bug-logs.txt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1957189
