yahoo-eng-team team mailing list archive

[Bug 1421049] [NEW] Remove dvr router interface consume much time

 

Public bug reported:

In my environment, I created a DVR router with a single subnet (CIDR 10.0.0.0/8) attached, then created 10000 ports in that subnet. When I use 'router-interface-delete' to remove the subnet from the router, the call takes a long time to return. My analysis of the cause is below:
1. 'remove_router_interface' notifies the L3 agent with 'routers_updated'.
2. Inside _notification, schedule_routers is called for this router:
    def _notification(self, context, method, router_ids, operation,
                      shuffle_agents):
        """Notify all the agents that are hosting the routers."""
        plugin = manager.NeutronManager.get_service_plugins().get(
            service_constants.L3_ROUTER_NAT)
        if not plugin:
            LOG.error(_LE('No plugin for L3 routing registered. Cannot notify '
                          'agents with the message %s'), method)
            return
        if utils.is_extension_supported(
                plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
            adminContext = (context.is_admin and
                            context or context.elevated())
            plugin.schedule_routers(adminContext, router_ids)
            self._agent_notification(
                context, method, router_ids, operation, shuffle_agents)
3. _schedule_router gets the candidate L3 agents, and for a distributed router 'get_l3_agent_candidates' calls 'check_ports_exist_on_l3agent':

            if agent_mode in ('legacy', 'dvr_snat') and (
                not is_router_distributed):
                candidates.append(l3_agent)
            elif is_router_distributed and agent_mode.startswith('dvr') and (
                self.check_ports_exist_on_l3agent(
                    context, l3_agent, sync_router['id'])):
                candidates.append(l3_agent)

4. For 'remove_router_interface', however, the router interface has already been deleted before scheduling runs, so 'get_subnet_ids_on_router' returns an empty list. That list is then used as the filter to get ports, and when the number of ports is very large this takes a long time:

    def check_ports_exist_on_l3agent(self, context, l3_agent, router_id):
        """
        This function checks for existence of dvr serviceable
        ports on the host, running the input l3agent.
        """
        subnet_ids = self.get_subnet_ids_on_router(context, router_id)

        core_plugin = manager.NeutronManager.get_plugin()
        filter = {'fixed_ips': {'subnet_id': subnet_ids}}
        ports = core_plugin.get_ports(context, filters=filter)
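
One way to picture step 4 is a guard that skips the port query when the router has no subnets left. The following is only a sketch, not Neutron code: FakeCorePlugin, check_ports_exist and the use of 'binding:host_id' to match the host are assumptions made for illustration.

```python
class FakeCorePlugin:
    """Hypothetical stand-in for the core plugin; counts get_ports calls."""

    def __init__(self, ports):
        self._ports = ports
        self.get_ports_calls = 0

    def get_ports(self, context, filters=None):
        self.get_ports_calls += 1
        subnet_ids = set(filters['fixed_ips']['subnet_id'])
        return [p for p in self._ports
                if any(ip['subnet_id'] in subnet_ids
                       for ip in p['fixed_ips'])]


def check_ports_exist(core_plugin, context, subnet_ids, host):
    """Return True if a port on one of the subnets is bound to host."""
    if not subnet_ids:
        # The router has no interfaces left, so no port can match.
        # Returning here avoids the port query entirely.
        return False
    filters = {'fixed_ips': {'subnet_id': subnet_ids}}
    ports = core_plugin.get_ports(context, filters=filters)
    return any(p.get('binding:host_id') == host for p in ports)


plugin = FakeCorePlugin([
    {'fixed_ips': [{'subnet_id': 's1'}], 'binding:host_id': 'host1'},
])
empty_result = check_ports_exist(plugin, None, [], 'host1')
calls_after_empty = plugin.get_ports_calls
normal_result = check_ports_exist(plugin, None, ['s1'], 'host1')
```

With an empty subnet list the fake plugin is never queried, whereas a non-empty list goes through get_ports as before.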

So I think 'remove_router_interface' should not reschedule the router.
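
A minimal sketch of that idea, assuming the notifier is given an extra flag that the interface-removal path sets to False. The schedule_routers keyword, FakeL3Plugin and FakeNotifier are hypothetical names used only for illustration and are not taken from the report.

```python
class FakeL3Plugin:
    """Hypothetical L3 plugin that records scheduling requests."""

    def __init__(self):
        self.scheduled = []

    def schedule_routers(self, context, router_ids):
        self.scheduled.append(list(router_ids))


class FakeNotifier:
    """Hypothetical notifier mirroring the shape of _notification."""

    def __init__(self, plugin):
        self.plugin = plugin
        self.sent = []

    def _agent_notification(self, context, method, router_ids, operation):
        self.sent.append((method, list(router_ids), operation))

    def notify(self, context, method, router_ids, operation=None,
               schedule_routers=True):
        # Assumed flag: callers that only remove an interface pass False,
        # so the candidate-agent lookup (and its port query) is skipped.
        if schedule_routers:
            self.plugin.schedule_routers(context, router_ids)
        self._agent_notification(context, method, router_ids, operation)


plugin = FakeL3Plugin()
notifier = FakeNotifier(plugin)
notifier.notify(None, 'routers_updated', ['r1'],
                operation='remove_router_interface',
                schedule_routers=False)
```

The agents hosting the router are still notified, only the rescheduling step is skipped.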

** Affects: neutron
     Importance: Undecided
     Assignee: shihanzhang (shihanzhang)
         Status: New

** Changed in: neutron
     Assignee: (unassigned) => shihanzhang (shihanzhang)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1421049

Title:
  Remove dvr router interface consume much time

Status in OpenStack Neutron (virtual network service):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1421049/+subscriptions

