← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1973656] [NEW] meaning of option "router_auto_schedule" is ambiguous

 

Public bug reported:

I found meaning of option "router_auto_schedule" is hard to follow.  A
quick code review finds it is only used at (tests excluded)

```python
    def get_router_ids(self, context, host):
        """Returns IDs of routers scheduled to l3 agent on <host>

        This will autoschedule unhosted routers to l3 agent on <host> and then
        return all ids of routers scheduled to it.
        """
        if extensions.is_extension_supported(
                self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
            if cfg.CONF.router_auto_schedule:
                self.l3plugin.auto_schedule_routers(context, host)
        return self.l3plugin.list_router_ids_on_host(context, host)
```

which seems to be fixing router without agents associated with it. And
even if I turn this option off, router is still able to be properly
scheduled to agents. because

```python
    @registry.receives(resources.ROUTER, [events.AFTER_CREATE],
                       priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
    def _after_router_create(self, resource, event, trigger, context,
                             router_id, router, router_db, **kwargs):
        if not router['ha']:
            return
        try:
            self.schedule_router(context, router_id)
            router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
            self._notify_router_updated(context, router_id)
        except Exception as e:
            with excutils.save_and_reraise_exception() as ctx:
                if isinstance(e, l3ha_exc.NoVRIDAvailable):
                    ctx.reraise = False
                    LOG.warning("No more VRIDs for router: %s", e)
                else:
                    LOG.exception("Failed to schedule HA router %s.",
                                  router_id)
                router['status'] = self._update_router_db(
                    context, router_id,
                    {'status': constants.ERROR})['status']
```

seems to not respecting this option.

So IMO auto_schedule_router might better be renamed to something like
`fix_dangling_routers` etc and could be turned off if user wants to fix
wrong routers manually. The reason is that could router by agent is
pretty expensive for a relatively large deployment with around 10,000
routers.

** Affects: neutron
     Importance: Undecided
         Status: New

** Description changed:

  I found meaning of option "router_auto_schedule" is hard to follow.  A
- quick code review finds it is only used at
+ quick code review finds it is only used at (tests excluded)
  
  ```python
-     def get_router_ids(self, context, host):
-         """Returns IDs of routers scheduled to l3 agent on <host>
+     def get_router_ids(self, context, host):
+         """Returns IDs of routers scheduled to l3 agent on <host>
  
-         This will autoschedule unhosted routers to l3 agent on <host> and then
-         return all ids of routers scheduled to it.
-         """
-         if extensions.is_extension_supported(
-                 self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
-             if cfg.CONF.router_auto_schedule:
-                 self.l3plugin.auto_schedule_routers(context, host)
-         return self.l3plugin.list_router_ids_on_host(context, host)
+         This will autoschedule unhosted routers to l3 agent on <host> and then
+         return all ids of routers scheduled to it.
+         """
+         if extensions.is_extension_supported(
+                 self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
+             if cfg.CONF.router_auto_schedule:
+                 self.l3plugin.auto_schedule_routers(context, host)
+         return self.l3plugin.list_router_ids_on_host(context, host)
  ```
  
  which seems to be fixing router without agents associated with it. And
  even if I turn this option off, router is still able to be properly
  scheduled to agents. because
  
  ```python
-     @registry.receives(resources.ROUTER, [events.AFTER_CREATE],
-                        priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
-     def _after_router_create(self, resource, event, trigger, context,
-                              router_id, router, router_db, **kwargs):
-         if not router['ha']:
-             return
-         try:
-             self.schedule_router(context, router_id)
-             router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
-             self._notify_router_updated(context, router_id)
-         except Exception as e:
-             with excutils.save_and_reraise_exception() as ctx:
-                 if isinstance(e, l3ha_exc.NoVRIDAvailable):
-                     ctx.reraise = False
-                     LOG.warning("No more VRIDs for router: %s", e)
-                 else:
-                     LOG.exception("Failed to schedule HA router %s.",
-                                   router_id)
-                 router['status'] = self._update_router_db(
-                     context, router_id,
-                     {'status': constants.ERROR})['status']
+     @registry.receives(resources.ROUTER, [events.AFTER_CREATE],
+                        priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
+     def _after_router_create(self, resource, event, trigger, context,
+                              router_id, router, router_db, **kwargs):
+         if not router['ha']:
+             return
+         try:
+             self.schedule_router(context, router_id)
+             router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
+             self._notify_router_updated(context, router_id)
+         except Exception as e:
+             with excutils.save_and_reraise_exception() as ctx:
+                 if isinstance(e, l3ha_exc.NoVRIDAvailable):
+                     ctx.reraise = False
+                     LOG.warning("No more VRIDs for router: %s", e)
+                 else:
+                     LOG.exception("Failed to schedule HA router %s.",
+                                   router_id)
+                 router['status'] = self._update_router_db(
+                     context, router_id,
+                     {'status': constants.ERROR})['status']
  ```
  
  seems to not respecting this option.
  
  So IMO auto_schedule_router might better be renamed to something like
  `fix_dangling_routers` etc and could be turned off if user wants to fix
  wrong routers manually. The reason is that could router by agent is
  pretty expensive for a relatively large deployment with around 10,000
  routers.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1973656

Title:
  meaning of option "router_auto_schedule" is ambiguous

Status in neutron:
  New

Bug description:
  I found meaning of option "router_auto_schedule" is hard to follow.  A
  quick code review finds it is only used at (tests excluded)

  ```python
      def get_router_ids(self, context, host):
          """Returns IDs of routers scheduled to l3 agent on <host>

          This will autoschedule unhosted routers to l3 agent on <host> and then
          return all ids of routers scheduled to it.
          """
          if extensions.is_extension_supported(
                  self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
              if cfg.CONF.router_auto_schedule:
                  self.l3plugin.auto_schedule_routers(context, host)
          return self.l3plugin.list_router_ids_on_host(context, host)
  ```

  which seems to be fixing router without agents associated with it. And
  even if I turn this option off, router is still able to be properly
  scheduled to agents. because

  ```python
      @registry.receives(resources.ROUTER, [events.AFTER_CREATE],
                         priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
      def _after_router_create(self, resource, event, trigger, context,
                               router_id, router, router_db, **kwargs):
          if not router['ha']:
              return
          try:
              self.schedule_router(context, router_id)
              router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
              self._notify_router_updated(context, router_id)
          except Exception as e:
              with excutils.save_and_reraise_exception() as ctx:
                  if isinstance(e, l3ha_exc.NoVRIDAvailable):
                      ctx.reraise = False
                      LOG.warning("No more VRIDs for router: %s", e)
                  else:
                      LOG.exception("Failed to schedule HA router %s.",
                                    router_id)
                  router['status'] = self._update_router_db(
                      context, router_id,
                      {'status': constants.ERROR})['status']
  ```

  seems to not respecting this option.

  So IMO auto_schedule_router might better be renamed to something like
  `fix_dangling_routers` etc and could be turned off if user wants to
  fix wrong routers manually. The reason is that could router by agent
  is pretty expensive for a relatively large deployment with around
  10,000 routers.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1973656/+subscriptions