yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #88874
[Bug 1973656] [NEW] meaning of option "router_auto_schedule" is ambiguous
Public bug reported:
I found meaning of option "router_auto_schedule" is hard to follow. A
quick code review finds it is only used at (tests excluded)
```python
def get_router_ids(self, context, host):
"""Returns IDs of routers scheduled to l3 agent on <host>
This will autoschedule unhosted routers to l3 agent on <host> and then
return all ids of routers scheduled to it.
"""
if extensions.is_extension_supported(
self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
if cfg.CONF.router_auto_schedule:
self.l3plugin.auto_schedule_routers(context, host)
return self.l3plugin.list_router_ids_on_host(context, host)
```
which seems to be fixing router without agents associated with it. And
even if I turn this option off, router is still able to be properly
scheduled to agents. because
```python
@registry.receives(resources.ROUTER, [events.AFTER_CREATE],
priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
def _after_router_create(self, resource, event, trigger, context,
router_id, router, router_db, **kwargs):
if not router['ha']:
return
try:
self.schedule_router(context, router_id)
router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
self._notify_router_updated(context, router_id)
except Exception as e:
with excutils.save_and_reraise_exception() as ctx:
if isinstance(e, l3ha_exc.NoVRIDAvailable):
ctx.reraise = False
LOG.warning("No more VRIDs for router: %s", e)
else:
LOG.exception("Failed to schedule HA router %s.",
router_id)
router['status'] = self._update_router_db(
context, router_id,
{'status': constants.ERROR})['status']
```
seems to not respecting this option.
So IMO auto_schedule_router might better be renamed to something like
`fix_dangling_routers` etc and could be turned off if user wants to fix
wrong routers manually. The reason is that could router by agent is
pretty expensive for a relatively large deployment with around 10,000
routers.
** Affects: neutron
Importance: Undecided
Status: New
** Description changed:
I found meaning of option "router_auto_schedule" is hard to follow. A
- quick code review finds it is only used at
+ quick code review finds it is only used at (tests excluded)
```python
- def get_router_ids(self, context, host):
- """Returns IDs of routers scheduled to l3 agent on <host>
+ def get_router_ids(self, context, host):
+ """Returns IDs of routers scheduled to l3 agent on <host>
- This will autoschedule unhosted routers to l3 agent on <host> and then
- return all ids of routers scheduled to it.
- """
- if extensions.is_extension_supported(
- self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
- if cfg.CONF.router_auto_schedule:
- self.l3plugin.auto_schedule_routers(context, host)
- return self.l3plugin.list_router_ids_on_host(context, host)
+ This will autoschedule unhosted routers to l3 agent on <host> and then
+ return all ids of routers scheduled to it.
+ """
+ if extensions.is_extension_supported(
+ self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
+ if cfg.CONF.router_auto_schedule:
+ self.l3plugin.auto_schedule_routers(context, host)
+ return self.l3plugin.list_router_ids_on_host(context, host)
```
which seems to be fixing router without agents associated with it. And
even if I turn this option off, router is still able to be properly
scheduled to agents. because
```python
- @registry.receives(resources.ROUTER, [events.AFTER_CREATE],
- priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
- def _after_router_create(self, resource, event, trigger, context,
- router_id, router, router_db, **kwargs):
- if not router['ha']:
- return
- try:
- self.schedule_router(context, router_id)
- router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
- self._notify_router_updated(context, router_id)
- except Exception as e:
- with excutils.save_and_reraise_exception() as ctx:
- if isinstance(e, l3ha_exc.NoVRIDAvailable):
- ctx.reraise = False
- LOG.warning("No more VRIDs for router: %s", e)
- else:
- LOG.exception("Failed to schedule HA router %s.",
- router_id)
- router['status'] = self._update_router_db(
- context, router_id,
- {'status': constants.ERROR})['status']
+ @registry.receives(resources.ROUTER, [events.AFTER_CREATE],
+ priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
+ def _after_router_create(self, resource, event, trigger, context,
+ router_id, router, router_db, **kwargs):
+ if not router['ha']:
+ return
+ try:
+ self.schedule_router(context, router_id)
+ router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
+ self._notify_router_updated(context, router_id)
+ except Exception as e:
+ with excutils.save_and_reraise_exception() as ctx:
+ if isinstance(e, l3ha_exc.NoVRIDAvailable):
+ ctx.reraise = False
+ LOG.warning("No more VRIDs for router: %s", e)
+ else:
+ LOG.exception("Failed to schedule HA router %s.",
+ router_id)
+ router['status'] = self._update_router_db(
+ context, router_id,
+ {'status': constants.ERROR})['status']
```
seems to not respecting this option.
So IMO auto_schedule_router might better be renamed to something like
`fix_dangling_routers` etc and could be turned off if user wants to fix
wrong routers manually. The reason is that could router by agent is
pretty expensive for a relatively large deployment with around 10,000
routers.
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1973656
Title:
meaning of option "router_auto_schedule" is ambiguous
Status in neutron:
New
Bug description:
I found meaning of option "router_auto_schedule" is hard to follow. A
quick code review finds it is only used at (tests excluded)
```python
def get_router_ids(self, context, host):
"""Returns IDs of routers scheduled to l3 agent on <host>
This will autoschedule unhosted routers to l3 agent on <host> and then
return all ids of routers scheduled to it.
"""
if extensions.is_extension_supported(
self.l3plugin, constants.L3_AGENT_SCHEDULER_EXT_ALIAS):
if cfg.CONF.router_auto_schedule:
self.l3plugin.auto_schedule_routers(context, host)
return self.l3plugin.list_router_ids_on_host(context, host)
```
which seems to be fixing router without agents associated with it. And
even if I turn this option off, router is still able to be properly
scheduled to agents. because
```python
@registry.receives(resources.ROUTER, [events.AFTER_CREATE],
priority_group.PRIORITY_ROUTER_EXTENDED_ATTRIBUTE)
def _after_router_create(self, resource, event, trigger, context,
router_id, router, router_db, **kwargs):
if not router['ha']:
return
try:
self.schedule_router(context, router_id)
router['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
self._notify_router_updated(context, router_id)
except Exception as e:
with excutils.save_and_reraise_exception() as ctx:
if isinstance(e, l3ha_exc.NoVRIDAvailable):
ctx.reraise = False
LOG.warning("No more VRIDs for router: %s", e)
else:
LOG.exception("Failed to schedule HA router %s.",
router_id)
router['status'] = self._update_router_db(
context, router_id,
{'status': constants.ERROR})['status']
```
seems to not respecting this option.
So IMO auto_schedule_router might better be renamed to something like
`fix_dangling_routers` etc and could be turned off if user wants to
fix wrong routers manually. The reason is that could router by agent
is pretty expensive for a relatively large deployment with around
10,000 routers.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1973656/+subscriptions