yahoo-eng-team team mailing list archive

[Bug 2006496] [NEW] [L3-HA] "max_l3_agents_per_router" not honored when the redundancy is reduced

 

Public bug reported:

Related bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2166197

NOTE: Config option "router_distributed" must be False.

This issue occurs when a router is created with a given
"max_l3_agents_per_router" value and the redundancy is later reduced.

For example, with "max_l3_agents_per_router=3", when we create an HA router, Neutron creates 3 instances of this router and the corresponding "routerl3agentbindings" records. E.g.:
MariaDB [ovs_neutron]> select * from routerl3agentbindings;
+--------------------------------------+--------------------------------------+---------------+
| router_id                            | l3_agent_id                          | binding_index |
+--------------------------------------+--------------------------------------+---------------+
| f8d3fec5-4648-48e9-b546-94f2e135df77 | 5c54529e-1f4e-4332-9674-96a1d15a16b2 |             1 |
| f8d3fec5-4648-48e9-b546-94f2e135df77 | 8112e03e-9191-495d-aea2-d2d7cd621767 |             2 |
| f8d3fec5-4648-48e9-b546-94f2e135df77 | 850beebc-3144-4673-a4cc-142162dba436 |             3 |
+--------------------------------------+--------------------------------------+---------------+


Now we reduce the redundancy to "max_l3_agents_per_router=1". If we remove agent assignments and the remaining record has a "binding_index" different from 1 (for example, 3), the next time the router is updated, the L3 scheduler will create a new assignment with "binding_index=1". The scheduler is called whenever the router is updated (a subnet is added or removed, a FIP is assigned or removed, etc.). The method in [1], which is used by both the DHCP scheduler and the L3 scheduler, determines the next binding index to create.

In the given example, if the only remaining agent binding has a "binding_index" different from 1, the vacant binding index method [1] will return 1:
  open_slots = sorted(list(all_indicies - set(binding_indices)))
  --> all_indicies = {1}
  --> binding_indices = {3}  # for example
  --> open_slots = [1]  # instead of an empty list, as expected here.

[1]https://github.com/openstack/neutron/blob/7c3d6c414d3c0f085cae94b6f2186c4415a9298b/neutron/scheduler/base_scheduler.py#L102-L107
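
The set arithmetic above can be reproduced in isolation. The following is a minimal Python sketch, not the Neutron implementation: the function names, the parameters and the -1 "no vacancy" return value are assumptions made for illustration. The second function shows one possible guard (an assumption, not a confirmed fix) that reports no vacancy once the router already has as many bindings as the limit allows.

```python
def vacant_binding_index(max_agents, binding_indices, lowest_index=1):
    """Hypothetical stand-in for the set arithmetic quoted in the report.

    Returns the lowest free index in [lowest_index, max_agents], or -1
    (a marker chosen for this sketch) when no index is free.
    """
    all_indices = set(range(lowest_index, max_agents + 1))
    open_slots = sorted(all_indices - set(binding_indices))
    return open_slots[0] if open_slots else -1


def vacant_binding_index_guarded(max_agents, binding_indices, lowest_index=1):
    """Illustrative variant: no vacancy once the binding count hits the limit."""
    if len(binding_indices) >= max_agents:
        return -1
    return vacant_binding_index(max_agents, binding_indices, lowest_index)


# Limit 3, all three indexes in use: nothing is vacant.
print(vacant_binding_index(3, [1, 2, 3]))    # -1

# Limit reduced to 1, surviving binding has index 3: index 1 looks vacant,
# so a second binding would be created for the router.
print(vacant_binding_index(1, [3]))          # 1

# With the count-based guard, the same input reports no vacancy.
print(vacant_binding_index_guarded(1, [3]))  # -1
```

The unguarded version only compares index values, so a surviving binding whose index lies outside the new range is never counted against the limit.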

** Affects: neutron
     Importance: Medium
     Assignee: Rodolfo Alonso (rodolfo-alonso-hernandez)
         Status: New

** Changed in: neutron
     Assignee: (unassigned) => Rodolfo Alonso (rodolfo-alonso-hernandez)

** Changed in: neutron
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2006496

Title:
  [L3-HA] "max_l3_agents_per_router" not honored when the redundancy is
  reduced

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2006496/+subscriptions


