yahoo-eng-team team mailing list archive
Message #36748
[Bug 1483601] [NEW] l2 population failed when bulk live migrate VMs
Public bug reported:
When we bulk live-migrate VMs, l2 population may (not always) fail on the destination compute node. When nova migrates a VM to the destination compute node, it only updates the port's binding:host; the port's status is still active. From neutron's perspective, the port status then progresses as: active -> build -> active.
In the case below, l2 population fails:
1. nova successfully live-migrates VM A and VM B from compute A to compute B.
2. Port A and port B both have status active, and binding:host for both is compute B.
3. The l2 agent scans these two ports, then handles them one by one.
4. neutron-server handles port A first; its status becomes build (remember that port B's status is still active), and the l2 population check below is performed. The check fails, since port B is still counted as an active port on the agent:
    def _update_port_up(self, context):
        ......
        if agent_active_ports == 1 or (
                self.get_agent_uptime(agent) < cfg.CONF.l2pop.agent_boot_time):
            # First port activated on current agent in this network,
            # we have to provide it with the whole list of fdb entries
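The sequence in steps 1-4 can be modelled with a small, self-contained sketch. This is not Neutron code: `needs_full_fdb`, `simulate_bulk_migration`, and the `AGENT_BOOT_TIME` value are hypothetical names and numbers introduced only for illustration, and the model assumes that a port which is still active (but not yet processed) counts toward `agent_active_ports`, which is how the report reads.

```python
# Illustrative model only; not taken from the Neutron source tree.
# AGENT_BOOT_TIME is a stand-in for cfg.CONF.l2pop.agent_boot_time (seconds);
# the value here is an assumption chosen for the example.
AGENT_BOOT_TIME = 180


def needs_full_fdb(agent_active_ports, agent_uptime):
    """Same shape as the condition quoted from _update_port_up."""
    return agent_active_ports == 1 or agent_uptime < AGENT_BOOT_TIME


def simulate_bulk_migration(port_names, agent_uptime):
    """Process migrated ports one by one on the destination agent.

    Returns a list of (port, active_port_count, full_fdb_sent) tuples.
    """
    # After live migration every port is already bound to the destination
    # host and its status is still ACTIVE.
    statuses = {name: "ACTIVE" for name in port_names}
    results = []
    for name in port_names:
        # The port being handled goes ACTIVE -> BUILD -> ACTIVE.
        statuses[name] = "BUILD"
        statuses[name] = "ACTIVE"
        # Ports not yet handled are still ACTIVE, so they are counted too.
        active = sum(1 for s in statuses.values() if s == "ACTIVE")
        results.append((name, active, needs_full_fdb(active, agent_uptime)))
    return results


if __name__ == "__main__":
    # Agent has been up for an hour, so the uptime branch does not apply.
    for row in simulate_bulk_migration(["port A", "port B"], agent_uptime=3600):
        print(row)
```

In this model, with two ports migrated together, neither port ever sees an active count of 1, so the "whole list of fdb entries" branch is never taken; with a single migrated port it is.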
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1483601
Title:
l2 population failed when bulk live migrate VMs
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1483601/+subscriptions