yahoo-eng-team team mailing list archive

[Bug 1779194] [NEW] neutron-lbaas haproxy agent, when configured with allow_automatic_lbaas_agent_failover = True, after failover, when the failed agent restarts or reconnects to RabbitMQ, it tries to unplug the vif port without checking if it is used by other agent

 

Public bug reported:

When two or more LBaaS haproxy agents are configured for high
availability by setting allow_automatic_lbaas_agent_failover to True,
the LBaaS fails over to an available active agent when its current
agent becomes unresponsive or loses its connection to RabbitMQ.

This works as expected.

But when the dead agent comes back up and tries to re-sync its state
with the server, it finds that the LBaaS previously configured on or
associated with it is an 'Orphan' and tries to clean up the orphaned
LBaaS.

In the process of cleaning it up, it tries to unplug the VIF port, which
affects the other agent that is hosting the LBaaS.

When the VIF port is unplugged, the port device_owner changes and it
causes other issues.

So there should be a check before the VIF port is removed to determine
whether an active agent is still using the port. If one is, the VIF
port should not be unplugged.
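
A minimal sketch of the proposed check, in Python. Every name below
(Binding, should_unplug_vif_port, cleanup_orphan, the bindings dict and
the unplug callback) is hypothetical and is not the neutron-lbaas API;
it only illustrates the decision the returning agent could make before
touching the port, assuming it can ask the server which agent currently
hosts the LBaaS and whether that agent is alive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Binding:
    # Hypothetical server-side record: which agent currently hosts the LBaaS.
    agent_host: str
    agent_alive: bool


def should_unplug_vif_port(lb_id: str, local_host: str,
                           bindings: Dict[str, Binding]) -> bool:
    """Return True only when no active agent is using the LBaaS port."""
    binding = bindings.get(lb_id)
    if binding is None:
        # Not bound to any agent: nothing else depends on the port.
        return True
    if binding.agent_host == local_host:
        # Still bound to this agent, so it is not an orphan at all.
        return False
    # Bound to another agent: leave the port alone while that agent is alive.
    return not binding.agent_alive


def cleanup_orphan(lb_id: str, local_host: str,
                   bindings: Dict[str, Binding],
                   unplug: Callable[[str], None]) -> List[str]:
    """Clean up an orphaned LBaaS on this host; return the actions taken."""
    # Local haproxy state is always removed (represented by a label here).
    actions = ["remove_local_state"]
    if should_unplug_vif_port(lb_id, local_host, bindings):
        unplug(lb_id)
        actions.append("unplug_vif_port")
    return actions
```

With this shape, an agent that restarts after failover would still
remove its own stale local state, but would skip the unplug whenever
the server reports a live agent hosting the same LBaaS.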

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: lbaas

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1779194

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1779194/+subscriptions

