
yahoo-eng-team team mailing list archive

[Bug 1902211] [NEW] Router State standby on all l3 agent when create

 

Public bug reported:

Hi all

When I create a router, it stays in the standby state forever on all
agents. I logged in to an agent and checked:
/var/lib/neutron/ha_confs/<router-id> does not contain a keepalived.conf
file, so every agent stays in the standby state.
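
The check described above can be scripted so it is easy to repeat on each agent. This is only a minimal sketch: the helper name routers_missing_keepalived_conf is hypothetical, and it assumes each router has its own subdirectory under /var/lib/neutron/ha_confs, as in the path quoted above.

```python
import os


def routers_missing_keepalived_conf(ha_confs_dir="/var/lib/neutron/ha_confs"):
    """Return the router IDs whose HA directory has no keepalived.conf.

    Hypothetical helper: it only inspects the filesystem and does not
    talk to neutron or keepalived.
    """
    missing = []
    for entry in sorted(os.listdir(ha_confs_dir)):
        router_dir = os.path.join(ha_confs_dir, entry)
        # Skip plain files; only per-router directories are checked.
        if not os.path.isdir(router_dir):
            continue
        if not os.path.isfile(os.path.join(router_dir, "keepalived.conf")):
            missing.append(entry)
    return missing


if __name__ == "__main__":
    if os.path.isdir("/var/lib/neutron/ha_confs"):
        for router_id in routers_missing_keepalived_conf():
            print("no keepalived.conf for router", router_id)
```

Running it on an agent while the problem is present should list the routers that are stuck in standby there, assuming the missing file is the only symptom.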

Once this happens, I cannot create any more routers until I restart
neutron-l3-agent. After the restart, the agent creates keepalived.conf,
the router becomes active on one agent, and I can create routers again.

I am running OpenStack Train

How to reproduce:
- I cannot reproduce it reliably. The error appears sometimes, and then I cannot create routers until I restart neutron-l3-agent. Some time later (maybe 1-3 days, I am not sure), router creation fails again.

Debugging in the code:
1. When a router is added to the agent (add_router), I saw the code get stuck while the StrongSwan IPSec driver syncs status with the server side.
https://github.com/openstack/neutron-vpnaas/blob/2bea568b4cd4968dcbe64f55247a970545e911af/neutron_vpnaas/services/vpn/agent.py#L67

2. While the line above is running, I saw that execution never reaches the sync function.
https://github.com/openstack/neutron-vpnaas/blob/2bea568b4c/neutron_vpnaas/services/vpn/device_drivers/ipsec.py#L1085

3. I think it is stuck while running the decorator:
https://github.com/openstack/oslo.concurrency/blob/80a6e1d489c5d650ea1ce47f4d81bd98bc803542/oslo_concurrency/lockutils.py#L351

4. Finally, this line blocks forever, and I don't know how to fix it. The values at that point are listed after the link:
https://github.com/openstack/oslo.concurrency/blob/80a6e1d489c5d650ea1ce47f4d81bd98bc803542/oslo_concurrency/lockutils.py#L264
name=vpn-agent
lock_file_prefix=neutron-
external=False
lock_path=None
do_log=False
semaphores=None
delay=0.01
fair=False
int_lock=<threading.Semaphore object at 0x7fa2e04b00f0>
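
The symptom in steps 3 and 4 (a caller waiting forever on an internal, per-name semaphore) can be illustrated with a small standalone sketch. It does not use oslo.concurrency and is not a diagnosis of this bug: internal_lock and synchronized_with_timeout are hypothetical stand-ins, and the timeout exists only so the demonstration terminates. It assumes, for illustration, that some earlier caller holds the "vpn-agent" lock and never returns.

```python
import functools
import threading

_locks = {}
_locks_guard = threading.Lock()


def internal_lock(name):
    """Return the process-wide semaphore for this lock name (stand-in)."""
    with _locks_guard:
        return _locks.setdefault(name, threading.Semaphore())


def synchronized_with_timeout(name, timeout):
    """Serialize calls under a named lock; raise if the wait exceeds timeout."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock = internal_lock(name)
            if not lock.acquire(timeout=timeout):
                raise TimeoutError(
                    "lock %r still held after %.2fs" % (name, timeout))
            try:
                return func(*args, **kwargs)
            finally:
                lock.release()
        return wrapper
    return decorator


@synchronized_with_timeout("vpn-agent", timeout=0.2)
def sync():
    return "synced"


def demo():
    holder_ready = threading.Event()
    release_holder = threading.Event()

    @synchronized_with_timeout("vpn-agent", timeout=5)
    def stuck_holder():
        # Simulates a caller that takes the lock and does not return.
        holder_ready.set()
        release_holder.wait()

    holder = threading.Thread(target=stuck_holder)
    holder.start()
    holder_ready.wait()
    try:
        sync()
        blocked = False
    except TimeoutError:
        # Without a timeout this call would wait indefinitely.
        blocked = True
    release_holder.set()
    holder.join()
    # Once the holder releases the lock, sync() proceeds normally.
    return blocked, sync()


if __name__ == "__main__":
    print(demo())
```

If the real agent behaves like this sketch, the useful question is which earlier call under the same lock name never finished, since every later call with that name waits behind it.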


Please help me fix this bug. Thank you all

** Affects: neutron
     Importance: Undecided
         Status: New

** Summary changed:

- Create router standby on all l3 agent
+ Router State standby on all l3 agent when create

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1902211

Title:
  Router State standby on all l3 agent when create

Status in neutron:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1902211/+subscriptions

