yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #78421
[Bug 1824911] Re: [scale issue] the bottleneck lock will multiply increase processing time of agent resources
Reviewed: https://review.opendev.org/656164
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0f471a47c073eb0cf2ed68c30482e1ae71ff6927
Submitter: Zuul
Branch: master
commit 0f471a47c073eb0cf2ed68c30482e1ae71ff6927
Author: LIU Yulong <i@xxxxxxxxxxxx>
Date: Fri Apr 26 15:42:32 2019 +0800
Async notify neutron-server for HA states
RPC notifier method can sometimes be time-consuming,
this will cause other parallel processing resources
fail to send notifications in time. This patch changes
the notify to asynchronous.
Closes-Bug: #1824911
Change-Id: I3f555a0c78fbc02d8214f12b62c37d140bc71da1
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1824911
Title:
[scale issue] the bottleneck lock will multiply increase processing
time of agent resources
Status in neutron:
Fix Released
Bug description:
Env: stable/queens
CentOS 7.5
kernel-3.10.0-862.11.6.el7.x86_64
There are many bottleneck locks in the agent extensions. For instance, l3 agent extensions now have lock 'qos-fip' [1], 'qos-gateway-ip' [2], 'port-forwarding' [3], 'log-port' [4]. For L2 agent, it is 'qos-port' lock [5].
For these agent extensions when a large number of resources need to be processed in parallel, the processing time may get longer and longer. Let's take the l3 agent extension as the example. Because, firstly, the more time for one router processing time, the more lock waiting time for others. Then, if every router have a large set of resource need to be done in the extension (for example, floating IP QoS tc rules), every router processing will hold the lock for a bit long time, and more waiting time for others then. It shows a trend of multiple growth.
[1] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/extensions/qos/fip.py#L283
[2] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/extensions/qos/gateway_ip.py#L84
[3] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/extensions/port_forwarding.py#L426
[4] https://github.com/openstack/neutron/blob/master/neutron/services/logapi/agent/l3/base.py#L96
[5] https://github.com/openstack/neutron/blob/master/neutron/agent/l2/extensions/qos.py#L241
Here are some Logs from L3 agent, you may see there are some
'add_router' which hold the lock for 16.271s and another wait the lock
32.547s.
2019-04-12 09:38:31.526 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 9.697s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:38:31.526 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" acquired by "neutron.notifiers.batch_notifier.synced_send" :: waited 12.753s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:38:35.435 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 3.909s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:38:35.435 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" acquired by "neutron.notifiers.batch_notifier.synced_send" :: waited 16.216s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:38:37.435 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 2.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:38:37.436 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" acquired by "neutron.notifiers.batch_notifier.synced_send" :: waited 16.639s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:38:39.436 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 2.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:38:39.436 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" acquired by "neutron.notifiers.batch_notifier.synced_send" :: waited 16.376s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:38:41.437 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 2.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:38:41.437 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" acquired by "neutron.notifiers.batch_notifier.synced_send" :: waited 15.856s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:38:43.437 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 2.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:40:49.268 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" released by "neutron.agent.l3.extensions.qos.fip.add_router" :: held 1.797s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:03.908 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" released by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: held 14.640s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:03.909 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" acquired by "neutron.agent.l3.extensions.port_forwarding.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:03.909 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" released by "neutron.agent.l3.extensions.port_forwarding.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:11.143 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" released by "neutron.agent.l3.extensions.qos.fip.add_router" :: held 18.025s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:11.144 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" acquired by "neutron.agent.l3.extensions.qos.fip.add_router" :: waited 14.209s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:15.428 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" released by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: held 4.285s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:20.867 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" released by "neutron.agent.l3.extensions.qos.fip.add_router" :: held 9.723s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:20.867 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" acquired by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:20.868 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" acquired by "neutron.agent.l3.extensions.qos.fip.add_router" :: waited 22.125s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:21.768 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" released by "neutron.agent.l3.extensions.qos.fip.add_router" :: held 0.900s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:21.768 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" acquired by "neutron.agent.l3.extensions.qos.fip.add_router" :: waited 18.762s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:24.712 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" acquired by "neutron.agent.linux.pd.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:24.713 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" released by "neutron.agent.linux.pd.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:27.642 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" released by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: held 6.775s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:27.642 40627 DEBUG oslo_concurrency.lockutils [-] Lock "floatingip_snat" acquired by "neutron.agent.l3.extensions.floatingip_snat.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:27.642 40627 DEBUG oslo_concurrency.lockutils [-] Lock "floatingip_snat" released by "neutron.agent.l3.extensions.floatingip_snat.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:27.643 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" acquired by "neutron.agent.l3.extensions.port_forwarding.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:27.643 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" released by "neutron.agent.l3.extensions.port_forwarding.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:27.643 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" acquired by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: waited 5.876s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:37.360 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" released by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: held 9.716s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:37.360 40627 DEBUG oslo_concurrency.lockutils [-] Lock "floatingip_snat" acquired by "neutron.agent.l3.extensions.floatingip_snat.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:37.360 40627 DEBUG oslo_concurrency.lockutils [-] Lock "floatingip_snat" released by "neutron.agent.l3.extensions.floatingip_snat.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:37.361 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" acquired by "neutron.agent.l3.extensions.port_forwarding.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:37.361 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" released by "neutron.agent.l3.extensions.port_forwarding.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:38.039 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" released by "neutron.agent.l3.extensions.qos.fip.add_router" :: held 16.271s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:38.039 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" acquired by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:38.040 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" acquired by "neutron.agent.l3.extensions.qos.fip.add_router" :: waited 32.547s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:41.661 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" acquired by "neutron.agent.linux.pd.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:41.662 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" released by "neutron.agent.linux.pd.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:42.328 40627 DEBUG oslo_concurrency.lockutils [-] Lock "enable_radvd" acquired by "neutron.agent.l3.ha_router.enable_radvd" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:42.331 40627 DEBUG oslo_concurrency.lockutils [-] Lock "enable_radvd" released by "neutron.agent.l3.ha_router.enable_radvd" :: held 0.003s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:42.331 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" acquired by "neutron.agent.linux.pd.process_ha_state" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:42.331 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" released by "neutron.agent.linux.pd.process_ha_state" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:42.332 40627 DEBUG oslo_concurrency.lockutils [-] Lock "resize_greenpool" acquired by "neutron.agent.l3.agent._resize_process_pool" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:41:42.333 40627 DEBUG oslo_concurrency.lockutils [-] Lock "resize_greenpool" released by "neutron.agent.l3.agent._resize_process_pool" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:41:42.333 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" acquired by "neutron.notifiers.batch_notifier.synced_send" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils
.py:273
2019-04-12 09:41:47.155 40627 DEBUG oslo_concurrency.lockutils [-] Lock "notifier-58a2b315-d411-4aa0-bb23-7c0da0b57a70" released by "neutron.notifiers.batch_notifier.synced_send" :: held 4.821s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.p
y:285
2019-04-12 09:55:15.600 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" released by "neutron.agent.l3.extensions.qos.fip.add_router" :: held 65.798s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:55:15.600 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" acquired by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:55:15.601 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-fip" acquired by "neutron.agent.l3.extensions.qos.fip.add_router" :: waited 307.347s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:55:51.892 40627 DEBUG oslo_concurrency.lockutils [-] Lock "qos-gateway-ip" released by "neutron.agent.l3.extensions.qos.gateway_ip.add_router" :: held 36.292s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:55:51.892 40627 DEBUG oslo_concurrency.lockutils [-] Lock "floatingip_snat" acquired by "neutron.agent.l3.extensions.floatingip_snat.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:55:51.892 40627 DEBUG oslo_concurrency.lockutils [-] Lock "floatingip_snat" released by "neutron.agent.l3.extensions.floatingip_snat.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:55:51.892 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" acquired by "neutron.agent.l3.extensions.port_forwarding.add_router" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:55:51.893 40627 DEBUG oslo_concurrency.lockutils [-] Lock "port-forwarding" released by "neutron.agent.l3.extensions.port_forwarding.add_router" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:56:00.703 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" acquired by "neutron.agent.linux.pd.process_ha_state" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:56:00.703 40627 DEBUG oslo_concurrency.lockutils [-] Lock "l3-agent-pd" released by "neutron.agent.linux.pd.process_ha_state" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
2019-04-12 09:56:00.704 40627 DEBUG oslo_concurrency.lockutils [-] Lock "resize_greenpool" acquired by "neutron.agent.l3.agent._resize_process_pool" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
2019-04-12 09:56:00.705 40627 DEBUG oslo_concurrency.lockutils [-] Lock "resize_greenpool" released by "neutron.agent.l3.agent._resize_process_pool" :: held 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
Potential solution:
Minimizing agent extension lock granularity.
So the code basiclly are same as master branch, IMO, we should refactor these lock mechanism. Find out the real critical resource, and only lock for it.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1824911/+subscriptions
References