yahoo-eng-team team mailing list archive
Message #82002
[Bug 1866937] Re: Requests to neutron API do not use retries
Reviewed: https://review.opendev.org/712226
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0e34ed9733e3f23d162e3e428795807386abf1cb
Submitter: Zuul
Branch: master
commit 0e34ed9733e3f23d162e3e428795807386abf1cb
Author: melanie witt <melwittt@xxxxxxxxx>
Date: Wed Mar 11 02:26:52 2020 +0000
Add config option for neutron client retries
Nova can occasionally fail to carry out server actions which require
calls to neutron API when haproxy closes a connection after its idle
timeout at nearly the same moment that an incoming request attempts to
re-use the connection while it is being torn down.
In order to be more resilient to this scenario, we can add a config
option for neutron client to retry requests, similar to our existing
CONF.cinder.http_retries and CONF.glance.num_retries options.
Because we create our neutron client [1] using a keystoneauth1 session
[2], we can set the 'connect_retries' keyword argument to let
keystoneauth1 handle connection retries.
Closes-Bug: #1866937
[1] https://github.com/openstack/nova/blob/57459c3429ce62975cebd0cd70936785bdf2f3a4/nova/network/neutron.py#L226-L237
[2] https://docs.openstack.org/keystoneauth/latest/api/keystoneauth1.session.html#keystoneauth1.session.Session
Change-Id: Ifb3afb13aff7e103c2e80ade817d0e63b624604a
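The retry behaviour described in the commit message can be pictured with a small, self-contained sketch. This is not keystoneauth1's implementation: the helper name `call_with_connect_retries`, its `backoff` parameter, and the use of Python's built-in `ConnectionError` are assumptions made purely for illustration. It only shows the general shape of retrying a request a bounded number of times when the connection itself fails.

```python
import time


def call_with_connect_retries(send, connect_retries=3, backoff=0.0,
                              sleep=time.sleep):
    """Call send() and retry on connection failure.

    Hypothetical helper for illustration only. It mirrors the idea behind
    a 'connect_retries' style setting, not any library's actual code.
    send() is attempted at most connect_retries + 1 times.
    """
    attempt = 0
    while True:
        try:
            return send()
        except ConnectionError:
            if attempt >= connect_retries:
                raise  # retries exhausted, surface the last failure
            attempt += 1
            if backoff:
                # Optional exponential delay between attempts.
                sleep(backoff * 2 ** (attempt - 1))
```

With `connect_retries=0` the helper behaves like a plain call, so a retry count taken from configuration can disable the behaviour entirely.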
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866937
Title:
Requests to neutron API do not use retries
Status in OpenStack Compute (nova):
Fix Released
Bug description:
We have a customer bug report downstream [1] where nova occasionally
fails to carry out server actions requiring calls to neutron API if
haproxy happens to close a connection after idle time of 10 seconds at
nearly the same time as an incoming request that attempts to re-use
the connection while it is being torn down. Here is an excerpt from
[1]:
Our investigation found the cause to be as follows.
1. neutron-client in nova uses a connection pool (urllib3/requests)
for HTTP.
2. Sometimes an HTTP connection is reused for different requests.
3. The connection between neutron-client and haproxy is closed by
haproxy when it has been idle for 10 seconds.
4. If the client reuses the connection at almost the same time as haproxy closes it,
the client gets an RST and the request ends with "bad status line".
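The sequence above can be imitated with a toy model. This is a deliberately simplified sketch: the class `ProxiedConnection` and the function `send_reusing_pool` are hypothetical names, time is passed in explicitly rather than read from a clock, and the narrow race window is collapsed into a plain idle-timeout check. It shows only that a reused connection can fail where a fresh one succeeds.

```python
class ProxiedConnection:
    """Toy connection that a proxy drops after idle_timeout seconds."""

    def __init__(self, opened_at, idle_timeout=10.0):
        self.last_used = opened_at
        self.idle_timeout = idle_timeout

    def request(self, now):
        # Simplification: any use after the idle timeout is treated as
        # hitting a connection that the proxy is tearing down.
        if now - self.last_used >= self.idle_timeout:
            raise ConnectionResetError("bad status line")
        self.last_used = now
        return 200


def send_reusing_pool(pool, now):
    """Reuse the pooled connection if there is one, else open a new one."""
    if not pool:
        pool.append(ProxiedConnection(opened_at=now))
    return pool[0].request(now)


pool = []
send_reusing_pool(pool, now=0.0)    # opens a connection
send_reusing_pool(pool, now=5.0)    # reuses it while it is still alive
try:
    send_reusing_pool(pool, now=15.0)   # idle for 10 s: reset by the proxy
except ConnectionResetError:
    pool.clear()                        # discard the dead connection
    status = send_reusing_pool(pool, now=15.0)  # fresh connection works
```

In this model a single retry on a fresh connection is enough to recover, which is why a bounded retry count is a reasonable mitigation.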
To address this problem, we can add a new config option for neutron
client (similar to the existing config options we have for cinder
client and glance client retries) to be more resilient during such
scenarios.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1788853
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1866937/+subscriptions