yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #26877
[Bug 1410805] Re: NSX Plugin: UnAuthorizedRequest without retry
** Changed in: neutron
Assignee: (unassigned) => Han Zhou (zhouhan)
** Project changed: neutron => vmware-nsx
** Changed in: vmware-nsx
Status: New => In Progress
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1410805
Title:
NSX Plugin: UnAuthorizedRequest without retry
Status in VMware NSX:
In Progress
Bug description:
An occasional failure found in our production environment because of a
race condition in NSX plugin, which results in neutron port creation
failure. Here is the root cause.
When authentication cookie of NSX controller expires, it is expected that NSX plugin client will retry with new authentication cookie after re-login. The code logic is: when a connection is acquired, if cookie is None then it initiate a new login request for new cookie.
Otherwise, it continue to use old cookie to issue request, and if request failed because of authentication failure, it then checks if cookie is None then abort retrying.
However, there is a race condition that after connection is acquired
but before request is issued, there can be another thread failed to
authenticate and resetting the cookie (which is shared data between
connections to same controller) to None (so that during retry it will
know to re-login after acquiring the connection). So the first thread
will then abort retrying because of the logic checking the cookie and
aborting if the cookie is None.
This has high probability because there is a blocking operation after
connection is acquired: the sock.connect(). It is likely the cookie
being changed to None during this blocking operation.
The fix would be simply remove the abort logic, because even if the
cookie is None, it would then re-login and get a new cookie before
retry. Another fix could be moving the assignment of "cookie" before
the socket.connect() operation.
Related logs:
(note that the log is from havana version, but the code logic is the same in upstream)
---------------------------------------------------------------------
2015-01-13 06:44:21,326 64499216 INFO [neutron.plugins.nicira.api_client.client] [5672] Connection https://xxx:443 idle for 1614.74 seconds; reconnecting.
2015-01-13 06:44:21,326 64499216 DEBUG [neutron.plugins.nicira.api_client.client] [5672] Acquired connection https://xxx:443. 14 connection(s) available.
2015-01-13 06:44:21,327 64499216 DEBUG [neutron.plugins.nicira.api_client.request] [5672] Issuing - request GET https://xxx:443//ws.v1/lswitch/7d13735a-ca65-43b3-a3fb-ef8b1ca0f552/lport/5589905e-e531-4676-b0a0-40fe5589e60f?relations=LogicalPortStatus
2015-01-13 06:44:21,389 64496976 INFO [neutron.plugins.nicira.api_client.client] [5673] Connection https://xxx:443 idle for 5833.87 seconds; reconnecting.
2015-01-13 06:44:21,389 64496976 DEBUG [neutron.plugins.nicira.api_client.client] [5673] Acquired connection https://xxx:443. 13 connection(s) available.
2015-01-13 06:44:21,390 64496976 DEBUG [neutron.plugins.nicira.api_client.request] [5673] Issuing - request GET https://xxx:443//ws.v1/lswitch/3ac1df8b-cdcb-472c-8638-75f89b87bfa4?relations=LogicalSwitchStatus
2015-01-13 06:44:21,394 64499216 DEBUG [neutron.plugins.nicira.api_client.request] Setting X-Nvp-Wait-For-Config-Generation request header: '425260708'
2015-01-13 06:44:21,398 64499216 DEBUG [neutron.plugins.nicira.api_client.request] [5672] Completed request 'GET https://xxx:443//ws.v1/lswitch/7d13735a-ca65-43b3-a3fb-ef8b1ca0f552/lport/5589905e-e531-4676-b0a0-40fe5589e60f?relations=LogicalPortStatus': 401 (0.07 seconds)
2015-01-13 06:44:21,399 64499216 DEBUG [neutron.plugins.nicira.api_client.client] [5672] Released connection https://xxx:443. 14 connection(s) available.
2015-01-13 06:44:21,399 64499216 DEBUG [neutron.plugins.nicira.api_client.client] [5672] Acquired connection https://xxx:443. 13 connection(s) available.
2015-01-13 06:44:21,400 64496176 DEBUG [neutron.plugins.nicira.api_client.request] [5674] Issuing - request POST https://xxx:443//ws.v1/login
2015-01-13 06:44:21,400 64496176 DEBUG [neutron.plugins.nicira.api_client.request] Setting X-Nvp-Wait-For-Config-Generation request header: '425260708'
2015-01-13 06:44:21,403 64496976 DEBUG [neutron.plugins.nicira.api_client.request] Setting X-Nvp-Wait-For-Config-Generation request header: '425260708'
2015-01-13 06:44:21,439 64496976 DEBUG [neutron.plugins.nicira.api_client.request] [5673] Completed request 'GET https://xxx:443//ws.v1/lswitch/3ac1df8b-cdcb-472c-8638-75f89b87bfa4?relations=LogicalSwitchStatus': 401 (0.05 seconds)
2015-01-13 06:44:21,439 64496976 DEBUG [neutron.plugins.nicira.api_client.client] [5673] Released connection https://xxx:443. 14 connection(s) available.
2015-01-13 06:44:21,439 64496976 DEBUG [neutron.plugins.nicira.api_client.request_eventlet] [5673] Completed request 'GET /ws.v1/lswitch/3ac1df8b-cdcb-472c-8638-75f89b87bfa4?relations=LogicalSwitchStatus': 401
2015-01-13 06:44:21,439 57729424 ERROR [NeutronPlugin] An exception occured while selecting logical switch for the port
...
NvpPluginException: An unexpected error occurred in the NVP Plugin:An exception occured while selecting logical switch for the port
2015-01-13 06:44:21,584 64496176 DEBUG [neutron.plugins.nicira.api_client.request] [5674] Completed request 'POST https://xxx:443//ws.v1/login': 200 (0.18 seconds)
2015-01-13 06:44:21,585 64496176 DEBUG [neutron.plugins.nicira.api_client.request_eventlet] [5674] Completed request 'POST /ws.v1/login': 200
2015-01-13 06:44:21,585 64499216 DEBUG [neutron.plugins.nicira.api_client.client_eventlet] Saving new authentication cookie 'nvp_sessionid=15d4602ff7fb3dc34c00901ef80a7cc4; Path=/; secure'
2015-01-13 06:44:21,585 64499216 DEBUG [neutron.plugins.nicira.api_client.request] [5672] Issuing - request GET https://xxx:443//ws.v1/lswitch/7d13735a-ca65-43b3-a3fb-ef8b1ca0f552/lport/5589905e-e531-4676-b0a0-40fe5589e60f?relations=LogicalPortStatus
2015-01-13 06:44:21,585 64499216 DEBUG [neutron.plugins.nicira.api_client.request] Setting X-Nvp-Wait-For-Config-Generation request header: '425260708'
2015-01-13 06:44:21,694 64499216 DEBUG [neutron.plugins.nicira.api_client.request] [5672] Completed request 'GET https://xxx:443//ws.v1/lswitch/7d13735a-ca65-43b3-a3fb-ef8b1ca0f552/lport/5589905e-e531-4676-b0a0-40fe5589e60f?relations=LogicalPortStatus': 200 (0.11 seconds)
2015-01-13 06:44:21,694 64499216 DEBUG [neutron.plugins.nicira.api_client.client] [5672] Released connection https://xxx:443. 15 connection(s) available.
2015-01-13 06:44:21,694 64499216 DEBUG [neutron.plugins.nicira.api_client.request_eventlet] [5672] Completed request 'GET /ws.v1/lswitch/7d13735a-ca65-43b3-a3fb-ef8b1ca0f552/lport/5589905e-e531-4676-b0a0-40fe5589e60f?relations=LogicalPortStatus': 200
To manage notifications about this bug go to:
https://bugs.launchpad.net/vmware-nsx/+bug/1410805/+subscriptions
References