yahoo-eng-team team mailing list archive
Message #73884
[Bug 1777968] Re: Too many DBDeadlockError and IP address collision during port creating
Reviewed: https://review.openstack.org/577739
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ec5cd0d1b85d56adef836bf1167287993ec35ca6
Submitter: Zuul
Branch: master
commit ec5cd0d1b85d56adef836bf1167287993ec35ca6
Author: LIU Yulong <i@xxxxxxxxxxxx>
Date: Sat Jun 23 04:52:15 2018 +0800
Reduce IP address collision during port creating
Try to give it a large random chance during generate IP
address.
The DB retry mechanism change moved to neutron-lib:
I5ad139bdfb3ae125658b36d05f85f139a1b47bee
Closes-Bug: #1777968
Change-Id: I2acc9c720b39271bde2a89da4a66958f7aba5b7d
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1777968
Title:
Too many DBDeadlockError and IP address collision during port creating
Status in neutron:
Fix Released
Bug description:
Too many DBDeadlockError and IP address collision during port creating
ENV:
Neutron stable/queens (12.0.1)
CentOS 7 (3.10.0-514.26.2.el7.x86_64)
This is an edge-case test that we ran after hitting bug:
https://bugs.launchpad.net/neutron/+bug/1777965
We have 3 neutron-server nodes, each with 8 API workers enabled.
We tried to create 1000 ports in a single network, 2.0.0.0/16.
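For scale, a quick check with Python's standard ipaddress module shows how small this request is compared with the subnet. This is only a rough sketch: it ignores the gateway address and any other addresses neutron may reserve, so the real pool is slightly smaller.

```python
import ipaddress

# The test network from the report.
net = ipaddress.ip_network("2.0.0.0/16")

# Usable host addresses, excluding the network and broadcast addresses.
usable = net.num_addresses - 2
requested = 1000

print("usable addresses:", usable)
print("fraction requested: {:.2%}".format(requested / usable))
```

Under these assumptions the 1000 ports need less than 2% of the pool, so the collisions reported below cannot be explained by address exhaustion.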
Exception:
IP address collision:
2018-06-20 18:57:31.121 440352 ERROR oslo_db.api NeutronDbObjectDuplicateEntry: Failed to create a duplicate IpamAllocation: for attribute(s) ['PRIMARY'] with value(s) 2.0.8.33-c62094a1-1f21-42a0-bc12-a06db9661463
Deadlock:
2018-06-20 18:53:33.367 440348 ERROR neutron.pecan_wsgi.hooks.translation DBDeadlock: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'INSERT INTO ipamallocations (ip_address, status, ipam_subnet_id) VALUES (%(ip_address)s, %(status)s, %(ipam_subnet_id)s)'] [parameters: {'status': u'ALLOCATED', 'ip_address': '2.0.7.138', 'ipam_subnet_id': 'c62094a1-1f21-42a0-bc12-a06db9661463'}] (Background on this error at: http://sqlalche.me/e/2j85)
LOG:
IP address collision 409:
http://paste.openstack.org/show/723983/
Deadlock 500:
http://paste.openstack.org/show/723986/
REQ AND RESP:
request:
2018-06-20 18:47:26.993 440352 DEBUG neutron.api.v2.base [req-fe6df326-a34e-4766-a280-24bf6e4d023f a187bcd0bd814d20aad5b3359868db3d 3ae80176a60f4d989c84c084ca05df1c - default default] Request body: {u'port': {u'network_id': u'ca054e1a-5646-42bd-9ab1-1fd5c6d26c92'}} prepare_request_body /usr/lib/python2.7/site-packages/neutron/api/v2/base.py:690
HTTP 409 response:
2018-06-20 18:57:31.217 440352 INFO neutron.wsgi [req-fe6df326-a34e-4766-a280-24bf6e4d023f a187bcd0bd814d20aad5b3359868db3d 3ae80176a60f4d989c84c084ca05df1c - default default] 10.129.169.147 "POST /v2.0/ports HTTP/1.1" status: 409 len: 0 time: 605.5961649
Exception count in each neutron server:
[root@147 neutron]# grep "duplicate IpamAllocation" server.log.bak4|wc -l
3651
[root@148 neutron]# grep "duplicate IpamAllocation" server.log.bak4|wc -l
3831
[root@149 neutron]# grep "duplicate IpamAllocation" server.log.bak4|wc -l
3145
We found two key points in the neutron code related to these errors:
In [1], the IP range is rather small, and the code returns only the last IP in it instead of a random pick, so the chance of an IP address collision rises.
In [2], max_retries=10 and retry_interval=0.1 are hard-coded, which does not seem like good practice. This wrapper also ignores the [database] config
options db_retry_interval, db_max_retry_interval and db_max_retries.
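To illustrate the second point, here is a minimal sketch of a retry decorator that takes its limits from configuration instead of hard-coding them. It is not the neutron or oslo.db implementation: TransientDBError, retry_on_transient and the db_config values are hypothetical names and numbers chosen for illustration. Only the three option names come from the report.

```python
import random
import time
from functools import wraps


class TransientDBError(Exception):
    """Hypothetical stand-in for a retriable error such as a deadlock."""


def retry_on_transient(max_retries, retry_interval, max_retry_interval,
                       sleep=time.sleep):
    """Retry the wrapped call with capped exponential backoff and jitter."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            interval = retry_interval
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except TransientDBError:
                    if attempt == max_retries:
                        raise  # retries exhausted, surface the error
                    # Random jitter keeps concurrent workers from retrying
                    # in lockstep and colliding again.
                    sleep(random.uniform(0, interval))
                    interval = min(interval * 2, max_retry_interval)
        return wrapper
    return decorator


# Illustrative values only; in a deployment they would be read from the
# [database] section rather than written in code.
db_config = {
    "db_max_retries": 20,
    "db_retry_interval": 1.0,
    "db_max_retry_interval": 10.0,
}

db_retry = retry_on_transient(
    max_retries=db_config["db_max_retries"],
    retry_interval=db_config["db_retry_interval"],
    max_retry_interval=db_config["db_max_retry_interval"],
)
```

A function decorated with @db_retry would then follow whatever the operator configured, which is the behaviour the report asks for.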
[1] https://github.com/openstack/neutron/blob/master/neutron/ipam/drivers/neutrondb_ipam/driver.py#L170-#L173
[2] https://github.com/openstack/neutron/blob/master/neutron/db/api.py#L71-#L76
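To illustrate the first point ([1]), the toy model below compares the two picking strategies when several API workers see the same snapshot of free addresses. It is a simplified sketch, not the neutron IPAM driver: candidate_window, pick_last, pick_random, count_collisions and the window sizes are all hypothetical. The worker count of 24 matches the 3 servers with 8 API workers each described above.

```python
import ipaddress
import random
from itertools import islice


def candidate_window(network, allocated, window_size):
    """Return up to window_size free host addresses in ascending order."""
    free = (ip for ip in network.hosts() if ip not in allocated)
    return list(islice(free, window_size))


def pick_last(window, rng):
    """Every worker takes the last address of the window."""
    return window[-1]


def pick_random(window, rng):
    """Every worker takes a uniformly random address from the window."""
    return rng.choice(window)


def count_collisions(pick, workers, window_size, seed=0):
    """Count workers whose pick duplicates another worker's pick.

    All workers are assumed to read the same snapshot of free addresses,
    which is the worst case for concurrent allocation.
    """
    rng = random.Random(seed)
    network = ipaddress.ip_network("2.0.0.0/16")
    window = candidate_window(network, allocated=set(),
                              window_size=window_size)
    choices = [pick(window, rng) for _ in range(workers)]
    # Only the first insert of a given address can succeed; every other
    # worker that chose it hits a duplicate-entry error.
    return workers - len(set(choices))


if __name__ == "__main__":
    workers = 24  # 3 neutron-server nodes x 8 API workers
    print("last-of-window:", count_collisions(pick_last, workers, 10))
    print("random pick:   ", count_collisions(pick_random, workers, 1000))
```

In this model the last-of-window strategy makes every worker but one collide, while a random pick over a larger window spreads the choices out. Real behaviour also depends on transaction timing, which the sketch does not model.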
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1777968/+subscriptions