
yahoo-eng-team team mailing list archive

[Bug 1777968] [NEW] Too many DBDeadlockError and IP address collision during port creating

 

Public bug reported:

Too many DBDeadlockError and IP address collision during port creating

ENV:
Neutron stable/queens (12.0.1)
CentOS 7 (3.10.0-514.26.2.el7.x86_64)

This is an edge-case test we ran after hitting this bug:
https://bugs.launchpad.net/neutron/+bug/1777965

We have 3 neutron-server nodes, each with 8 API workers enabled.

We tried to create 1000 ports in a single network, 2.0.0.0/16.


Exception:
IP address collision:
2018-06-20 18:57:31.121 440352 ERROR oslo_db.api NeutronDbObjectDuplicateEntry: Failed to create a duplicate IpamAllocation: for attribute(s) ['PRIMARY'] with value(s) 2.0.8.33-c62094a1-1f21-42a0-bc12-a06db9661463
Deadlock:
2018-06-20 18:53:33.367 440348 ERROR neutron.pecan_wsgi.hooks.translation       DBDeadlock: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'INSERT INTO ipamallocations (ip_address, status, ipam_subnet_id) VALUES (%(ip_address)s, %(status)s, %(ipam_subnet_id)s)'] [parameters: {'status': u'ALLOCATED', 'ip_address': '2.0.7.138', 'ipam_subnet_id': 'c62094a1-1f21-42a0-bc12-a06db9661463'}] (Background on this error at: http://sqlalche.me/e/2j85)


LOG:
IP address collision 409:
http://paste.openstack.org/show/723983/

Deadlock 500:
http://paste.openstack.org/show/723986/

REQ AND RESP:
request:
2018-06-20 18:47:26.993 440352 DEBUG neutron.api.v2.base [req-fe6df326-a34e-4766-a280-24bf6e4d023f a187bcd0bd814d20aad5b3359868db3d 3ae80176a60f4d989c84c084ca05df1c - default default] Request body: {u'port': {u'network_id': u'ca054e1a-5646-42bd-9ab1-1fd5c6d26c92'}} prepare_request_body /usr/lib/python2.7/site-packages/neutron/api/v2/base.py:690


HTTP 409 response:
2018-06-20 18:57:31.217 440352 INFO neutron.wsgi [req-fe6df326-a34e-4766-a280-24bf6e4d023f a187bcd0bd814d20aad5b3359868db3d 3ae80176a60f4d989c84c084ca05df1c - default default] 10.129.169.147 "POST /v2.0/ports HTTP/1.1" status: 409  len: 0 time: 605.5961649


Exception count on each neutron server:
[root@147 neutron]# grep "duplicate IpamAllocation" server.log.bak4|wc -l
3651
[root@148 neutron]# grep "duplicate IpamAllocation" server.log.bak4|wc -l
3831
[root@149 neutron]# grep "duplicate IpamAllocation" server.log.bak4|wc -l
3145
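
For reference, the same tally can be done in Python. This is only a sketch: count_matching_lines is a hypothetical helper that mirrors grep | wc -l, and the per-server numbers are copied from the output above. The per-port average assumes the three logs cover only this 1000-port test run, which the report does not state.

```python
def count_matching_lines(lines, pattern="duplicate IpamAllocation"):
    """Count lines containing `pattern`, like `grep pattern file | wc -l`."""
    return sum(1 for line in lines if pattern in line)


# Counts reported above, keyed by the host suffix shown in the shell prompt.
reported = {"147": 3651, "148": 3831, "149": 3145}

total = sum(reported.values())
# Assumption: the logs cover only the 1000-port test run.
per_port = total / 1000

sample = [
    "ERROR oslo_db.api NeutronDbObjectDuplicateEntry: "
    "Failed to create a duplicate IpamAllocation: ...",
    "INFO neutron.wsgi ... status: 409",
]
sample_count = count_matching_lines(sample)
print(total, per_port, sample_count)
```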


We found two places in the neutron code related to these errors:
In [1], the window of candidate IPs is rather small, and only the last IP in it is returned rather than a random pick, which raises the chance of an IP address collision.
In [2], max_retries=10 and retry_interval=0.1 are hard-coded, which does not seem like good practice. The wrapper also ignores the [database] config
options db_retry_interval, db_max_retry_interval and db_max_retries.
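
To illustrate the second point, here is a minimal sketch of a retry wrapper that takes its limits from configuration instead of fixed values. It is not the neutron or oslo.db implementation: retry_from_config, TransientDBError and the plain dict standing in for the [database] section are all hypothetical, and the doubling backoff with jitter is just one possible policy.

```python
import functools
import random
import time


class TransientDBError(Exception):
    """Stand-in for a retriable DB error such as a deadlock (hypothetical)."""


def retry_from_config(conf, sleep=time.sleep):
    """Build a retry decorator from [database]-style options.

    `conf` is a plain dict here; its keys mirror the option names
    mentioned in the report.
    """
    max_retries = conf["db_max_retries"]
    interval = conf["db_retry_interval"]
    max_interval = conf["db_max_retry_interval"]

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = interval
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except TransientDBError:
                    if attempt == max_retries:
                        raise
                    # Jitter spreads out workers that failed together.
                    sleep(random.uniform(0, delay))
                    # Double the delay, capped at the configured maximum.
                    delay = min(delay * 2, max_interval)
        return wrapper
    return decorator


conf = {
    "db_max_retries": 5,
    "db_retry_interval": 0.1,
    "db_max_retry_interval": 2.0,
}
calls = []


@retry_from_config(conf, sleep=lambda seconds: None)  # no real sleeping here
def flaky_insert():
    calls.append(1)
    if len(calls) < 3:
        raise TransientDBError("lock wait timeout")
    return "allocated"


result = flaky_insert()
print(result, len(calls))
```

With a wrapper like this, an operator could raise the retry count or the maximum interval for a deployment with many API workers without patching the code.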

[1] https://github.com/openstack/neutron/blob/master/neutron/ipam/drivers/neutrondb_ipam/driver.py#L170-#L173
[2] https://github.com/openstack/neutron/blob/master/neutron/db/api.py#L71-#L76
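
The collision argument for [1] can be sketched with a small simulation. This is not the neutron driver code: the window size of 10, the helper names and the "all workers see the same snapshot" model are assumptions made for illustration. It only shows that when concurrent workers deterministically take the same candidate, every worker but one collides, while a random pick spreads them out.

```python
import ipaddress
import random


def candidate_window(cidr, allocated, size):
    """Return up to `size` free host addresses from `cidr`, in order."""
    window = []
    for ip in ipaddress.ip_network(cidr).hosts():
        if str(ip) not in allocated:
            window.append(str(ip))
            if len(window) == size:
                break
    return window


def pick_last(window, rng):
    """Deterministic choice: always the last candidate in the window."""
    return window[-1]


def pick_random(window, rng):
    """Random choice among all candidates in the window."""
    return rng.choice(window)


def count_collisions(pick, workers, window, rng):
    """Workers choose from the same snapshot of free addresses.

    Only the first insert of a given address can succeed, so every
    repeated pick counts as one duplicate-entry error.
    """
    picks = [pick(window, rng) for _ in range(workers)]
    return workers - len(set(picks))


rng = random.Random(0)
# Window size 10 is an assumed value, not taken from the driver.
window = candidate_window("2.0.0.0/16", allocated=set(), size=10)
workers = 24  # 3 servers x 8 API workers, as in the report

last_collisions = count_collisions(pick_last, workers, window, rng)
random_collisions = count_collisions(pick_random, workers, window, rng)
print(last_collisions, random_collisions)
```

Note that with 24 workers and only 10 candidates, even the random pick must collide at least 14 times per round, so in this model a larger window matters as much as the random choice.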

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1777968

Title:
  Too many DBDeadlockError and IP address collision during port creating

Status in neutron:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1777968/+subscriptions

