
yahoo-eng-team team mailing list archive

[Bug 1649963] [NEW] DNSMasq loses entry for instances

 

Public bug reported:

After booting a large number of new instances, some instances randomly lose their IP address after running for a while. The instances were all reachable before. When I log in via the web console and run `ip -4 a`, the VM does not have an IPv4 address.
The host file kept by dnsmasq in /var/lib/neutron/dhcp/<network-id>/host shows that the entry for that VM is missing.
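
To check a host file for a given address without reading it by eye, something like the following could be used. This is only a sketch: it assumes each line has the three comma-separated fields `mac,hostname,ip`, and the function names and sample addresses are made up for illustration. Lines with a different layout (for example extra tags) would need adjusting.

```python
def parse_host_entries(text):
    """Return {ip: mac} for lines assumed to look like 'mac,hostname,ip'."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(",")
        if len(fields) < 3:
            continue  # not the assumed three-field layout
        mac, ip = fields[0], fields[2]
        entries[ip] = mac
    return entries


def has_entry(text, ip):
    """True if the host file contents contain an entry for this IP."""
    return ip in parse_host_entries(text)


if __name__ == "__main__":
    # Hypothetical contents; on a real node, read
    # /var/lib/neutron/dhcp/<network-id>/host instead.
    sample = (
        "fa:16:3e:00:00:01,host-10-0-0-3.openstacklocal,10.0.0.3\n"
        "fa:16:3e:00:00:02,host-10-0-0-4.openstacklocal,10.0.0.4\n"
    )
    print(has_entry(sample, "10.0.0.3"))
    print(has_entry(sample, "10.0.0.9"))
```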

There is an (ugly) workaround: stop neutron-dhcp-agent, kill the dnsmasq
process for the corresponding network-id, and start neutron-dhcp-agent
again. The entry is then added back to the host file and the instance
becomes reachable again.
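
The workaround can be sketched as a list of commands. This is a dry-run helper only, under assumptions: it assumes the agent is managed through `service neutron-dhcp-agent` (as on our Ubuntu 14.04 nodes) and that the operator has already looked up the PID of the dnsmasq process for the affected network. The function name and the PID value are hypothetical.

```python
def workaround_commands(dnsmasq_pid):
    """Return the workaround steps as argv lists, in order.

    dnsmasq_pid is the PID of the dnsmasq process serving the affected
    network; finding it is left to the operator.
    """
    return [
        ["service", "neutron-dhcp-agent", "stop"],
        ["kill", str(dnsmasq_pid)],
        ["service", "neutron-dhcp-agent", "start"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in workaround_commands(12345):  # 12345 is a placeholder PID
        print(" ".join(argv))
```

Each argv list could be passed to `subprocess.check_call` as root once the printed commands look right.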

It looks a lot like this bug: https://bugs.launchpad.net/neutron/+bug/1645509
The difference is that in that bug the reporter sees entries getting added to the file, whereas we are seeing entries getting removed from it.

* Pre-conditions:
This is a production environment which hosts about 350-400 VMs. We create and delete more than 1000 VMs per week. The issue can affect anyone who uses the system, but it seems to happen more often in networks with a lot of activity (many newly created VMs).

* Step-by-step reproduction steps:
- Boot a large number of VMs (> 10) at the same time.
- SSH into the VMs and do your work.
- After a random amount of time, a VM becomes unreachable.

* Expected output:
VMs keep their IP address and stay reachable after booting.

* Actual output:
VMs are reachable at first, but eventually lose their IP address. The dnsmasq host file is missing the entry for that IP address.
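
Since the entry disappears at a random time, one way to catch it is to compare two snapshots of the host file taken some time apart. The following is a minimal sketch of that comparison; the function name and the sample lines are hypothetical, and it simply treats each non-empty line as one entry.

```python
def removed_entries(before, after):
    """Return host-file lines present in `before` but missing in `after`."""
    before_lines = {l.strip() for l in before.splitlines() if l.strip()}
    after_lines = {l.strip() for l in after.splitlines() if l.strip()}
    return sorted(before_lines - after_lines)


if __name__ == "__main__":
    # Hypothetical snapshots of /var/lib/neutron/dhcp/<network-id>/host
    earlier = (
        "fa:16:3e:00:00:01,host-10-0-0-3.openstacklocal,10.0.0.3\n"
        "fa:16:3e:00:00:02,host-10-0-0-4.openstacklocal,10.0.0.4\n"
    )
    later = "fa:16:3e:00:00:01,host-10-0-0-3.openstacklocal,10.0.0.3\n"
    for line in removed_entries(earlier, later):
        print("missing:", line)
```

An entry reported here for a VM that still exists would match the behaviour described above; entries for deleted VMs are expected to disappear.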

* Version:
** OpenStack Mitaka 9.0, deployed with Fuel.
** Ubuntu 14.04.5 LTS, running kernel 3.13.0-92-generic
** Neutron version 2:8.0.0-2~u14.04+mos48
** DNSMasq version 2.68-1ubuntu0.1

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1649963

Title:
  DNSMasq loses entry for instances

Status in neutron:
  New
