yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #46711
[Bug 1548190] [NEW] dhcp-agent performance issue
Public bug reported:
Single controller[48 GB RAM, 16vCPU, 120GB Disk]
1 Network Node
100 ESX hypervisors distributed in 10 nova-compute nodes
100 KVM hypervisors
Test:
1. Create /16 network
2. Template to spawn 500 instances on network created in step 1
Observation:
1. Tried spinning 500 instances on KVM and 500 instances on ESX hypervisor using CirrOS qemu & vmdk image respectively
2. Stack creation went through fine, but randomly few VM's missed dhcp IP on first boot
3. After forcing sudo udhcpc from guest OS, VM's was able to receive IP
4. Tried tcpdump on network dhcp port on network node, I could able to see only dhcp request coming from guest VM but there was no dhcp offer or ack from dhcp agent.
Assume this issue is due to scale test on this setup.
Please let me know if you need any particular logs or details to debug
this issue, I can work and re-create issue and collect logs.
Logs:
Network node:
stack@network:/opt/stack/neutron$ git log -1
commit 171351d5fb55f96e9ed9e56eab755f2717ee67b0
Merge: 0a5a3cf 8f15d72
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date: Fri Feb 19 07:25:52 2016 +0000
Merge "Adopt Grafana to plot Neutron Failure Rates"
stack@network:/opt/stack/neutron$
stack@network:/opt/stack/neutron$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.3 LTS"
stack@network:/opt/stack/neutron$
stack@network:/opt/stack/neutron$ dpkg -l | grep dhcp
ii isc-dhcp-client 4.2.4-7ubuntu12.3 amd64 ISC DHCP client
ii isc-dhcp-common 4.2.4-7ubuntu12.3 amd64 common files used by all the isc-dhcp* packages
stack@network:/opt/stack/neutron$
nobody 32359 0.0 0.0 28212 2324 ? S Feb19 0:00 dnsmasq
--no-hosts --no-resolv --strict-order --except-interface=lo --pid-
file=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/pid --dhcp-
hostsfile=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/host --addn-
hosts=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/addn_hosts --dhcp-
optsfile=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/opts --dhcp-
leasefile=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/leases --dhcp-match=set:ipxe,175 --bind-interfaces
--interface=ns-460a72e9-8f --dhcp-
range=set:tag0,169.254.169.252,static,86400s --dhcp-lease-max=4 --conf-
file= --domain=openstacklocal
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgmNQGIVf2P+n6U3yYkWwv/V598atmu3ZCHN8XRHIgujUkKKJ5IaurmDFH9XqSJhB2B+90ZP/z0bTqkI0to7QuoFw0GzyV2og59cZZrjntimdoAdo7RZKuwtyQZoIOHILBE2On/y1F3ixhVeE5j78GuiDN+yCNiIBhAnm0V9d0QxGtqc= root@kvm-218-admin-3
ssh-dss AAAAB3NzaC1kc3MAAACBAJNHiG1dQxMG+vmlJEWY1xgak5xHGFzm+QFEkFXOaRzdzPw8y9DX/O+bxnrcaSKwKHBJcKaJZJ+SHnVEpX2QgQQA+uSSMCMKQFISjdTCtTQ/0tUiLculV/ISrXDCm9cHG6DWqplVoc/LmXJObh1NZrjPunW2NrUCoI2qXtFMbdiVAAAAFQC4GI/n1hxNcGEjkzJ2GRRcenU+hwAAAIAjJ09CspilFn7DintV3Sh7lCs1I5+m+mQcI/AJeLSE+Ee04Nw47KWgrFyoaCQNYyHt1wgONlTmcKYyDxUGwjbRTyZYEwdaxgd+klxt7NP6fR2kmuzCLiyFP+W2WNJcqRr6xxlRif4TOwbOd/Qlt0JeM1LufaxyuogSBgWS/kys6AAAAIAIPMlAPi41qCGhUnUkbx23OrzqkP+xZEj7cGnxa/HxpnIkZ+qMhRIhBMFADEt4MKKmdZyTPYDiSaiBR298ic6ywekA1YxAqwCEtKlA+7Cdl9q7REIPVh1CSWZdJEKi9qKBMYaMzuw7cbkdh5ZbOMHWALS8JgndYGD77xWQwKamCA== root@kvm-218-admin-3
-----END SSH HOST KEY KEYS-----
=== network info ===
if-info: lo,up,127.0.0.1,8,::1
if-info: eth0,up,,8,fe80::f816:3eff:feec:77dc
=== datasource: configdrive local ===
instance-id: 11a83228-beef-4c0e-a7a3-15748de381e9
name: kvm-218-admin-3
availability-zone: nova
local-hostname: kvm-218-admin-3.novalocal
launch-index: 0
=== cirros: current=0.3.4 uptime=191.83 ===
route: fscanf
=== pinging gateway failed, debugging connection ===
############ debug start ##############
### /etc/init.d/sshd start
Starting dropbear sshd: OK
route: fscanf
### ifconfig -a
eth0 Link encap:Ethernet HWaddr FA:16:3E:EC:77:DC
inet6 addr: fe80::f816:3eff:feec:77dc/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:84 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3840 (3.7 KiB) TX bytes:1150 (1.1 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1020 (1020.0 B) TX bytes:1020 (1020.0 B)
### route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
route: fscanf
### cat /etc/resolv.conf
cat: can't open '/etc/resolv.conf': No such file or directory
### gateway not found
/sbin/cirros-status: line 1: can't open /etc/resolv.conf: no such file
### pinging nameservers
### uname -a
Linux kvm-218-admin-3 3.2.0-80-virtual #116-Ubuntu SMP Mon Mar 23 17:28:52 UTC 2015 x86_64 GNU/Linux
### lsmod
Module Size Used by Not tainted
nls_iso8859_1 12713 0
nls_cp437 16991 0
vfat 17585 0
fat 61512 1 vfat
isofs 40259 0
ip_tables 27473 0
x_tables 29891 1 ip_tables
pcnet32 42119 0
8139cp 27360 0
ne2k_pci 13691 0
8390 18856 1 ne2k_pci
e1000 108589 0
acpiphp 24231 0
### dmesg | tail
[ 2.581084] EXT3-fs (vda): mounted filesystem with ordered data mode
[ 2.746639] ISO 9660 Extensions: Microsoft Joliet Level 3
[ 2.790563] ISO 9660 Extensions: RRIP_1991A
[ 2.955492] kjournald starting. Commit interval 5 seconds
[ 2.973249] EXT3-fs (vda): using internal journal
[ 2.973330] EXT3-fs (vda): mounted filesystem with ordered data mode
[ 5.386057] EXT3-fs (vda): using internal journal
[ 7.067755] ISO 9660 Extensions: Microsoft Joliet Level 3
[ 7.151524] ISO 9660 Extensions: RRIP_1991A
[ 19.008702] eth0: no IPv6 routers present
### tail -n 25 /var/log/messages
Feb 3 11:31:14 cirros kern.info kernel: [ 2.267494] acpiphp: Slot [25] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.267701] acpiphp: Slot [26] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.267899] acpiphp: Slot [27] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268298] acpiphp: Slot [28] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268528] acpiphp: Slot [29] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268721] acpiphp: Slot [30] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268907] acpiphp: Slot [31] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.302428] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
Feb 3 11:31:14 cirros kern.info kernel: [ 2.302460] e1000: Copyright (c) 1999-2006 Intel Corporation.
Feb 3 11:31:14 cirros kern.info kernel: [ 2.325543] ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
Feb 3 11:31:14 cirros kern.info kernel: [ 2.348426] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
Feb 3 11:31:14 cirros kern.info kernel: [ 2.375341] pcnet32: pcnet32.c:v1.35 21.Apr.2008 tsbogend@xxxxxxxxxxxxxxxx
Feb 3 11:31:14 cirros kern.info kernel: [ 2.400792] ip_tables: (C) 2000-2006 Netfilter Core Team
Feb 3 11:31:14 cirros kern.info kernel: [ 2.579114] kjournald starting. Commit interval 5 seconds
Feb 3 11:31:14 cirros kern.info kernel: [ 2.581084] EXT3-fs (vda): mounted filesystem with ordered data mode
Feb 3 11:31:14 cirros kern.debug kernel: [ 2.746639] ISO 9660 Extensions: Microsoft Joliet Level 3
Feb 3 11:31:14 cirros kern.debug kernel: [ 2.790563] ISO 9660 Extensions: RRIP_1991A
Feb 3 11:31:14 cirros kern.info kernel: [ 2.955492] kjournald starting. Commit interval 5 seconds
Feb 3 11:31:14 cirros kern.info kernel: [ 2.973249] EXT3-fs (vda): using internal journal
Feb 3 11:31:14 cirros kern.info kernel: [ 2.973330] EXT3-fs (vda): mounted filesystem with ordered data mode
Feb 3 11:31:14 cirros kern.info kernel: [ 5.386057] EXT3-fs (vda): using internal journal
Feb 3 11:31:15 cirros kern.debug kernel: [ 7.067755] ISO 9660 Extensions: Microsoft Joliet Level 3
Feb 3 11:31:15 cirros kern.debug kernel: [ 7.151524] ISO 9660 Extensions: RRIP_1991A
Feb 3 11:31:27 cirros kern.debug kernel: [ 19.008702] eth0: no IPv6 routers present
Feb 3 11:34:19 cirros authpriv.info dropbear[261]: Running in background
############ debug end ##############
____ ____ ____
/ __/ __ ____ ____ / __ \/ __/
/ /__ / // __// __// /_/ /\ \
\___//_//_/ /_/ \____/___/
http://cirros-cloud.net
login as 'cirros' user. default password: 'cubswin:)'. use 'sudo' for root.
kvm-218-admin-3 login:
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1548190
Title:
dhcp-agent performance issue
Status in neutron:
New
Bug description:
Single controller[48 GB RAM, 16vCPU, 120GB Disk]
1 Network Node
100 ESX hypervisors distributed in 10 nova-compute nodes
100 KVM hypervisors
Test:
1. Create /16 network
2. Template to spawn 500 instances on network created in step 1
Observation:
1. Tried spinning 500 instances on KVM and 500 instances on ESX hypervisor using CirrOS qemu & vmdk image respectively
2. Stack creation went through fine, but randomly few VM's missed dhcp IP on first boot
3. After forcing sudo udhcpc from guest OS, VM's was able to receive IP
4. Tried tcpdump on network dhcp port on network node, I could able to see only dhcp request coming from guest VM but there was no dhcp offer or ack from dhcp agent.
Assume this issue is due to scale test on this setup.
Please let me know if you need any particular logs or details to debug
this issue, I can work and re-create issue and collect logs.
Logs:
Network node:
stack@network:/opt/stack/neutron$ git log -1
commit 171351d5fb55f96e9ed9e56eab755f2717ee67b0
Merge: 0a5a3cf 8f15d72
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date: Fri Feb 19 07:25:52 2016 +0000
Merge "Adopt Grafana to plot Neutron Failure Rates"
stack@network:/opt/stack/neutron$
stack@network:/opt/stack/neutron$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.3 LTS"
stack@network:/opt/stack/neutron$
stack@network:/opt/stack/neutron$ dpkg -l | grep dhcp
ii isc-dhcp-client 4.2.4-7ubuntu12.3 amd64 ISC DHCP client
ii isc-dhcp-common 4.2.4-7ubuntu12.3 amd64 common files used by all the isc-dhcp* packages
stack@network:/opt/stack/neutron$
nobody 32359 0.0 0.0 28212 2324 ? S Feb19 0:00
dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo
--pid-file=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/pid --dhcp-
hostsfile=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/host --addn-
hosts=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/addn_hosts --dhcp-
optsfile=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/opts --dhcp-
leasefile=/opt/stack/data/neutron/dhcp/dba9d787-d145-4b2b-8c5d-
d1b3375935ba/leases --dhcp-match=set:ipxe,175 --bind-interfaces
--interface=ns-460a72e9-8f --dhcp-
range=set:tag0,169.254.169.252,static,86400s --dhcp-lease-max=4
--conf-file= --domain=openstacklocal
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgmNQGIVf2P+n6U3yYkWwv/V598atmu3ZCHN8XRHIgujUkKKJ5IaurmDFH9XqSJhB2B+90ZP/z0bTqkI0to7QuoFw0GzyV2og59cZZrjntimdoAdo7RZKuwtyQZoIOHILBE2On/y1F3ixhVeE5j78GuiDN+yCNiIBhAnm0V9d0QxGtqc= root@kvm-218-admin-3
ssh-dss AAAAB3NzaC1kc3MAAACBAJNHiG1dQxMG+vmlJEWY1xgak5xHGFzm+QFEkFXOaRzdzPw8y9DX/O+bxnrcaSKwKHBJcKaJZJ+SHnVEpX2QgQQA+uSSMCMKQFISjdTCtTQ/0tUiLculV/ISrXDCm9cHG6DWqplVoc/LmXJObh1NZrjPunW2NrUCoI2qXtFMbdiVAAAAFQC4GI/n1hxNcGEjkzJ2GRRcenU+hwAAAIAjJ09CspilFn7DintV3Sh7lCs1I5+m+mQcI/AJeLSE+Ee04Nw47KWgrFyoaCQNYyHt1wgONlTmcKYyDxUGwjbRTyZYEwdaxgd+klxt7NP6fR2kmuzCLiyFP+W2WNJcqRr6xxlRif4TOwbOd/Qlt0JeM1LufaxyuogSBgWS/kys6AAAAIAIPMlAPi41qCGhUnUkbx23OrzqkP+xZEj7cGnxa/HxpnIkZ+qMhRIhBMFADEt4MKKmdZyTPYDiSaiBR298ic6ywekA1YxAqwCEtKlA+7Cdl9q7REIPVh1CSWZdJEKi9qKBMYaMzuw7cbkdh5ZbOMHWALS8JgndYGD77xWQwKamCA== root@kvm-218-admin-3
-----END SSH HOST KEY KEYS-----
=== network info ===
if-info: lo,up,127.0.0.1,8,::1
if-info: eth0,up,,8,fe80::f816:3eff:feec:77dc
=== datasource: configdrive local ===
instance-id: 11a83228-beef-4c0e-a7a3-15748de381e9
name: kvm-218-admin-3
availability-zone: nova
local-hostname: kvm-218-admin-3.novalocal
launch-index: 0
=== cirros: current=0.3.4 uptime=191.83 ===
route: fscanf
=== pinging gateway failed, debugging connection ===
############ debug start ##############
### /etc/init.d/sshd start
Starting dropbear sshd: OK
route: fscanf
### ifconfig -a
eth0 Link encap:Ethernet HWaddr FA:16:3E:EC:77:DC
inet6 addr: fe80::f816:3eff:feec:77dc/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:84 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3840 (3.7 KiB) TX bytes:1150 (1.1 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1020 (1020.0 B) TX bytes:1020 (1020.0 B)
### route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
route: fscanf
### cat /etc/resolv.conf
cat: can't open '/etc/resolv.conf': No such file or directory
### gateway not found
/sbin/cirros-status: line 1: can't open /etc/resolv.conf: no such file
### pinging nameservers
### uname -a
Linux kvm-218-admin-3 3.2.0-80-virtual #116-Ubuntu SMP Mon Mar 23 17:28:52 UTC 2015 x86_64 GNU/Linux
### lsmod
Module Size Used by Not tainted
nls_iso8859_1 12713 0
nls_cp437 16991 0
vfat 17585 0
fat 61512 1 vfat
isofs 40259 0
ip_tables 27473 0
x_tables 29891 1 ip_tables
pcnet32 42119 0
8139cp 27360 0
ne2k_pci 13691 0
8390 18856 1 ne2k_pci
e1000 108589 0
acpiphp 24231 0
### dmesg | tail
[ 2.581084] EXT3-fs (vda): mounted filesystem with ordered data mode
[ 2.746639] ISO 9660 Extensions: Microsoft Joliet Level 3
[ 2.790563] ISO 9660 Extensions: RRIP_1991A
[ 2.955492] kjournald starting. Commit interval 5 seconds
[ 2.973249] EXT3-fs (vda): using internal journal
[ 2.973330] EXT3-fs (vda): mounted filesystem with ordered data mode
[ 5.386057] EXT3-fs (vda): using internal journal
[ 7.067755] ISO 9660 Extensions: Microsoft Joliet Level 3
[ 7.151524] ISO 9660 Extensions: RRIP_1991A
[ 19.008702] eth0: no IPv6 routers present
### tail -n 25 /var/log/messages
Feb 3 11:31:14 cirros kern.info kernel: [ 2.267494] acpiphp: Slot [25] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.267701] acpiphp: Slot [26] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.267899] acpiphp: Slot [27] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268298] acpiphp: Slot [28] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268528] acpiphp: Slot [29] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268721] acpiphp: Slot [30] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.268907] acpiphp: Slot [31] registered
Feb 3 11:31:14 cirros kern.info kernel: [ 2.302428] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
Feb 3 11:31:14 cirros kern.info kernel: [ 2.302460] e1000: Copyright (c) 1999-2006 Intel Corporation.
Feb 3 11:31:14 cirros kern.info kernel: [ 2.325543] ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
Feb 3 11:31:14 cirros kern.info kernel: [ 2.348426] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
Feb 3 11:31:14 cirros kern.info kernel: [ 2.375341] pcnet32: pcnet32.c:v1.35 21.Apr.2008 tsbogend@xxxxxxxxxxxxxxxx
Feb 3 11:31:14 cirros kern.info kernel: [ 2.400792] ip_tables: (C) 2000-2006 Netfilter Core Team
Feb 3 11:31:14 cirros kern.info kernel: [ 2.579114] kjournald starting. Commit interval 5 seconds
Feb 3 11:31:14 cirros kern.info kernel: [ 2.581084] EXT3-fs (vda): mounted filesystem with ordered data mode
Feb 3 11:31:14 cirros kern.debug kernel: [ 2.746639] ISO 9660 Extensions: Microsoft Joliet Level 3
Feb 3 11:31:14 cirros kern.debug kernel: [ 2.790563] ISO 9660 Extensions: RRIP_1991A
Feb 3 11:31:14 cirros kern.info kernel: [ 2.955492] kjournald starting. Commit interval 5 seconds
Feb 3 11:31:14 cirros kern.info kernel: [ 2.973249] EXT3-fs (vda): using internal journal
Feb 3 11:31:14 cirros kern.info kernel: [ 2.973330] EXT3-fs (vda): mounted filesystem with ordered data mode
Feb 3 11:31:14 cirros kern.info kernel: [ 5.386057] EXT3-fs (vda): using internal journal
Feb 3 11:31:15 cirros kern.debug kernel: [ 7.067755] ISO 9660 Extensions: Microsoft Joliet Level 3
Feb 3 11:31:15 cirros kern.debug kernel: [ 7.151524] ISO 9660 Extensions: RRIP_1991A
Feb 3 11:31:27 cirros kern.debug kernel: [ 19.008702] eth0: no IPv6 routers present
Feb 3 11:34:19 cirros authpriv.info dropbear[261]: Running in background
############ debug end ##############
____ ____ ____
/ __/ __ ____ ____ / __ \/ __/
/ /__ / // __// __// /_/ /\ \
\___//_//_/ /_/ \____/___/
http://cirros-cloud.net
login as 'cirros' user. default password: 'cubswin:)'. use 'sudo' for root.
kvm-218-admin-3 login:
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1548190/+subscriptions
Follow ups