yahoo-eng-team team mailing list archive
Message #43959
[Bug 1531013] [NEW] Duplicate entries in FDB table
Public bug reported:
Posting here, because I'm not sure of a better place at the moment.
Environment: Juno
OS: Ubuntu 14.04 LTS
Plugin: ML2/LinuxBridge
root@infra01_neutron_agents_container-4c850328:~# bridge -V
bridge utility, 0.0
root@infra01_neutron_agents_container-4c850328:~# ip -V
ip utility, iproute2-ss131122
root@infra01_neutron_agents_container-4c850328:~# uname -a
Linux infra01_neutron_agents_container-4c850328 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
We recently discovered that across the environment (5 controller nodes,
50+ compute nodes) there are thousands to tens of thousands of duplicate
entries in the FDB table, but only for the all-zeros 00:00:00:00:00:00
broadcast entries. The environment has ~1,600 instances, ~4,100 ports,
and 80 networks.
In this example, the number of duplicate FDB entries for one particular
VTEP fluctuates wildly between consecutive runs of the same command:
root@infra01_neutron_agents_container-4c850328:~# bridge fdb show | grep "00:00:00:00:00:00 dev vxlan-10 dst 172.29.243.157" | wc -l
1429
root@infra01_neutron_agents_container-4c850328:~# bridge fdb show | grep "00:00:00:00:00:00 dev vxlan-10 dst 172.29.243.157" | wc -l
81057
root@infra01_neutron_agents_container-4c850328:~# bridge fdb show | grep "00:00:00:00:00:00 dev vxlan-10 dst 172.29.243.157" | wc -l
25806
root@infra01_neutron_agents_container-4c850328:~# bridge fdb show | grep "00:00:00:00:00:00 dev vxlan-10 dst 172.29.243.157" | wc -l
473141
root@infra01_neutron_agents_container-4c850328:~# bridge fdb show | grep "00:00:00:00:00:00 dev vxlan-10 dst 172.29.243.157" | wc -l
225472
That behavior can be observed for all other VTEPs. We're seeing over 13
million total FDB entries on this node:
root@infra01_neutron_agents_container-4c850328:~# bridge fdb show >> james_fdb2.txt
root@infra01_neutron_agents_container-4c850328:~# cat james_fdb2.txt | wc -l
13554258
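Rather than grepping for one VTEP at a time, the saved dump can be tallied for every VTEP in a single pass. The following is only a minimal sketch: the function name flood_entries_per_vtep is made up for illustration, and it assumes each line has the same "MAC dev DEVICE dst IP ..." layout that 'bridge fdb show' prints in the output quoted in this report.

```python
from collections import Counter

ALL_ZEROS_MAC = "00:00:00:00:00:00"


def flood_entries_per_vtep(lines):
    """Count all-zeros FDB entries per (device, destination VTEP) pair.

    `lines` is any iterable of text lines in the layout shown in this
    report, e.g. an open file holding saved `bridge fdb show` output.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != ALL_ZEROS_MAC:
            continue
        try:
            # Take the values that follow the "dev" and "dst" keywords.
            dev = fields[fields.index("dev") + 1]
            dst = fields[fields.index("dst") + 1]
        except (ValueError, IndexError):
            # Skip entries that lack either keyword or its value.
            continue
        counts[(dev, dst)] += 1
    return counts
```

Passing an open handle on james_fdb2.txt and calling most_common() on the result would list the VTEPs with the most all-zeros entries first. Iterating over the file line by line avoids loading all 13 million lines into memory at once.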
We're also seeing wildly varying counts on compute nodes. Each of these
commands was run within 1 second of the previous one completing:
root@compute032:~# bridge fdb show | wc -l
898981
root@compute032:~# bridge fdb show | wc -l
734916
root@compute032:~# bridge fdb show | wc -l
1483081
root@compute032:~# bridge fdb show | wc -l
508811
root@compute032:~# bridge fdb show | wc -l
2349221
On this node, you can see over 28,000 duplicates for each of the
entries:
root@compute032:~# bridge fdb show | sort | uniq -c | sort -nr
28871 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.39 self permanent
28871 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.38 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.243.252 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.243.157 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.243.133 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.242.66 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.242.193 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.60 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.59 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.58 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.57 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.55 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.54 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.53 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.51 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.50 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.49 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.48 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.47 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.46 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.45 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.44 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.43 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.42 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.40 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.37 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.36 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.35 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.34 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.33 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.32 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.31 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.30 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.29 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.28 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.27 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.26 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.25 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.24 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.23 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.22 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.21 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.137 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.132 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.131 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.130 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.129 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.128 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.127 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.107 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.106 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.105 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.104 self permanent
28870 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.103 self permanent
28869 00:00:00:00:00:00 dev vxlan-15 dst 172.29.240.136 self permanent
All entries for the other VXLAN networks on this node have 2 duplicates
per VTEP, but the count varies widely across the environment.
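For anyone who wants to post-process a saved dump instead of re-running the pipeline, the 'sort | uniq -c | sort -nr' step above can be approximated in Python. This is a rough sketch under the assumption that duplicates are byte-for-byte identical lines once surrounding whitespace is stripped. The name count_fdb_duplicates is hypothetical, and unlike the shell pipeline it reports only entries that occur more than once.

```python
from collections import Counter


def count_fdb_duplicates(lines):
    """Return (count, entry) pairs for FDB lines seen more than once.

    Results are ordered by descending count, then by entry text, which
    mirrors `sort | uniq -c | sort -nr` restricted to repeated lines.
    """
    counts = Counter(line.strip() for line in lines if line.strip())
    duplicates = [(n, entry) for entry, n in counts.items() if n > 1]
    return sorted(duplicates, key=lambda pair: (-pair[0], pair[1]))
```

Each returned pair can be printed as "count entry" to get output comparable to the listing above.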
Using the 'bridge monitor fdb' command, I am unable to see this behavior
in action, and there is nothing unusual in the syslog other than
messages like this:
2016-01-04T22:52:02.040435+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343454.013037] vxlan: non-ECT from 172.29.240.39 with TOS=0x2
2016-01-04T22:52:12.120434+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343464.105158] vxlan: non-ECT from 172.29.240.126 with TOS=0x2
2016-01-04T22:52:12.200251+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343464.185067] vxlan: non-ECT from 172.29.240.104 with TOS=0x2
2016-01-04T22:52:32.295703+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343484.298660] net_ratelimit: 689 callbacks suppressed
2016-01-04T22:52:32.355418+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343484.359395] vxlan: non-ECT from 172.29.240.133 with TOS=0x2
2016-01-04T22:52:37.352455+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343489.358137] vxlan: non-ECT from 172.29.240.60 with TOS=0x2
2016-01-04T22:52:37.494525+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343489.503365] vxlan: non-ECT from 172.29.240.125 with TOS=0x2
2016-01-04T22:52:37.526831+00:00 infra01_neutron_agents_container-4c850328 kernel: [25343489.535736] vxlan: non-ECT from 172.29.240.127 with TOS=0x2
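To see whether these messages cluster around particular VTEPs, the syslog lines can be tallied by source address. The sketch below assumes the exact message wording shown above. The function name summarize_vxlan_log and both regular expressions are illustrative, not taken from any existing tool.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this report.
NON_ECT_RE = re.compile(r"vxlan: non-ECT from (\S+) with TOS=(0x[0-9a-fA-F]+)")
SUPPRESSED_RE = re.compile(r"net_ratelimit: (\d+) callbacks suppressed")


def summarize_vxlan_log(lines):
    """Tally non-ECT messages per source IP and sum suppressed callbacks.

    Returns (per_source, suppressed_total), where per_source is a Counter
    keyed by source address and suppressed_total is the sum of every
    "N callbacks suppressed" figure reported by net_ratelimit.
    """
    per_source = Counter()
    suppressed_total = 0
    for line in lines:
        match = NON_ECT_RE.search(line)
        if match:
            per_source[match.group(1)] += 1
            continue
        match = SUPPRESSED_RE.search(line)
        if match:
            suppressed_total += int(match.group(1))
    return per_source, suppressed_total
```

Because the kernel rate-limits these messages, the per-source counts are only a lower bound. The suppressed total gives a rough idea of how many messages were dropped from the log.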
If additional info is needed, please let me know.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1531013
Title:
Duplicate entries in FDB table
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1531013/+subscriptions