[Bug 2004041] [NEW] Missing flows with ovs dvr after openvswitch restart
Public bug reported:
Certain flows are missing in a distributed OpenStack setup after a restart of openvswitch.
I have tested this on OpenStack Ussuri deployed with kolla-ansible on Ubuntu Bionic, so there is a chance that this has either already been fixed or is caused by specifics of the deployment.
## Steps to reproduce
There might be a simpler reproducer, but this is what I did:
* Set up a distributed OpenStack with at least one control node and two compute nodes
* Configure neutron with OVS and DVR
* Configure octavia with the amphora driver
* Set up an external network as the floating IP pool
* Create an instance with an HTTP server
* Create a loadbalancer with an HTTP listener/pool
* Add the instance as a pool member to the loadbalancer
* Attach a floating IP to the loadbalancer's virtual IP
* Make sure that the loadbalancer amphora and the instance are on different compute nodes
* Ensure that you can make an HTTP request, e.g.:
```
# curl -I http://${FLOATING_IP}
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Fri, 27 Jan 2023 15:00:00 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Fri, 27 Jan 2023 13:45:11 GMT
ETag: "63d3d567-264"
Accept-Ranges: bytes
0 612 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
```
* Restart openvswitch
```
# docker restart openvswitch_vswitchd
openvswitch_vswitchd
```
* Observe that the connection fails with, e.g.:
```
# curl -I http://${FLOATING_IP}
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0
curl: (7) Failed to connect to ${FLOATING_IP} port 80: No route to host
```
* Connections are re-established only after restarting neutron-openvswitch-agent
## Flows before and after restart of openvswitch
Looking at the flows on the tunnel bridge (br-tun) of the controller node, one can see that flows are missing after restarting openvswitch:
```
# docker exec openvswitch_vswitchd ovs-ofctl dump-flows br-tun > before_ovs_restart.log
# docker restart openvswitch_vswitchd
openvswitch_vswitchd
# docker exec openvswitch_vswitchd ovs-ofctl dump-flows br-tun > after_ovs_restart.log
# awk '{print $3" "$(NF)}' < before_ovs_restart.log > before_ovs_restart_cleaned.log
# awk '{print $3" "$(NF)}' < after_ovs_restart.log > after_ovs_restart_cleaned.log
# diff before_ovs_restart_cleaned.log after_ovs_restart_cleaned.log
3,4d2
< table=0, actions=resubmit(,4)
< table=0, actions=resubmit(,4)
6,7d3
< table=1, actions=drop
< table=1, actions=mod_dl_src:fa:16:3f:56:bb:5a,resubmit(,2)
13d8
< table=4, actions=mod_vlan_vid:53,resubmit(,9)
20,22d14
< table=20, actions=strip_vlan,load:0x2ed->NXM_NX_TUN_ID[],output:22
< table=20, actions=strip_vlan,load:0x2ed->NXM_NX_TUN_ID[],output:23
< table=20, actions=load:0->NXM_OF_VLAN_TCI[],load:0x2ed->NXM_NX_TUN_ID[],output:22
24,25d15
< table=21, actions=load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163eb4cf96->NXM_NX_ARP_SHA[],load:0xa000165->NXM_OF_ARP_SPA[],move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:b4:cf:96,IN_PORT
< table=21, actions=load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e77e67e->NXM_NX_ARP_SHA[],load:0xa0000a3->NXM_OF_ARP_SPA[],move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:77:e6:7e,IN_PORT
27,28d16
< table=22, actions=drop
< table=22, actions=strip_vlan,load:0x2ed->NXM_NX_TUN_ID[],output:22,output:23
```
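The awk filter above keeps only the table and actions columns, so flows that differ only in their match fields look identical in the diff (for example the two `table=0, actions=resubmit(,4)` lines). A possible alternative is sketched below: it keeps the match fields and drops only the counters that change on every dump. The function name `normalize_flows` is hypothetical, and the field names (`cookie`, `duration`, `n_packets`, `n_bytes`, `idle_age`, `hard_age`) assume the usual `ovs-ofctl dump-flows` line layout; adjust them if your output differs.

```shell
# Hypothetical helper: reads an ovs-ofctl flow dump on stdin and prints
# one normalized flow per line, sorted, with volatile fields removed.
normalize_flows() {
    sed -E \
        -e '/^(NXST_FLOW|OFPST_FLOW)/d' \
        -e 's/^[[:space:]]+//' \
        -e 's/(cookie|duration|n_packets|n_bytes|idle_age|hard_age)=[^, ]+,? ?//g' \
        | sort
}

# Example usage (bash, because of process substitution), assuming the
# two dump files from the steps above exist:
#   diff <(normalize_flows < before_ovs_restart.log) \
#        <(normalize_flows < after_ovs_restart.log)
```

Removing `cookie` as well is a deliberate choice in this sketch, on the assumption that the agent's cookie may differ between runs; drop it from the pattern if the cookie is relevant to the comparison.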
Please let me know if you need more information. I also have a heat
stack which automates the openstack resource part of the reproducer, in
case this makes things easier.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2004041
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2004041/+subscriptions