yahoo-eng-team team mailing list archive
Message #91151
[Bug 2004004] [NEW] keepalived virtual_routes wrong order
Public bug reported:
Neutron version: 13.0.6 (I will try to test this with version Zed as well)
ML2: OVS
1. Create a provider network (e.g. named "public").
2. Create a subnet pool (e.g. 100.100.100.32/27).
3. Create a first /29 subnet from that subnet pool on network "public": 100.100.100.32/29 --gateway 100.100.100.33
4. Create a second /29 subnet from that subnet pool on network "public": 100.100.100.40/29 --gateway 100.100.100.33
NOTE: the "physical" gateway of the whole subnet pool is 100.100.100.33,
so the gateway of the second subnet lies inside the range of the first subnet!
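The overlap in the note can be checked with Python's standard ipaddress module. This is only an illustrative check using the example addresses from the steps above; the variable names are made up for this sketch and do not come from neutron.

```python
import ipaddress

# Example values taken from the reproduction steps above.
gateway = ipaddress.ip_address("100.100.100.33")
first_subnet = ipaddress.ip_network("100.100.100.32/29")
second_subnet = ipaddress.ip_network("100.100.100.40/29")

# The shared gateway is on-link for the first subnet only.
gateway_in_first = gateway in first_subnet
gateway_in_second = gateway in second_subnet

print("gateway in first subnet: ", gateway_in_first)
print("gateway in second subnet:", gateway_in_second)
```

A router port that only holds an address from the second subnet therefore has no connected route covering the gateway.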
neutron_l3_agent will create a keepalived.conf like:
global_defs {
    notification_email_from neutron@openstack.local
    router_id neutron
}
vrrp_script ha_health_check_186 {
    script "/var/lib/neutron/ha_confs/f9ed7361-29b2-48e1-a96b-1a2919062021/ha_check_script_186.sh"
    interval 5
    fall 2
    rise 2
}
vrrp_instance VR_186 {
    state BACKUP
    interface ha-f3350150-28
    virtual_router_id 186
    priority 50
    garp_master_delay 60
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass somepass
    }
    track_interface {
        ha-f3350150-28
    }
    virtual_ipaddress {
        169.254.0.186/24 dev ha-f3350150-28
    }
    virtual_ipaddress_excluded {
        192.168.199.1/24 dev qr-24c07a36-4f
        100.100.100.34/32 dev qg-7b9963a7-72
        100.100.100.42/32 dev qg-7b9963a7-72
        100.100.100.43/29 dev qg-7b9963a7-72
        fe80::xxxx:xxxx:xxxx:xxxx/64 dev qg-7b9963a7-72 scope link
        fe80::xxxx:xxxx:xxxx:xxxx/64 dev qr-24c07a36-4f scope link
    }
    virtual_routes {
        0.0.0.0/0 via 100.100.100.33 dev qg-7b9963a7-72
        100.100.100.32/29 dev qg-7b9963a7-72 scope link
    }
    track_script {
        ha_health_check_186
    }
}
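So keepalived will try to create the default route BEFORE the route "100.100.100.32/29 dev qg-7b9963a7-72 scope link" has been created. At that point the next hop 100.100.100.33 is not yet on-link for qg-7b9963a7-72 (the only connected prefix on that interface is 100.100.100.40/29), so adding the default route fails with an error like: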
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: (VR_186) Receive advertisement timeout
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: (VR_186) Entering MASTER STATE
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: (VR_186) setting VIPs.
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: (VR_186) setting E-VIPs.
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: (VR_186) setting Virtual Routes
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: Netlink: error: Network is unreachable(101), type=RTM_NEWROUTE(24), seq=1674751752, pid=0
Jan 26 17:49:56 xxxxxxxx Keepalived_vrrp[1532]: (VR_186) Sending/queueing gratuitous ARPs on ha-f3350150-28 for 169.254.0.186
...
And the default route will be missing:
169.254.0.0/24 dev ha-a1f79365-fc proto kernel scope link src 169.254.0.186
169.254.192.0/18 dev ha-a1f79365-fc proto kernel scope link src 169.254.192.14
192.168.199.0/24 dev qr-24c07a36-4f proto kernel scope link src 192.168.199.1
100.100.100.32/29 dev qg-7b9963a7-72 scope link
100.100.100.40/29 dev qg-7b9963a7-72 proto kernel scope link src 100.100.100.43
Changing the order of "virtual_routes" will fix this issue:
    }
    virtual_routes {
        100.100.100.32/29 dev qg-7b9963a7-72 scope link
        0.0.0.0/0 via 100.100.100.33 dev qg-7b9963a7-72
    }
The error message is now gone:
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: (VR_186) Receive advertisement timeout
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: (VR_186) Entering MASTER STATE
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: (VR_186) setting VIPs.
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: (VR_186) setting E-VIPs.
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: (VR_186) setting Virtual Routes
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: (VR_186) Sending/queueing gratuitous ARPs on ha-f3350150-28 for 169.254.0.186
Jan 27 10:09:31 xxxxxxxxx Keepalived_vrrp[1532]: Sending gratuitous ARP on ha-f3350150-28 for 169.254.0.186
And the routing table looks fine:
default via 100.100.100.33 dev qg-7b9963a7-72
169.254.0.0/24 dev ha-a1f79365-fc proto kernel scope link src 169.254.0.186
169.254.192.0/18 dev ha-a1f79365-fc proto kernel scope link src 169.254.192.14
192.168.199.0/24 dev qr-24c07a36-4f proto kernel scope link src 192.168.199.1
100.100.100.32/29 dev qg-7b9963a7-72 scope link
100.100.100.40/29 dev qg-7b9963a7-72 proto kernel scope link src 100.100.100.43
To me, it looks like the order in which neutron/agent/linux/keepalived.py
writes the "virtual_routes" entries has to be changed?
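One possible way to get that ordering, sketched as standalone Python rather than as a patch to neutron: sort the routes so that on-link routes (those without a next hop) come before routes that go through a gateway. The VirtualRoute class and the order_routes helper below are hypothetical names invented for this sketch; they are not taken from neutron/agent/linux/keepalived.py, and the real code may need a different approach.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VirtualRoute:
    """Hypothetical stand-in for one entry of a virtual_routes block."""
    destination: str
    interface: str
    nexthop: Optional[str] = None
    scope: Optional[str] = None

    def render(self) -> str:
        # Build a line in the same shape as the keepalived.conf entries above.
        parts = [self.destination]
        if self.nexthop:
            parts += ["via", self.nexthop]
        parts += ["dev", self.interface]
        if self.scope:
            parts += ["scope", self.scope]
        return " ".join(parts)


def order_routes(routes: List[VirtualRoute]) -> List[VirtualRoute]:
    # sorted() is stable and False sorts before True, so routes without a
    # next hop (on-link routes) are emitted first and the relative order
    # inside each group is preserved.
    return sorted(routes, key=lambda r: r.nexthop is not None)


# The two routes from the generated config, in the problematic order.
routes = [
    VirtualRoute("0.0.0.0/0", "qg-7b9963a7-72", nexthop="100.100.100.33"),
    VirtualRoute("100.100.100.32/29", "qg-7b9963a7-72", scope="link"),
]
rendered = [r.render() for r in order_routes(routes)]
print("\n".join(rendered))
```

With the two example routes this prints the link-scope route first and the default route second, which is the order that worked in the manual test above.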
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2004004
Title:
keepalived virtual_routes wrong order
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2004004/+subscriptions