yahoo-eng-team team mailing list archive

[Bug 1833653] [NEW] We should cleanup ipv4 address if keepalived is dead

 

Public bug reported:

If a router's keepalived process dies (kill -9 <pid>, or kill -HUP <pid> too many times) on the node that originally holds the master role, this causes a split brain. When we then restart neutron-l3-agent, the IPv6 addresses are cleaned up, but the IPv4 addresses still exist. I think we should also clean up the IPv4 addresses before enabling keepalived.
The current state:
original master:
[root@node-1 ~]# ip netns exec qrouter-a8f9d4e4-622d-4ebf-b7a9-4818f63ef502 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
28: ha-66734f93-5e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:fd:49:a1 brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.1/18 brd 169.254.255.255 scope global ha-66734f93-5e
       valid_lft forever preferred_lft forever
    inet 169.254.0.1/24 scope global ha-66734f93-5e
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fefd:49a1/64 scope link
       valid_lft forever preferred_lft forever
29: qr-4f77a86c-c2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:26:98:8f brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 scope global qr-4f77a86c-c2
       valid_lft forever preferred_lft forever
46: qg-e28d50e4-ba: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:c4:3e:43 brd ff:ff:ff:ff:ff:ff
    inet 172.16.10.133/24 scope global qg-e28d50e4-ba
       valid_lft forever preferred_lft forever
    inet 172.16.10.134/32 scope global qg-e28d50e4-ba
       valid_lft forever preferred_lft forever

current master:
[root@node-2 ~]# ip netns exec qrouter-a8f9d4e4-622d-4ebf-b7a9-4818f63ef502 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
32: ha-606f2f23-5f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:8e:dc:c4 brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.2/18 brd 169.254.255.255 scope global ha-606f2f23-5f
       valid_lft forever preferred_lft forever
    inet 169.254.0.1/24 scope global ha-606f2f23-5f
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe8e:dcc4/64 scope link
       valid_lft forever preferred_lft forever
33: qr-4f77a86c-c2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:26:98:8f brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 scope global qr-4f77a86c-c2
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe26:988f/64 scope link nodad
       valid_lft forever preferred_lft forever
50: qg-e28d50e4-ba: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:c4:3e:43 brd ff:ff:ff:ff:ff:ff
    inet 172.16.10.133/24 scope global qg-e28d50e4-ba
       valid_lft forever preferred_lft forever
    inet 172.16.10.134/32 scope global qg-e28d50e4-ba
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fec4:3e43/64 scope link nodad
       valid_lft forever preferred_lft forever

The command line output shows that the original master's IPv6 addresses
were cleaned up, while its IPv4 addresses are still present.
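
To make the duplication easier to spot, here is a minimal Python sketch that parses two `ip a` outputs and lists the IPv4 addresses configured on both nodes. It is only an illustration: the function names (`ipv4_by_interface`, `duplicated_ipv4`) are hypothetical and are not part of neutron, and the sample strings are a trimmed excerpt of the output above.

```python
import re

# Matches interface header lines such as "28: ha-66734f93-5e: <BROADCAST,...".
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)")
# Matches IPv4 lines such as "    inet 169.254.0.1/24 scope global ...".
# "inet6" lines do not match because whitespace must follow "inet".
INET_RE = re.compile(r"^\s+inet\s+(\S+)")


def ipv4_by_interface(ip_a_output):
    """Map each interface name to the IPv4 CIDRs listed under it."""
    result = {}
    current = None
    for line in ip_a_output.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        inet = INET_RE.match(line)
        if inet and current is not None:
            result[current].append(inet.group(1))
    return result


def duplicated_ipv4(output_a, output_b, ignore=("lo",)):
    """Return the sorted IPv4 CIDRs that appear in both outputs."""
    def collect(output):
        return {
            cidr
            for iface, cidrs in ipv4_by_interface(output).items()
            if iface not in ignore
            for cidr in cidrs
        }
    return sorted(collect(output_a) & collect(output_b))


# Trimmed excerpt of the node-1 (original master) output in this report.
NODE_1 = """\
28: ha-66734f93-5e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN qlen 1000
    inet 169.254.192.1/18 brd 169.254.255.255 scope global ha-66734f93-5e
    inet 169.254.0.1/24 scope global ha-66734f93-5e
    inet6 fe80::f816:3eff:fefd:49a1/64 scope link
46: qg-e28d50e4-ba: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    inet 172.16.10.133/24 scope global qg-e28d50e4-ba
    inet 172.16.10.134/32 scope global qg-e28d50e4-ba
"""

# Trimmed excerpt of the node-2 (current master) output in this report.
NODE_2 = """\
32: ha-606f2f23-5f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN qlen 1000
    inet 169.254.192.2/18 brd 169.254.255.255 scope global ha-606f2f23-5f
    inet 169.254.0.1/24 scope global ha-606f2f23-5f
50: qg-e28d50e4-ba: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    inet 172.16.10.133/24 scope global qg-e28d50e4-ba
    inet 172.16.10.134/32 scope global qg-e28d50e4-ba
"""

if __name__ == "__main__":
    for cidr in duplicated_ipv4(NODE_1, NODE_2):
        print(cidr)
```

Any address this prints is configured on both nodes at once. The per-node HA addresses (169.254.192.1/18 and 169.254.192.2/18) differ between the nodes, so they are not reported.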

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1833653


