
yahoo-eng-team team mailing list archive

[Bug 1959697] [NEW] VM gets wrong ipv6 address from dhcp-agent after ipv6 address on port was changed

 

Public bug reported:

I ran into a problem where the neutron dhcp-agent keeps confirming a port's old IPv6 address after the address has been changed.
Simple steps to reproduce:
- create a port with an IPv6 address in a dhcpv6-stateful subnet
- create a VM with cloud-init on that port
- change the port's IPv6 address
- reboot the VM

Here are my commands:

$ openstack subnet create --subnet-range 2001:db8:123::/64 --ip-version 6 --ipv6-address-mode dhcpv6-stateful --network public subv6
$ openstack subnet list --network public
+--------------------------------------+-------+--------------------------------------+-------------------+
| ID                                   | Name  | Network                              | Subnet            |
+--------------------------------------+-------+--------------------------------------+-------------------+
| 6d9a7fb5-5c1b-4759-b32b-5720b5cedbf4 | subv4 | f1f3d967-26db-41b3-b6f6-1d5356e33a84 | 10.136.16.0/22    |
| 76db898c-6a7a-4301-9253-23241cafaa83 | subv6 | f1f3d967-26db-41b3-b6f6-1d5356e33a84 | 2001:db8:123::/64 |
+--------------------------------------+-------+--------------------------------------+-------------------+
$

$ openstack port create my-port  --network public --fixed-ip ip-address=10.136.17.163 --fixed-ip ip-address=2001:db8:123::111
$ openstack server create test --flavor m1.small --port my-port --image CentOS-7-x86_64-GenericCloud-2009.qcow2 --key-name key --use-config-drive

Check the IPv6 address inside the VM (it's correct):

[centos@test ~]$ ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:2e:66:ac brd ff:ff:ff:ff:ff:ff
    inet 10.136.17.163/22 brd 10.136.19.255 scope global dynamic eth0
       valid_lft 86371sec preferred_lft 86371sec
    inet6 2001:db8:123::111/128 scope global dynamic
       valid_lft 7473sec preferred_lft 7173sec
    inet6 fe80::f816:3eff:fe2e:66ac/64 scope link
       valid_lft forever preferred_lft forever
[centos@test ~]$

Change the IPv6 address and reboot the VM:
$ openstack port set my-port --no-fixed-ip --fixed-ip ip-address=10.136.17.163 --fixed-ip ip-address=2001:db8:123::222
$ openstack server reboot test

[centos@test ~]$ ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:2e:66:ac brd ff:ff:ff:ff:ff:ff
    inet 10.136.17.163/22 brd 10.136.19.255 scope global dynamic eth0
       valid_lft 86382sec preferred_lft 86382sec
    inet6 2001:db8:123::111/128 scope global dynamic
       valid_lft 7482sec preferred_lft 7182sec
    inet6 fe80::f816:3eff:fe2e:66ac/64 scope link
       valid_lft forever preferred_lft forever
[centos@test ~]$

^^ As you can see, the VM still got the old IPv6 address, and all traffic
is actually blocked by the port-security feature. If I remove the lease
file and respawn dhclient, everything works:

[centos@test ~]$ ps axf | grep dhcl
  780 ?        Ss     0:00 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient--eth0.lease -pf /var/run/dhclient-eth0.pid -H test eth0
  868 ?        Ss     0:00 /sbin/dhclient -6 -1 -lf /var/lib/dhclient/dhclient6--eth0.lease -pf /var/run/dhclient6-eth0.pid eth0 -H test
 1371 pts/0    S+     0:00              \_ grep --color=auto dhcl
[centos@test ~]$ sudo kill -9 868
[centos@test ~]$ sudo ip addr del 2001:db8:123::111/128 dev eth0
[centos@test ~]$ sudo rm -rf /var/lib/dhclient/dhclient6--eth0.lease
[centos@test ~]$ sudo /sbin/dhclient -6 -1 -lf /var/lib/dhclient/dhclient6--eth0.lease -pf /var/run/dhclient6-eth0.pid eth0 -H test
[centos@test ~]$ ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:2e:66:ac brd ff:ff:ff:ff:ff:ff
    inet 10.136.17.163/22 brd 10.136.19.255 scope global dynamic eth0
       valid_lft 86319sec preferred_lft 86319sec
    inet6 2001:db8:123::222/128 scope global dynamic
       valid_lft 7481sec preferred_lft 7181sec
    inet6 fe80::f816:3eff:fe2e:66ac/64 scope link
       valid_lft forever preferred_lft forever
[centos@test ~]$

I found some logic that removes DHCPv6 leases here:
https://opendev.org/openstack/neutron/src/commit/e7b70521d0e230143a80974e7e4795a2acafcc9b/neutron/agent/linux/dhcp.py#L600
but it does not seem to help when the client sends a DHCPCONFIRM request.
In the dnsmasq logs I see the following DHCPCONFIRM->DHCPREPLY exchange once the VM is back up after the reboot (see also https://datatracker.ietf.org/doc/html/rfc3315#page-50):

Feb  1 16:49:12 dnsmasq-dhcp[1360521]: DHCPREQUEST(tapc233cb5c-8f) 10.136.17.163 fa:16:3e:2e:66:ac
Feb  1 16:49:12 dnsmasq-dhcp[1360521]: DHCPACK(tapc233cb5c-8f) 10.136.17.163 fa:16:3e:2e:66:ac host-10-136-17-163
Feb  1 16:49:15 dnsmasq-dhcp[1360521]: DHCPCONFIRM(tapc233cb5c-8f) 00:01:00:01:29:8c:20:5e:fa:16:3e:2e:66:ac
Feb  1 16:49:15 dnsmasq-dhcp[1360521]: DHCPREPLY(tapc233cb5c-8f) 2001:db8:123::111 00:01:00:01:29:8c:20:5e:fa:16:3e:2e:66:ac host-2001-db8-123--222
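
In the DHCPREPLY line above, the address is still the old ::111 while the host name already corresponds to the new ::222. Here is a minimal sketch for spotting that kind of mismatch in the dnsmasq logs. It makes two assumptions that are mine and not taken from the dnsmasq or neutron documentation: the log line layout is inferred only from the sample above, and the host name is assumed to be "host-" followed by the address with "." and ":" replaced by "-". The names REPLY_RE, expected_hostname and check_reply are hypothetical helpers for illustration.

```python
import ipaddress
import re

# Assumed layout of a dnsmasq DHCPv6 reply line, inferred from the sample log:
#   DHCPREPLY(<interface>) <address> <client DUID> <host name>
REPLY_RE = re.compile(
    r"DHCPREPLY\((?P<iface>[^)]+)\)\s+(?P<addr>\S+)\s+"
    r"(?P<duid>[0-9a-f:]+)\s+(?P<host>\S+)"
)


def expected_hostname(address: str) -> str:
    """Host name for an address under the assumed naming convention:
    'host-' plus the address with '.' and ':' replaced by '-'."""
    return "host-" + address.replace(".", "-").replace(":", "-")


def check_reply(line: str):
    """Return (address, host name, consistent) for a DHCPREPLY line.

    Returns None if the line is not a DHCPREPLY line in the assumed layout.
    """
    match = REPLY_RE.search(line)
    if match is None:
        return None
    # Normalise the address so equivalent spellings compare equal.
    addr = str(ipaddress.ip_address(match["addr"]))
    host = match["host"]
    return addr, host, host == expected_hostname(addr)


if __name__ == "__main__":
    sample = (
        "Feb  1 16:49:15 dnsmasq-dhcp[1360521]: DHCPREPLY(tapc233cb5c-8f) "
        "2001:db8:123::111 00:01:00:01:29:8c:20:5e:fa:16:3e:2e:66:ac "
        "host-2001-db8-123--222"
    )
    addr, host, consistent = check_reply(sample)
    if not consistent:
        print(f"mismatch: replied with {addr} but host name is {host}")
```

For the reply logged above the check reports a mismatch, which matches what I see in the VM: dnsmasq confirms the old address although the host entry already points at the new one.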

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1959697

Title:
  VM gets wrong ipv6 address from dhcp-agent after ipv6 address on port
  was changed

Status in neutron:
  New
