openstack team mailing list archive

Fwd: [Question #221283]: VM instance is not able to get IP address

 

Hi All,

I posted the question below on Launchpad (quantum) but did not get any
response from the team; maybe it is not as active as the openstack mailing
list.

I am facing the issue detailed in this question [
https://answers.launchpad.net/quantum/+question/221283]. I did some
analysis and shared it on the same question; you can also find it in the
mail below.

I am looking for suggestions from the OpenStack networking experts on how to
debug this issue further. My deployment is stuck because of it, and I
would really appreciate your help.

Thanks
Anil

---------- Forwarded message ----------
From: Anil Vishnoi <question221283@xxxxxxxxxxxxxxxxxxxxx>
Date: Fri, Feb 8, 2013 at 3:41 AM
Subject: Re: [Question #221283]: VM instance is not able to get IP address
To: vishnoianil@xxxxxxxxx


Your question #221283 on quantum changed:
https://answers.launchpad.net/quantum/+question/221283

You gave more information on the question:
Hi Team,

I debugged this issue further and figured out a workaround. I hesitate
to call it a "workaround", though; it is more of a hack.

As I mentioned in the description above, because of the actions=drop rule,
DHCP packets were being dropped before reaching the DHCP agent,
so it could not respond with a DHCPOFFER.

First I resolved this error [Feb 07
17:32:40|00001|netdev_linux|WARN|/sys/class/net/tap9fdb5c15-26/carrier:
open failed: ] with the following steps:
1. Disable the network namespace for the l3_agent and the dhcp agent by setting
use_namespaces=false in the respective configuration files.
2. Delete the port (tap9fdb5c15-26) from the br-int bridge and re-add it.
[Quick instructions:
root@management:~# ovs-vsctl del-port tap9fdb5c15-26
root@management:~# ovs-vsctl add-port br-int tap9fdb5c15-26
root@management:~# ovs-vsctl set port tap9fdb5c15-26 tag=1
root@management:~# ovs-vsctl set Interface tap9fdb5c15-26 type=internal
]
3. Restart both services; they will create the tap devices outside the
network namespace.
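
The quick instructions in step 2 can be wrapped in a small dry-run helper.
This is only a sketch: the function name recreate_port_cmds is hypothetical,
and the bridge, port, and tag values are the ones from the quick instructions
above. It prints the commands instead of running them, so they can be
reviewed before being applied as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the ovs-vsctl commands that
# remove a port and re-add it to a bridge as a tagged internal port.
# Usage: recreate_port_cmds BRIDGE PORT TAG
recreate_port_cmds() {
    bridge=$1 port=$2 tag=$3
    echo "ovs-vsctl del-port $port"
    echo "ovs-vsctl add-port $bridge $port"
    echo "ovs-vsctl set port $port tag=$tag"
    echo "ovs-vsctl set Interface $port type=internal"
}

# Values taken from the quick instructions above.
recreate_port_cmds br-int tap9fdb5c15-26 1
```

Piping the output to sh would apply the commands; keeping that step explicit
makes it harder to delete the wrong port by accident.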

If the network namespace is enabled, ifconfig will not show this tap device
in its output, but the command 'ip netns exec dhcpnsXXXX ip -d
link' will show it.

In my setup I followed the steps above, but if you don't want to
disable namespaces, you can stop the dhcp agent, delete the port from br-int,
and restart the service. That will possibly resolve this error (it did
work in my setup).

So in my setup, namespaces are disabled. The following is the output of
ovs-dpctl:

root@management:~# ovs-dpctl show
system@br-eth1:
        lookups: hit:151651 missed:37759 lost:0
        flows: 3
        port 0: br-eth1 (internal)
        port 1: eth1
        port 3: phy-br-eth1
system@br-int:
        lookups: hit:1183 missed:23283 lost:0
        flows: 1
        port 0: br-int (internal)
        port 6: tap9fdb5c15-26 (internal)
        port 7: int-br-eth1
system@br-ex:
        lookups: hit:96895 missed:67156 lost:0
        flows: 16
        port 0: br-ex (internal)
        port 1: eth0

A DHCP request is a broadcast packet, and it takes the following path to
reach br-int: port 1: eth1 (br-eth1) --> port 7: int-br-eth1 (br-int).
The packet gets dropped there because of the following rule
installed on the br-int bridge:

 cookie=0x0, duration=11422.615s, table=0, n_packets=16711,
n_bytes=1178562, priority=2,in_port=7 actions=drop
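
A rule like this can be picked out of a larger flow table by filtering the
flow dump for drop actions. The sketch below assumes the dump comes from
'ovs-ofctl dump-flows <bridge>' with one flow per line (the rule above is
wrapped by the mail client); the function name drop_rules is hypothetical.
It prints the input port and packet counter of every drop rule, so a
growing counter shows which rule is eating the traffic.

```shell
#!/bin/sh
# Hypothetical helper: read 'ovs-ofctl dump-flows' output on stdin and print
# "<in_port> <n_packets>" for each flow whose action is drop.
# Drop rules that do not match on in_port are not printed.
drop_rules() {
    sed -n '/actions=drop/s/.*n_packets=\([0-9]*\),.*in_port=\([0-9]*\).*/\2 \1/p'
}

# Example (needs root and a running Open vSwitch, so it is left commented out):
#   ovs-ofctl dump-flows br-int | drop_rules
```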

Ideally the packet should be forwarded to port 6: tap9fdb5c15-26 (internal)
(br-int) so that it can reach the DHCP agent. So I changed the above flow to
the following:

cookie=0x0, duration=3169.501s, table=0, n_packets=2562, n_bytes=228241,
priority=2,in_port=7 actions=output:6

I also installed the following rule to route the DHCPOFFER packet back:

cookie=0x0, duration=4536.551s, table=0, n_packets=233, n_bytes=28896,
priority=2,in_port=6 actions=output:7
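
The mail does not say which command was used to change the flows. One way
to install such a pair of rules, shown here only as a sketch, is
'ovs-ofctl add-flow': adding a flow with the same match and priority as an
existing one replaces that flow's actions. The helper name patch_ports_cmds
is hypothetical, and it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the ovs-ofctl commands that forward traffic
# in both directions between two ports of one bridge.
# Usage: patch_ports_cmds BRIDGE PORT_A PORT_B [PRIORITY]
patch_ports_cmds() {
    bridge=$1 a=$2 b=$3 prio=${4:-2}
    echo "ovs-ofctl add-flow $bridge priority=$prio,in_port=$a,actions=output:$b"
    echo "ovs-ofctl add-flow $bridge priority=$prio,in_port=$b,actions=output:$a"
}

# Port numbers from the controller's br-int: 7 = int-br-eth1, 6 = tap9fdb5c15-26.
patch_ports_cmds br-int 7 6
```

Port numbers differ per host and can change when a port is re-created, so
they should be read from the current 'ovs-dpctl show' or 'ovs-ofctl show'
output rather than copied from here.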

After installing these two flow rules, the DHCP agent received the request
and responded with a DHCPOFFER.

root@management:~# tail -f /var/log/syslog
Feb  8 03:26:16 management dnsmasq-dhcp[25811]: DHCPREQUEST(tap9fdb5c15-26)
192.168.0.3 fa:16:3e:93:74:73
Feb  8 03:26:16 management dnsmasq-dhcp[25811]: DHCPACK(tap9fdb5c15-26)
192.168.0.3 fa:16:3e:93:74:73 192-168-0-3
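
A full DHCP handshake consists of DHCPDISCOVER, DHCPOFFER, DHCPREQUEST, and
DHCPACK, so it can help to count which message types dnsmasq actually
logged. The following is a minimal sketch, assuming syslog lines in the
format shown above (one message per line, not wrapped); the function name
dhcp_summary is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print each dnsmasq
# DHCP message type followed by how many times it was logged.
dhcp_summary() {
    sed -n 's/.*dnsmasq-dhcp\[[0-9]*]: \(DHCP[A-Z]*\)(.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Example (left commented out because it reads the local syslog):
#   dhcp_summary < /var/log/syslog
```

A message type that never shows up points to the direction in which
packets are still being dropped.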

The DHCP response packet takes the following path: port 6: tap9fdb5c15-26
(internal) (br-int) ---> port 7: int-br-eth1 (br-int) ---> port 3:
phy-br-eth1 (br-eth1) ---> port 1: eth1 (br-eth1), and that way it
leaves the controller node. But on the br-eth1 bridge another rule was
installed that was dropping the response:

cookie=0x0, duration=2669.22s, table=0, n_packets=173, n_bytes=18144,
priority=2,in_port=3 actions=drop

and I changed this flow to

cookie=0x0, duration=2669.22s, table=0, n_packets=173, n_bytes=18144,
priority=2,in_port=3 actions=output:1

so now the packet can leave the controller machine. Now for the compute
node side of the story.

The following is the ovs-dpctl output of my compute node:

system@br-eth1:
        lookups: hit:404442 missed:110048 lost:0
        flows: 1
        port 0: br-eth1 (internal)
        port 1: eth1
        port 3: phy-br-eth1
system@br-int:
        lookups: hit:1884 missed:71022 lost:0
        flows: 0
        port 0: br-int (internal)
        port 3: int-br-eth1
        port 4: qvo819abf08-ca
        port 6: tap718d359b-d1   <<VM Connected to this tap device

The response packet should take the following path: port 1: eth1 (br-eth1) --->
port 3: int-br-eth1 (br-int) ---> port 6: tap718d359b-d1 (br-int), but
on the br-int bridge the following flow rule was installed, which was dropping
the response packet:

cookie=0x0, duration=1671.356s, table=0, n_packets=1068, n_bytes=99127,
priority=2,in_port=3 actions=drop

so I changed this flow to

cookie=0x0, duration=1671.356s, table=0, n_packets=1068, n_bytes=99127,
priority=2,in_port=3 actions=output:6

and that way it forwarded the packet to my VM, and I can now see that
IP address 192.168.0.3 is assigned to my machine. Ideally this is the
job of the quantum plug-in, but I am not sure why it is dropping all the
packets in both directions.
The above exercise establishes that the dhcp agent is working fine here;
it is the network routing that is causing the issue, and specifically the
openvswitch plug-in, as far as I understand.

I am seeking suggestions from the networking experts on the list: what
could possibly cause this issue? Does the openvswitch plug-in depend
on the linux bridge or brcompat module to work properly? I ask because on the
controller node neither the bridge module nor the brcompat module is loaded.
Obviously this hack won't work for all other cases, so we need to
resolve the issue at the plug-in level. Please suggest!
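
For anyone comparing nodes, whether those kernel modules are loaded can be
checked from the lsmod output. This is only a sketch: the function name
module_loaded is hypothetical, and the exact name of the brcompat module
may differ between Open vSwitch versions, so the names are passed in as
arguments rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style output on stdin and succeed (exit 0)
# only if a module with exactly the given name is listed in the first column.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Example (left commented out because it depends on the local kernel):
#   for mod in bridge brcompat; do
#       if lsmod | module_loaded "$mod"; then
#           echo "$mod: loaded"
#       else
#           echo "$mod: not loaded"
#       fi
#   done
```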

Thanks
Anil

--
You received this question notification because you asked the question.



-- 
Thanks & Regards
--Anil Kumar Vishnoi
