
openstack team mailing list archive

Re: [HyperV][Quantum] Quantum dhcp agent not working for Hyper-V

 

Hi Jon,

Thanks for your help! Both the kernel and the iproute packages are updated; RDO does a great job with this. Besides the 2.6.32 + netns kernel provided by RDO, I also tested with a 3.9.8 kernel, with the same results. I'd add to your troubleshooting steps a very simple test to check if netns is enabled in the kernel: checking whether the "/proc/self/ns" path exists.
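For example, a minimal way to script that check (the prompt/hostname below is just illustrative; the entries under /proc/self/ns vary by kernel, so testing for the directory itself is enough):

[root@rdo-node ~]# test -d /proc/self/ns && echo "netns support present" || echo "netns support missing"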

Back to the original issue, there are no errors on the Quantum side.


Thanks,

Alessandro




On Jul 7, 2013, at 02:36, Hathaway.Jon <Jon.Hathaway@xxxxxxx> wrote:

Hi Alessandro

I know this is probably something you have already tested for the RDO installation, but have you upgraded the CentOS kernel and the iproute package? Both are missing the netns support required for Quantum. Ubuntu fixed this issue back in 10.04, but for whatever reason the current production kernel for CentOS still hasn't.

We had to update both the kernel and the iproute package. If you check the log files for the l3-agent, and especially the dhcp-agent logs, you may find errors like "command not recognised", as the iproute shipped with CentOS 6.4 doesn't support the netns extensions.

https://www.redhat.com/archives/rdo-list/2013-May/msg00015.html

My workaround was:

If installing on an EPEL6 distribution such as CentOS 6.4, there is a bug in the kernel release which disables the network namespace (netns) support that Quantum needs: it is required to create overlapping networks, to run the DHCP agent that assigns IP addresses on boot, and to set up the l3-agent that is responsible for forwarding requests from the instances to the API to retrieve instance metadata.

A quick check of /var/log/quantum/dhcp-agent.log on the node configured with Quantum will show something like:

RuntimeError:
Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', '-o', 'netns', 'list']
Exit code: 255
Stdout: ''
Stderr: 'Object "netns" is unknown, try "ip help".\n'

If you try to run 'ip netns' from the command line and it fails, you will need to update the kernel and possibly the iproute2 package:

[root@oscontroller ~]# ip netns
Object "netns" is unknown, try "ip help".

Netns is available in the iproute2 package, but it requires additional support from the kernel. Red Hat has released a new kernel, for testing only, as kernel-2.6.32-358.6.2.openstack.el6.x86_64, whereas the installed version that comes with CentOS 6.4 is kernel-2.6.32-358.el6.x86_64.

Adding the new kernel and iproute2 packages requires installing the kernel and kernel-firmware packages from the Grizzly repository:

yum install http://repos.fedorapeople.org/repos/openstack/openstack-grizzly/epel-6/kernel-firmware-2.6.32-358.6.2.openstack.el6.noarch.rpm

yum install http://repos.fedorapeople.org/repos/openstack/openstack-grizzly/epel-6/kernel-2.6.32-358.6.2.openstack.el6.x86_64.rpm

yum install http://repos.fedorapeople.org/repos/openstack/openstack-grizzly/epel-6/iproute-2.6.32-23.el6_4.netns.1.x86_64.rpm

Check in /etc/grub.conf that the new kernel is referenced and then restart the node running Quantum.
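A quick way to confirm, for example (assuming the stock grub layout on CentOS 6.4, where the OpenStack test kernel's version string contains "openstack"):

[root@oscontroller ~]# grep openstack /etc/grub.conf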

After the reboot, try running 'ip netns' again; it should now run without an error.
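As an additional sanity check, you can create and remove a scratch namespace (the name "nstest" here is just an example):

[root@oscontroller ~]# ip netns add nstest
[root@oscontroller ~]# ip netns list
nstest
[root@oscontroller ~]# ip netns delete nstest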

If you added an instance before upgrading the packages, you will need to remove the networks, routers and ports and re-add them before continuing. However, it is likely that you will end up with stale ports on the Quantum server, as shown below:

[root@oscontroller quantum]# ovs-vsctl show
e4b86f82-2d16-49b1-9077-93abf2b32400
    Bridge br-ex
        Port "qg-3d8f69e7-5d"
            Interface "qg-3d8f69e7-5d"
                type: internal
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        Port "qr-c7145535-d1"
            tag: 1
            Interface "qr-c7145535-d1"
                type: internal
        Port "tapc4fb5d73-3e"
            tag: 1
            Interface "tapc4fb5d73-3e"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "tape76c5e3c-1b"
            tag: 2
            Interface "tape76c5e3c-1b"
                type: internal
    ovs_version: "1.10.0"

These ports/interfaces will need to be deleted before networking will work correctly.
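For example, using port names from the output above (substitute whichever ports are actually stale in your environment):

[root@oscontroller quantum]# ovs-vsctl del-port br-int tapc4fb5d73-3e
[root@oscontroller quantum]# ovs-vsctl del-port br-int qr-c7145535-d1
[root@oscontroller quantum]# ovs-vsctl del-port br-ex qg-3d8f69e7-5d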

Just a thought.

Jon

From: Openstack [mailto:openstack-bounces+jon.hathaway=igt.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Alessandro Pilotti
Sent: 06 July 2013 16:23
To: OpenStack
Subject: Re: [Openstack] [HyperV][Quantum] Quantum dhcp agent not working for Hyper-V

Hi Bruno,

I just hit the same (or a very similar) issue doing a multinode deployment with RDO on CentOS 6.4 (OVS 1.10) while we had no problem until now using Ubuntu 12.04 (OVS 1.4).
Can you please provide some more details about the Linux OS you are using and your multinode configuration?

I tested it with flat and VLAN networks; so far it doesn't look like a Hyper-V related issue.


Thanks,

Alessandro


On Jun 7, 2013, at 23:51, Bruno Oliveira ~lychinus <brunnop.oliveira@xxxxxxxxx> wrote:


"(...)Do you have your vSwitch properly configured on your hyper-v host?(...)"


I can't say for sure, Peter, but I think so...

From the troubleshooting we did (and are still doing), I can tell that
regardless of the network model we're using (flat or VLAN),
the instance provisioned on Hyper-V can't, for some reason,
reach the quantum-l3-agent "by default".
(I say "by default" because we only managed to get it working after long
and tedious troubleshooting, and we're still not sure that's how it
should be done.)

Since it's not something quick to explain, I'll present the scenario.
(I'm not sure whether it might be a candidate for a fix in quantum-l3-agent,
so quantum devs might be interested too.)


Here's how the network interfaces turn out on our network controller:

==============================
External bridge network
==============================

Bridge "br-eth1"
       Port "br-eth1"
           Interface "br-eth1"
               type: internal
       Port "eth1.11"
           Interface "eth1.11"
       Port "phy-br-eth1"
           Interface "phy-br-eth1"

==============================
Internal network
==============================

  Bridge br-int
       Port "int-br-eth1"
           Interface "int-br-eth1"
       Port br-int
           Interface br-int
               type: internal
       Port "tapb610a695-46"
           tag: 1
           Interface "tapb610a695-46"
               type: internal
       Port "qr-ef10bef4-fa"
           tag: 1
           Interface "qr-ef10bef4-fa"
               type: internal

==============================

There's another iface named "br-ex" that we're using for floating_ips,
but it has nothing to do with what we're doing right now, so I'm skipping it...


************ So, for the hands-on ****************

I know it may be a little bit hard to understand, but I'll do my best
trying to explain:

1) The running instance on Hyper-V, which is linked to the Hyper-V vSwitch,
is actually communicating with the bridge "br-eth1" (which is on the network controller).

NOTE: That's where the DHCP REQUEST (from the instance) lands


2) The interface MAC address of that running instance on Hyper-V is
fa:16:3e:95:95:e4 (we're going to use it in later steps).
Since DHCP is not fully working yet, we had to manually set an IP for
that instance: "10.5.5.3"


3) From that instance's interface, the DHCP broadcast should be forwarded:
   FROM interface "eth1.12" TO "phy-br-eth1",
   and FROM interface "phy-br-eth1" TO the bridge "br-int"   *** THIS IS WHERE THE PACKETS ARE DROPPED ***

Note the "actions:drop" in the dump below:
---------------------------------------------------------------------------------------------
root@osnetwork:~# ovs-dpctl dump-flows br-int  |grep 10.5.5.3

in_port(4),eth(src=fa:16:3e:f0:ac:8e,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.5.5.3,tip=10.5.5.1,op=1,sha=fa:16:3e:f0:ac:8e,tha=00:00:00:00:00:00), packets:20, bytes:1120, used:0.412s, actions:drop
---------------------------------------------------------------------------------------------

4) Finally, when the packet reaches the bridge "br-int", the DHCP_REQUEST
should be forwarded to the dhcp_interface, that is tapb610a695-46
*** WHICH IS NOT HAPPENING EITHER ***


5) How to fix :: bridge br-eth1

-------------------------------------------
5.1. Getting to know the ifaces of 'br-eth1'
-------------------------------------------
root@osnetwork:~# ovs-ofctl show br-eth1

OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:0000e0db554e164b
n_tables:255, n_buffers:256 features: capabilities:0xc7, actions:0xfff

1(eth1.11): addr:e0:db:55:4e:16:4b
    config:     0
    state:      0
    current:    10GB-FD AUTO_NEG
    advertised: 1GB-FD 10GB-FD FIBER AUTO_NEG
    supported:  1GB-FD 10GB-FD FIBER AUTO_NEG

3(phy-br-eth1): addr:26:9b:97:93:b9:70
    config:     0
    state:      0
    current:    10GB-FD COPPER

LOCAL(br-eth1): addr:e0:db:55:4e:16:4b
    config:     0
    state:      0

OFPT_GET_CONFIG_REPLY (xid=0x3): frags=normal miss_send_len=0


-------------------------------------------
5.2. Adding flow rules to enable passing (instead of dropping)
-------------------------------------------

# the source mac_address (dl_src) is from the interface of the
# running instance on Hyper-V. This fixes the DROP (only)

root@osnetwork:~# ovs-ofctl add-flow br-eth1 priority=10,in_port=3,dl_src=fa:16:3e:95:95:e4,actions=normal
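You can check that the rule was installed with, for example:

root@osnetwork:~# ovs-ofctl dump-flows br-eth1 | grep fa:16:3e:95:95:e4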



6) How to fix :: bridge br-int

-------------------------------------------
6.1. Getting to know the ifaces of 'br-int'
-------------------------------------------

root@osnetwork:~# ovs-ofctl show br-int

OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:000092976d64274d

n_tables:255, n_buffers:256  features: capabilities:0xc7, actions:0xfff

1(tapb610a695-46): addr:19:01:00:00:00:00
    config:     PORT_DOWN
    state:      LINK_DOWN

4(int-br-eth1): addr:5a:56:e1:53:e9:90
    config:     0
    state:      0
    current:    10GB-FD COPPER

5(qr-ef10bef4-fa): addr:19:01:00:00:00:00
    config:     PORT_DOWN
    state:      LINK_DOWN

LOCAL(br-int): addr:92:97:6d:64:27:4d
    config:     0
    state:      0

-------------------------------------------
6.2. Adding flow rules to enable FORWARD
    FROM: interface int-br-eth1  (4)
    TO:   interface tapb610a695-46 (1) -> dhcp_interface

    and the REVERSE_FORWARD: from (1) to (4)
-------------------------------------------
root@osnetwork:~# ovs-ofctl add-flow br-int priority=12,in_port=4,dl_src=fa:16:3e:f0:ac:8e,action=1,normal
root@osnetwork:~# ovs-ofctl add-flow br-int priority=12,in_port=1,dl_dst=fa:16:3e:f0:ac:8e,action=4,normal
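To confirm the new rules took effect, you can dump the flows again and re-check the datapath for drops, for example:

root@osnetwork:~# ovs-ofctl dump-flows br-int | grep fa:16:3e:f0:ac:8e
root@osnetwork:~# ovs-dpctl dump-flows br-int | grep 10.5.5.3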


==================
Conclusion
==================

That way, and only *that way*, the Hyper-V instance is able to
exchange ARP with the DHCP agent (on the network controller).

Even though it is functional, we're not sure if that's how it HAS to
be done. May I have your thoughts on it?

Should we really have to create those rules/actions in Open vSwitch to
make the Hyper-V instance reach the DHCP agent? It seems like either
a bug or something weird in my configuration...

May I have your opinions on it?


We'd greatly appreciate your feedback. Thank you very much.



