openstack team mailing list archive

Re: Networking guru needed: problem with FlatManager ARP when guest and bridge MACs the same

 

On Wed, Mar 14, 2012 at 10:50:28AM -0700, Justin Santa Barbara wrote:
> We recently changed the MAC address assigned to guests so that they started
> with 0xfe, in the hope of avoiding (theoretical?) issues with MAC addresses
> changing on the bridge device as machines are shut down (because supposedly
> the bridge grabs the lowest MAC address numerically):
> https://bugs.launchpad.net/nova/+bug/921838
> 
> However, it looks like we bumped into similar behavior in libvirt: it
> also sets the first byte to 0xfe for the host network device, in the hope
> of avoiding the same bug.  Thus, with the patch, the host vnetX and the
> guest eth0 have the same MAC address.  I think this breaks FlatManager, but
> I don't know why, and I really don't know why it wouldn't break other
> modes, and I'm hoping a network guru can explain/confirm.

I don't really know why either - all I know is that the host side must
be different from the guest side.

> When they have the same MAC address, ARP resolution isn't working: the
> guest issues an ARP request for the gateway, on the host I can see the ARP
> request and response, but the guest doesn't appear to see/accept the ARP
> response and so it just keeps retrying.
> 
> This message appears in dmesg:
> [ 2199.836114] br100: received packet on vnet1 with own address as source address
> 
> I'm guessing that 'address' means 'MAC address', and this is why ARP is
> failing; it sounds like the bridge might be dropping the packet.
> 
> Changing to 0x02 or 0xfc does fix it (although my arithmetic was wrong,
> and vishy points out we should use 0xfa instead of 0xfc).
> 
> Networking guru questions:
> 
>    - Does this explanation make sense?
>    - Why didn't other networking modes break?
>    - Should we simply revert the change and go back to 0x02?
>    - Should we switch to 0xfa to try to avoid the bridge interface
>    problems?  Or does it simply not matter if libvirt is changing the MAC for
>    us?

Hmm, I guess I misread the original patch vish submitted. I thought it
was only changing the MAC address of host TAP devices that Nova created
itself, and not the guest MAC address sent in the XML.

The MAC address sent in the libvirt XML (which is the guest-visible MAC)
should not be using 0xfX at all. Ideally it should just use the standard
MAC prefix for the hypervisor in question, e.g. 00:16:3E for Xen and
52:54:00 for LXC/KVM.
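
As a rough sketch of the prefix idea, in Python since that is what Nova
is written in. The function name and the lookup table are hypothetical,
made up for illustration, and are not existing Nova or libvirt code;
only the two prefixes come from the paragraph above.

```python
import random

# Prefixes named above. The table and its keys are illustrative only.
HYPERVISOR_MAC_PREFIX = {
    "xen": (0x00, 0x16, 0x3E),
    "kvm": (0x52, 0x54, 0x00),
    "lxc": (0x52, 0x54, 0x00),
}


def generate_guest_mac(hypervisor, rng=random):
    """Return a guest MAC: the hypervisor prefix plus three random octets."""
    prefix = HYPERVISOR_MAC_PREFIX[hypervisor]
    suffix = tuple(rng.randint(0x00, 0xFF) for _ in range(3))
    return ":".join("%02x" % octet for octet in prefix + suffix)


print(generate_guest_mac("kvm"))  # a random address starting with 52:54:00
```

The point is only that the guest-visible address never starts with 0xfX,
so it cannot collide with whatever is done to the host side.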

If libvirt is creating the TAP device itself (e.g. <interface> with
type=bridge|direct), then Nova should not do anything special with
the MAC.

If Nova is pre-creating a TAP device (e.g. for use with <interface>
type=ethernet), then Nova should set the top byte to 0xfe (because
libvirt won't be doing so with pre-created TAP devices).
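
For the pre-created TAP case, a minimal sketch of the host-side rewrite
might look like the following. Both helper names are hypothetical, not
taken from Nova or libvirt, and the numeric comparison just illustrates
the "bridge grabs the lowest MAC" behaviour described in the bug report,
which I have not verified here.

```python
def host_side_mac(guest_mac):
    """Derive the host TAP MAC by forcing the top byte to 0xfe."""
    octets = guest_mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets: %r" % guest_mac)
    octets[0] = "fe"
    return ":".join(octets)


def mac_as_int(mac):
    """Numeric value of a MAC, for comparing which address is lower."""
    return int(mac.replace(":", ""), 16)


guest = "52:54:00:12:34:56"
host = host_side_mac(guest)
print(host)                                   # fe:54:00:12:34:56
print(mac_as_int(host) > mac_as_int(guest))   # True
```

Note that if the guest MAC already starts with fe, this returns the same
address for both sides, which is exactly the situation in this thread.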

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

