Message #16102
Re: VM can't ping self floating IP after a snapshot is taken
This bug has been filed here: https://bugs.launchpad.net/nova/+bug/1040537
2012/8/24 Vishvananda Ishaya <vishvananda@xxxxxxxxx>:
> +1 to this. Evan, can you report a bug (if one hasn't been reported yet) and
> propose the fix? Or else I can find someone else to propose it.
>
> Vish
>
> On Aug 23, 2012, at 1:38 PM, Evan Callicoat <diopter@xxxxxxxxx> wrote:
>
> Hello all!
>
> I'm the original author of the hairpin patch, and things have changed a
> little bit in Essex and Folsom from the original Diablo target. I believe I
> can shed some light on what should be done here to solve the issue in either
> case.
>
> ---
> For Essex (stable/essex), in nova/virt/libvirt/connection.py:
> ---
>
> Currently _enable_hairpin() is only being called from spawn(). However,
> spawn() is not the only place that vifs (veth#) get added to a bridge (which
> is when we need to enable hairpin_mode on them). The more relevant function
> is _create_new_domain(), which is called from spawn() and other places.
> Without changing the information that gets passed to _create_new_domain()
> (which is just 'xml' from to_xml()), we can easily rewrite the first 2 lines
> in _enable_hairpin(), as follows:
>
> def _enable_hairpin(self, xml):
>     interfaces = self.get_interfaces(xml['name'])
>
> Then, we can move the self._enable_hairpin(instance) call from spawn() up
> into _create_new_domain(), and pass it xml as follows:
>
> [...]
> self._enable_hairpin(xml)
> return domain
>
> This will run the hairpin code every time a domain gets created, which is
> also when the domain's vifs get inserted into the bridge with the default
> of hairpin_mode=0.
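
A minimal sketch of the idea above, using a toy class rather than nova's real driver: if hairpin is enabled inside the one function that creates the domain, every caller picks it up automatically. The class name FakeLibvirtDriver, the resume_after_snapshot() method, and the dict-shaped xml argument are assumptions made for illustration; they are not taken from the nova source.

```python
class FakeLibvirtDriver:
    """Toy stand-in for the libvirt driver (hypothetical, not nova's class)."""

    def __init__(self):
        # Records the domain name each time hairpin would have been enabled.
        self.hairpin_calls = []

    def _enable_hairpin(self, xml):
        # The real code would look up the domain's interfaces and write 1 to
        # each interface's hairpin_mode; this sketch only records the call.
        self.hairpin_calls.append(xml["name"])

    def _create_domain(self, xml):
        domain = {"name": xml["name"], "running": True}
        # Single choke point: every path that (re)creates the domain
        # re-enables hairpin on the freshly inserted vifs.
        self._enable_hairpin(xml)
        return domain

    # Hypothetical callers that all funnel through _create_domain().
    def spawn(self, xml):
        return self._create_domain(xml)

    def hard_reboot(self, xml):
        return self._create_domain(xml)

    def resume_after_snapshot(self, xml):
        return self._create_domain(xml)


driver = FakeLibvirtDriver()
xml = {"name": "instance-00000001"}
for action in (driver.spawn, driver.hard_reboot, driver.resume_after_snapshot):
    action(xml)
print(driver.hairpin_calls)
```

Because the call sits in _create_domain(), none of the callers has to remember it, which is the property the proposed change relies on.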
>
> ---
> For Folsom (trunk), in nova/virt/libvirt/driver.py:
> ---
>
> There've been a lot more changes made here, but the same strategy as above
> should work. Here, _create_new_domain() has been split into _create_domain()
> and _create_domain_and_network(), and _enable_hairpin() was moved from
> spawn() to _create_domain_and_network(), which seems like it'd be the right
> thing to do, but doesn't quite cover all of the cases of vif reinsertion,
> since _create_domain() is the only function which actually creates the
> domain (_create_domain_and_network() just calls it after doing some
> pre-work). The solution here is likewise fairly simple; make the same 2
> changes to _enable_hairpin():
>
> def _enable_hairpin(self, xml):
>     interfaces = self.get_interfaces(xml['name'])
>
> And move it from _create_domain_and_network() to _create_domain(), like
> before:
>
> [...]
> self._enable_hairpin(xml)
> return domain
>
> I haven't yet tested this on my Essex clusters and I don't have a Folsom
> cluster handy at present, but the change is simple and makes sense. Looking
> at to_xml() and _prepare_xml_info(), it appears that the 'xml' variable
> _create_[new_]domain() gets is just a python dictionary, and xml['name'] =
> instance['name'], exactly what _enable_hairpin() was using the 'instance'
> variable for previously.
>
> Let me know if this works, or doesn't work, or doesn't make sense, or if you
> need an address to send gifts, etc. Hope it's solved!
>
> -Evan
>
> On Thu, Aug 23, 2012 at 11:20 AM, Sam Su <susltd.su@xxxxxxxxx> wrote:
>>
>> Hi Oleg,
>>
>> Thank you for your investigation. Good luck!
>>
>> Can you let me know if you find out how to fix the bug?
>>
>> Thanks,
>> Sam
>>
>> On Wed, Aug 22, 2012 at 12:50 PM, Oleg Gelbukh <ogelbukh@xxxxxxxxxxxx>
>> wrote:
>>>
>>> Hello,
>>>
>>> Is it possible that, during snapshotting, libvirt just tears down the
>>> virtual interface at some point and then re-creates it with hairpin_mode
>>> disabled again?
>>> This bugfix [https://bugs.launchpad.net/nova/+bug/933640] implies that the
>>> fix works on spawn of the instance. This means that upon resume after a
>>> snapshot, hairpin is not restored. Maybe it would help to insert the
>>> _enable_hairpin() call in the snapshot procedure.
>>> We're currently investigating this issue in one of our environments and
>>> hope to come up with an answer by tomorrow.
>>>
>>> --
>>> Best regards,
>>> Oleg
>>>
>>> On Wed, Aug 22, 2012 at 11:29 PM, Sam Su <susltd.su@xxxxxxxxx> wrote:
>>>>
>>>> My friend found a workaround that lets the VM ping itself again when this
>>>> problem happens, but we have not found out why it happens:
>>>> echo 1 | sudo tee /sys/class/net/br1000/brif/<virtual-interface-name>/hairpin_mode
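
The manual workaround can also be applied to every port of a bridge in one pass. The following is a rough sketch, not nova code: the function name enable_hairpin and its sysfs_root parameter are hypothetical, and it assumes the standard Linux bridge sysfs layout in which each port appears under <bridge>/brif/<port>/hairpin_mode.

```python
import os


def enable_hairpin(bridge, sysfs_root="/sys/class/net"):
    """Write 1 to hairpin_mode for every port attached to `bridge`.

    Returns the list of port names that were updated. `sysfs_root` is a
    parameter only so the function can be tried against a scratch directory.
    """
    brif_dir = os.path.join(sysfs_root, bridge, "brif")
    enabled = []
    for port in sorted(os.listdir(brif_dir)):
        path = os.path.join(brif_dir, port, "hairpin_mode")
        # On a real host this write needs root privileges.
        with open(path, "w") as handle:
            handle.write("1")
        enabled.append(port)
    return enabled
```

On a real compute node this would be called as enable_hairpin("br1000") from a root process; any write error is raised rather than swallowed.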
>>>>
>>>> I filed a ticket to report this problem:
>>>> https://bugs.launchpad.net/nova/+bug/1040255
>>>>
>>>> Hopefully someone can find out why this happens and solve it.
>>>>
>>>> Thanks,
>>>> Sam
>>>>
>>>>
>>>> On Fri, Jul 20, 2012 at 3:50 PM, Gabriel Hurley
>>>> <Gabriel.Hurley@xxxxxxxxxx> wrote:
>>>>>
>>>>> I ran into some similar issues with the _enable_hairpin() call. The
>>>>> call is allowed to fail silently and (in my case) was failing. I couldn’t
>>>>> for the life of me figure out why, though, and since I’m really not a
>>>>> networking person I didn’t trace it along too far.
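
Since the call can fail without any visible error, one way to narrow things down is to read the flag back after the instance is up. This is a small diagnostic sketch under the same sysfs layout assumption as the workaround quoted further down (<bridge>/brif/<port>/hairpin_mode); the function name ports_without_hairpin and the sysfs_root parameter are hypothetical.

```python
import os


def ports_without_hairpin(bridge, sysfs_root="/sys/class/net"):
    """Return the ports on `bridge` whose hairpin_mode is not set to 1."""
    brif_dir = os.path.join(sysfs_root, bridge, "brif")
    missing = []
    for port in sorted(os.listdir(brif_dir)):
        path = os.path.join(brif_dir, port, "hairpin_mode")
        with open(path) as handle:
            # The kernel reports the flag as "0" or "1" followed by a newline.
            if handle.read().strip() != "1":
                missing.append(port)
    return missing
```

An empty result means every port on the bridge has hairpin enabled; any name returned points at an interface where the enable step did not take effect.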
>>>>>
>>>>>
>>>>>
>>>>> Just thought I’d share my similar pain.
>>>>>
>>>>>
>>>>>
>>>>> - Gabriel
>>>>>
>>>>>
>>>>>
>>>>> From: openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx
>>>>> [mailto:openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx] On
>>>>> Behalf Of Sam Su
>>>>> Sent: Thursday, July 19, 2012 11:50 AM
>>>>> To: Brian Haley
>>>>> Cc: openstack
>>>>> Subject: Re: [Openstack] VM can't ping self floating IP after a
>>>>> snapshot is taken
>>>>>
>>>>>
>>>>>
>>>>> Thank you for your support.
>>>>>
>>>>>
>>>>>
>>>>> I checked the file nova/virt/libvirt/connection.py; the statement
>>>>> self._enable_hairpin(instance) has already been added to the function
>>>>> _hard_reboot().
>>>>>
>>>>> It looks like there is some difference between taking a snapshot and
>>>>> rebooting an instance. I tried to figure out how to fix this bug but failed.
>>>>>
>>>>>
>>>>>
>>>>> It will be much appreciated if anyone can give some hints.
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Sam
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 19, 2012 at 8:37 AM, Brian Haley <brian.haley@xxxxxx>
>>>>> wrote:
>>>>>
>>>>> On 07/17/2012 05:56 PM, Sam Su wrote:
>>>>> > Hi,
>>>>> >
>>>>> > This always happens in the Essex release. After I take a snapshot of
>>>>> > my VM (I tried Ubuntu 12.04 and CentOS 5.8), the VM can't ping its own
>>>>> > floating IP; before I take a snapshot, though, the VM can ping its own
>>>>> > floating IP.
>>>>> >
>>>>> > This looks closely related to
>>>>> > https://bugs.launchpad.net/nova/+bug/933640, but is still a little
>>>>> > different. In 933640, it sounds like the VM can't ping its own floating
>>>>> > IP regardless of whether we take a snapshot or not.
>>>>> >
>>>>> > Any suggestions for an easy fix? And what is the root cause of the
>>>>> > problem?
>>>>>
>>>>> It might be because there's a missing _enable_hairpin() call in the
>>>>> reboot()
>>>>> function. Try something like this...
>>>>>
>>>>> nova/virt/libvirt/connection.py, _hard_reboot():
>>>>>
>>>>>   self._create_new_domain(xml)
>>>>> + self._enable_hairpin(instance)
>>>>>   self.firewall_driver.apply_instance_filter(instance,
>>>>>                                              network_info)
>>>>>
>>>>> At least that's what I remember doing myself recently when testing
>>>>> after a
>>>>> reboot, don't know about snapshot.
>>>>>
>>>>> Folsom has changed enough that something different would need to be
>>>>> done there.
>>>>>
>>>>> -Brian
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~openstack
>>>> Post to : openstack@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~openstack
>>>> More help : https://help.launchpad.net/ListHelp
>>>>
>>>
>>
>>