
openstack team mailing list archive

Re: VM can't ping self floating IP after a snapshot is taken

 

For everybody's information: the patch for the Essex release can be
downloaded here:
https://bugs.launchpad.net/nova/+bug/1040255/comments/7

On Tue, Aug 28, 2012 at 3:17 AM, Sam Su <susltd.su@xxxxxxxxx> wrote:

> Hi,
>
> Thank you so much for your help.
> I replaced the file /usr/share/pyshared/nova/virt/libvirt/connection.py
> with yours, but it doesn't seem to have worked for me.
> Do I need to do anything else?
>
> Thanks,
> Sam
>
> On Sat, Aug 25, 2012 at 7:03 PM, heut2008 <heut2008@xxxxxxxxx> wrote:
>
>> For stable/essex the patch is here:
>> https://review.openstack.org/#/c/11986/
>>
>> 2012/8/25 Sam Su <susltd.su@xxxxxxxxx>:
>> > That's great, thank you for your efforts. Can you make a backport for
>> > essex?
>> >
>> > Sent from my iPhone
>> >
>> > On Aug 24, 2012, at 7:15 PM, heut2008 <heut2008@xxxxxxxxx> wrote:
>> >
>> >> I have fixed it here  https://review.openstack.org/#/c/11925/
>> >>
>> >> 2012/8/25 Sam Su <susltd.su@xxxxxxxxx>:
>> >>> Hi,
>> >>>
>> >>> I also reported this bug:
>> >>> https://bugs.launchpad.net/nova/+bug/1040255
>> >>>
>> >>> If someone can combine your solutions and come up with a proper fix for
>> >>> this bug, that would be great.
>> >>>
>> >>> BRs,
>> >>> Sam
>> >>>
>> >>>
>> >>> On Thu, Aug 23, 2012 at 9:27 PM, heut2008 <heut2008@xxxxxxxxx> wrote:
>> >>>>
>> >>>> This bug has been filed here:
>> >>>> https://bugs.launchpad.net/nova/+bug/1040537
>> >>>>
>> >>>> 2012/8/24 Vishvananda Ishaya <vishvananda@xxxxxxxxx>:
>> >>>>> +1 to this. Evan, can you report a bug (if one hasn't been reported yet)
>> >>>>> and propose the fix? Or else I can find someone else to propose it.
>> >>>>>
>> >>>>> Vish
>> >>>>>
>> >>>>> On Aug 23, 2012, at 1:38 PM, Evan Callicoat <diopter@xxxxxxxxx> wrote:
>> >>>>>
>> >>>>> Hello all!
>> >>>>>
>> >>>>> I'm the original author of the hairpin patch, and things have changed a
>> >>>>> little bit in Essex and Folsom from the original Diablo target. I believe
>> >>>>> I can shed some light on what should be done here to solve the issue in
>> >>>>> either case.
>> >>>>>
>> >>>>> ---
>> >>>>> For Essex (stable/essex), in nova/virt/libvirt/connection.py:
>> >>>>> ---
>> >>>>>
>> >>>>> Currently _enable_hairpin() is only being called from spawn(). However,
>> >>>>> spawn() is not the only place that vifs (veth#) get added to a bridge
>> >>>>> (which is when we need to enable hairpin_mode on them). The more relevant
>> >>>>> function is _create_new_domain(), which is called from spawn() and other
>> >>>>> places. Without changing the information that gets passed to
>> >>>>> _create_new_domain() (which is just 'xml' from to_xml()), we can easily
>> >>>>> rewrite the first 2 lines in _enable_hairpin(), as follows:
>> >>>>>
>> >>>>> def _enable_hairpin(self, xml):
>> >>>>>    interfaces = self.get_interfaces(xml['name'])
>> >>>>>
>> >>>>> Then, we can move the self._enable_hairpin(instance) call from spawn() up
>> >>>>> into _create_new_domain(), and pass it xml as follows:
>> >>>>>
>> >>>>> [...]
>> >>>>> self._enable_hairpin(xml)
>> >>>>> return domain
>> >>>>>
>> >>>>> This will run the hairpin code every time a domain gets created, which is
>> >>>>> also when the domain's vif(s) gets inserted into the bridge with the
>> >>>>> default of hairpin_mode=0.
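
As a rough illustration of the Essex suggestion above, here is a minimal, self-contained sketch. It is not nova code: FakeConnection, the "vnet-" naming inside get_interfaces(), recreate_after_snapshot() and the hairpinned list are all made up for the example. It also takes at face value the message's statement that 'xml' is a dictionary with a 'name' key, which has not been checked against the actual nova source.

```python
class FakeConnection:
    """Toy stand-in for the libvirt connection class (hypothetical, not nova)."""

    def __init__(self):
        # Records every interface on which hairpin would have been enabled.
        self.hairpinned = []

    def get_interfaces(self, instance_name):
        # Hypothetical lookup: pretend each instance has exactly one vif.
        return ["vnet-%s" % instance_name]

    def _enable_hairpin(self, xml):
        # As proposed in the message: look the name up in xml, not instance.
        for interface in self.get_interfaces(xml["name"]):
            # A real implementation would write "1" to hairpin_mode here.
            self.hairpinned.append(interface)

    def _create_new_domain(self, xml):
        domain = {"name": xml["name"]}  # placeholder for the real domain
        # Hairpin is enabled at the single place where domains are created.
        self._enable_hairpin(xml)
        return domain

    def spawn(self, xml):
        # spawn() no longer calls _enable_hairpin() itself.
        return self._create_new_domain(xml)

    def recreate_after_snapshot(self, xml):
        # Hypothetical second caller: it gets hairpin without extra code.
        return self._create_new_domain(xml)


conn = FakeConnection()
conn.spawn({"name": "instance-1"})
conn.recreate_after_snapshot({"name": "instance-1"})
print(conn.hairpinned)
```

The point of the design is that every code path that re-creates a domain goes through one function, so no caller has to remember to re-enable hairpin.
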
>> >>>>>
>> >>>>> ---
>> >>>>> For Folsom (trunk), in nova/virt/libvirt/driver.py:
>> >>>>> ---
>> >>>>>
>> >>>>> There've been a lot more changes made here, but the same strategy as above
>> >>>>> should work. Here, _create_new_domain() has been split into
>> >>>>> _create_domain() and _create_domain_and_network(), and _enable_hairpin()
>> >>>>> was moved from spawn() to _create_domain_and_network(), which seems like
>> >>>>> it'd be the right thing to do, but doesn't quite cover all of the cases of
>> >>>>> vif reinsertion, since _create_domain() is the only function which
>> >>>>> actually creates the domain (_create_domain_and_network() just calls it
>> >>>>> after doing some pre-work). The solution here is likewise fairly simple;
>> >>>>> make the same 2 changes to _enable_hairpin():
>> >>>>>
>> >>>>> def _enable_hairpin(self, xml):
>> >>>>>    interfaces = self.get_interfaces(xml['name'])
>> >>>>>
>> >>>>> And move it from _create_domain_and_network() to _create_domain(), like
>> >>>>> before:
>> >>>>>
>> >>>>> [...]
>> >>>>> self._enable_hairpin(xml)
>> >>>>> return domain
>> >>>>>
>> >>>>> I haven't yet tested this on my Essex clusters and I don't have a Folsom
>> >>>>> cluster handy at present, but the change is simple and makes sense.
>> >>>>> Looking at to_xml() and _prepare_xml_info(), it appears that the 'xml'
>> >>>>> variable _create_[new_]domain() gets is just a Python dictionary, and
>> >>>>> xml['name'] = instance['name'], exactly what _enable_hairpin() was using
>> >>>>> the 'instance' variable for previously.
>> >>>>>
>> >>>>> Let me know if this works, or doesn't work, or doesn't make sense, or if
>> >>>>> you need an address to send gifts, etc. Hope it's solved!
>> >>>>>
>> >>>>> -Evan
>> >>>>>
>> >>>>> On Thu, Aug 23, 2012 at 11:20 AM, Sam Su <susltd.su@xxxxxxxxx> wrote:
>> >>>>>>
>> >>>>>> Hi Oleg,
>> >>>>>>
>> >>>>>> Thank you for your investigation. Good luck!
>> >>>>>>
>> >>>>>> Can you let me know if you find out how to fix the bug?
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Sam
>> >>>>>>
>> >>>>>> On Wed, Aug 22, 2012 at 12:50 PM, Oleg Gelbukh <ogelbukh@xxxxxxxxxxxx>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>> Hello,
>> >>>>>>>
>> >>>>>>> Is it possible that, during snapshotting, libvirt just tears down the
>> >>>>>>> virtual interface at some point and then re-creates it, with
>> >>>>>>> hairpin_mode disabled again?
>> >>>>>>> This bugfix [https://bugs.launchpad.net/nova/+bug/933640] implies that
>> >>>>>>> the fix works on spawn of an instance. This means that upon resume after
>> >>>>>>> a snapshot, hairpin is not restored. Maybe it would help to insert the
>> >>>>>>> _enable_hairpin() call into the snapshot procedure.
>> >>>>>>> We're currently investigating this issue in one of our environments and
>> >>>>>>> hope to come up with an answer by tomorrow.
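
One way to check the hypothesis above would be to read hairpin_mode for every port on the bridge before and after taking a snapshot and compare the two results. The following is only a sketch: it assumes the sysfs layout /sys/class/net/<bridge>/brif/<port>/hairpin_mode that appears in the workaround quoted elsewhere in this thread, and the function name hairpin_status and its sysfs_root parameter are invented for the example.

```python
import os


def hairpin_status(bridge, sysfs_root="/sys/class/net"):
    """Return {port_name: True/False} for every port attached to `bridge`.

    True means hairpin_mode currently reads "1" for that port.
    """
    brif = os.path.join(sysfs_root, bridge, "brif")
    status = {}
    for port in sorted(os.listdir(brif)):
        path = os.path.join(brif, port, "hairpin_mode")
        with open(path) as handle:
            status[port] = handle.read().strip() == "1"
    return status
```

Calling hairpin_status("br1000") once before the snapshot and once after it, and comparing the two dictionaries, would show whether a port dropped back to hairpin_mode=0. The sysfs_root parameter exists only so the function can be pointed at a scratch directory for testing.
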
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Best regards,
>> >>>>>>> Oleg
>> >>>>>>>
>> >>>>>>> On Wed, Aug 22, 2012 at 11:29 PM, Sam Su <susltd.su@xxxxxxxxx> wrote:
>> >>>>>>>>
>> >>>>>>>> My friend has found a way to let the VM ping itself again when this
>> >>>>>>>> problem happens, but has not found out why it happens:
>> >>>>>>>> echo 1 | sudo tee /sys/class/net/br1000/brif/<virtual-interface-name>/hairpin_mode
>> >>>>>>>>
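
The manual workaround above could be applied to every port of a bridge in one go with something like the following. This is a sketch under assumptions, not part of any patch in this thread: the function name enable_hairpin and the sysfs_root parameter are invented here, the sysfs path layout is taken from the command quoted above, and on a real host the script would have to run as root for the writes to succeed.

```python
import os


def enable_hairpin(bridge, sysfs_root="/sys/class/net"):
    """Write "1" to hairpin_mode for every port on `bridge`.

    Returns a list of (port, error message) pairs for the ports that could
    not be updated, so that failures are reported instead of being ignored.
    """
    brif = os.path.join(sysfs_root, bridge, "brif")
    failed = []
    for port in sorted(os.listdir(brif)):
        path = os.path.join(brif, port, "hairpin_mode")
        try:
            with open(path, "w") as handle:
                handle.write("1")
        except OSError as exc:
            failed.append((port, str(exc)))
    return failed
```

Returning the failures rather than swallowing them makes it easier to notice a port that was left at hairpin_mode=0, for example because the script was not run as root.
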
>> >>>>>>>> I filed a ticket to report this problem:
>> >>>>>>>> https://bugs.launchpad.net/nova/+bug/1040255
>> >>>>>>>>
>> >>>>>>>> Hopefully someone can find out why this happens and solve it.
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>> Sam
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Fri, Jul 20, 2012 at 3:50 PM, Gabriel Hurley
>> >>>>>>>> <Gabriel.Hurley@xxxxxxxxxx> wrote:
>> >>>>>>>>>
>> >>>>>>>>> I ran into some similar issues with the _enable_hairpin() call. The
>> >>>>>>>>> call is allowed to fail silently and (in my case) was failing. I
>> >>>>>>>>> couldn’t for the life of me figure out why, though, and since I’m really
>> >>>>>>>>> not a networking person I didn’t trace it along too far.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Just thought I’d share my similar pain.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> -          Gabriel
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> From: openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx
>> >>>>>>>>> [mailto:openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx]
>> >>>>>>>>> On Behalf Of Sam Su
>> >>>>>>>>> Sent: Thursday, July 19, 2012 11:50 AM
>> >>>>>>>>> To: Brian Haley
>> >>>>>>>>> Cc: openstack
>> >>>>>>>>> Subject: Re: [Openstack] VM can't ping self floating IP after a
>> >>>>>>>>> snapshot is taken
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Thank you for your support.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> I checked the file nova/virt/libvirt/connection.py; the statement
>> >>>>>>>>> self._enable_hairpin(instance) is already present in the function
>> >>>>>>>>> _hard_reboot().
>> >>>>>>>>>
>> >>>>>>>>> It looks like there is some difference between taking a snapshot and
>> >>>>>>>>> rebooting an instance. I tried to figure out how to fix this bug but
>> >>>>>>>>> failed.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> It will be much appreciated if anyone can give some hints.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>>
>> >>>>>>>>> Sam
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Jul 19, 2012 at 8:37 AM, Brian Haley <brian.haley@xxxxxx>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> On 07/17/2012 05:56 PM, Sam Su wrote:
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> This always happens in the Essex release. After I take a snapshot of
>> >>>>>>>>>> my VM (I tried Ubuntu 12.04 and CentOS 5.8), the VM can't ping its own
>> >>>>>>>>>> floating IP; before I take a snapshot, though, the VM can ping its own
>> >>>>>>>>>> floating IP.
>> >>>>>>>>>>
>> >>>>>>>>>> This looks closely related to
>> >>>>>>>>>> https://bugs.launchpad.net/nova/+bug/933640, but is still a little
>> >>>>>>>>>> different. In 933640, it sounds like the VM can't ping its own floating
>> >>>>>>>>>> IP regardless of whether we take a snapshot or not.
>> >>>>>>>>>>
>> >>>>>>>>>> Any suggestions for an easy fix? And what is the root cause of the
>> >>>>>>>>>> problem?
>> >>>>>>>>>
>> >>>>>>>>> It might be because there's a missing _enable_hairpin() call in the
>> >>>>>>>>> reboot() function.  Try something like this...
>> >>>>>>>>>
>> >>>>>>>>> nova/virt/libvirt/connection.py, _hard_reboot():
>> >>>>>>>>>
>> >>>>>>>>>             self._create_new_domain(xml)
>> >>>>>>>>> +            self._enable_hairpin(instance)
>> >>>>>>>>>             self.firewall_driver.apply_instance_filter(instance, network_info)
>> >>>>>>>>>
>> >>>>>>>>> At least that's what I remember doing myself recently when testing after
>> >>>>>>>>> a reboot, don't know about snapshot.
>> >>>>>>>>>
>> >>>>>>>>> Folsom has changed enough that something different would need to be done
>> >>>>>>>>> there.
>> >>>>>>>>>
>> >>>>>>>>> -Brian
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> _______________________________________________
>> >>>>>>>> Mailing list: https://launchpad.net/~openstack
>> >>>>>>>> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>> >>>>>>>> Unsubscribe : https://launchpad.net/~openstack
>> >>>>>>>> More help   : https://help.launchpad.net/ListHelp
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>>
>>
>
>
>
>
