openstack team mailing list archive

Re: VM can't ping self floating IP after a snapshot is taken

 

Hi,

Thank you so much for your help.
I replaced the file /usr/share/pyshared/nova/virt/libvirt/connection.py
with yours, but it does not seem to have worked for me.
Do I need to do anything else?

Thanks,
Sam

On Sat, Aug 25, 2012 at 7:03 PM, heut2008 <heut2008@xxxxxxxxx> wrote:

> For stable/essex, the patch is here:
> https://review.openstack.org/#/c/11986/
>
> 2012/8/25 Sam Su <susltd.su@xxxxxxxxx>:
> > That's great, thank you for your efforts. Can you make a backport for
> > essex?
> >
> > Sent from my iPhone
> >
> > On Aug 24, 2012, at 7:15 PM, heut2008 <heut2008@xxxxxxxxx> wrote:
> >
> >> I have fixed it here: https://review.openstack.org/#/c/11925/
> >>
> >> 2012/8/25 Sam Su <susltd.su@xxxxxxxxx>:
> >>> Hi,
> >>>
> >>> I also reported this bug:
> >>> https://bugs.launchpad.net/nova/+bug/1040255
> >>>
> >>> If someone can combine your solutions and come up with a complete fix
> >>> for this bug, that would be great.
> >>>
> >>> BRs,
> >>> Sam
> >>>
> >>>
> >>> On Thu, Aug 23, 2012 at 9:27 PM, heut2008 <heut2008@xxxxxxxxx> wrote:
> >>>>
> >>>> This bug has been filed here:
> >>>> https://bugs.launchpad.net/nova/+bug/1040537
> >>>>
> >>>> 2012/8/24 Vishvananda Ishaya <vishvananda@xxxxxxxxx>:
> >>>>> +1 to this. Evan, can you report a bug (if one hasn't been reported yet)
> >>>>> and propose the fix? Or else I can find someone else to propose it.
> >>>>>
> >>>>> Vish
> >>>>>
> >>>>> On Aug 23, 2012, at 1:38 PM, Evan Callicoat <diopter@xxxxxxxxx> wrote:
> >>>>>
> >>>>> Hello all!
> >>>>>
> >>>>> I'm the original author of the hairpin patch, and things have changed a
> >>>>> little bit in Essex and Folsom from the original Diablo target. I believe
> >>>>> I can shed some light on what should be done here to solve the issue in
> >>>>> either case.
> >>>>>
> >>>>> ---
> >>>>> For Essex (stable/essex), in nova/virt/libvirt/connection.py:
> >>>>> ---
> >>>>>
> >>>>> Currently _enable_hairpin() is only being called from spawn(). However,
> >>>>> spawn() is not the only place that vifs (veth#) get added to a bridge
> >>>>> (which is when we need to enable hairpin_mode on them). The more relevant
> >>>>> function is _create_new_domain(), which is called from spawn() and other
> >>>>> places. Without changing the information that gets passed to
> >>>>> _create_new_domain() (which is just 'xml' from to_xml()), we can easily
> >>>>> rewrite the first 2 lines in _enable_hairpin(), as follows:
> >>>>>
> >>>>> def _enable_hairpin(self, xml):
> >>>>>    interfaces = self.get_interfaces(xml['name'])
> >>>>>
> >>>>> Then, we can move the self._enable_hairpin(instance) call from spawn() up
> >>>>> into _create_new_domain(), and pass it xml as follows:
> >>>>>
> >>>>> [...]
> >>>>> self._enable_hairpin(xml)
> >>>>> return domain
> >>>>>
> >>>>> This will run the hairpin code every time a domain gets created, which is
> >>>>> also when the domain's vif(s) gets inserted into the bridge with the
> >>>>> default of hairpin_mode=0.
> >>>>>
> >>>>> ---
> >>>>> For Folsom (trunk), in nova/virt/libvirt/driver.py:
> >>>>> ---
> >>>>>
> >>>>> There've been a lot more changes made here, but the same strategy as above
> >>>>> should work. Here, _create_new_domain() has been split into
> >>>>> _create_domain() and _create_domain_and_network(), and _enable_hairpin()
> >>>>> was moved from spawn() to _create_domain_and_network(), which seems like
> >>>>> it'd be the right thing to do, but doesn't quite cover all of the cases of
> >>>>> vif reinsertion, since _create_domain() is the only function which actually
> >>>>> creates the domain (_create_domain_and_network() just calls it after doing
> >>>>> some pre-work). The solution here is likewise fairly simple; make the same
> >>>>> 2 changes to _enable_hairpin():
> >>>>>
> >>>>> def _enable_hairpin(self, xml):
> >>>>>    interfaces = self.get_interfaces(xml['name'])
> >>>>>
> >>>>> And move it from _create_domain_and_network() to _create_domain(), like
> >>>>> before:
> >>>>>
> >>>>> [...]
> >>>>> self._enable_hairpin(xml)
> >>>>> return domain
> >>>>>
> >>>>> I haven't yet tested this on my Essex clusters and I don't have a Folsom
> >>>>> cluster handy at present, but the change is simple and makes sense. Looking
> >>>>> at to_xml() and _prepare_xml_info(), it appears that the 'xml' variable
> >>>>> _create_[new_]domain() gets is just a python dictionary, and
> >>>>> xml['name'] = instance['name'], exactly what _enable_hairpin() was using
> >>>>> the 'instance' variable for previously.
> >>>>>
> >>>>> Let me know if this works, or doesn't work, or doesn't make sense, or if
> >>>>> you need an address to send gifts, etc. Hope it's solved!
> >>>>>
> >>>>> -Evan
> >>>>>
> >>>>> On Thu, Aug 23, 2012 at 11:20 AM, Sam Su <susltd.su@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> Hi Oleg,
> >>>>>>
> >>>>>> Thank you for your investigation. Good luck!
> >>>>>>
> >>>>>> Can you let me know if you find out how to fix the bug?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Sam
> >>>>>>
> >>>>>> On Wed, Aug 22, 2012 at 12:50 PM, Oleg Gelbukh <ogelbukh@xxxxxxxxxxxx>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> Is it possible that, during snapshotting, libvirt tears down the virtual
> >>>>>>> interface at some point and then re-creates it with hairpin_mode disabled
> >>>>>>> again?
> >>>>>>> This bugfix [https://bugs.launchpad.net/nova/+bug/933640] implies that the
> >>>>>>> fix is applied only when an instance is spawned. This means that upon
> >>>>>>> resume after a snapshot, hairpin is not restored. Inserting the
> >>>>>>> _enable_hairpin() call into the snapshot procedure may help.
> >>>>>>> We're currently investigating this issue in one of our environments and
> >>>>>>> hope to come up with an answer by tomorrow.
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best regards,
> >>>>>>> Oleg
> >>>>>>>
> >>>>>>> On Wed, Aug 22, 2012 at 11:29 PM, Sam Su <susltd.su@xxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> My friend has found a way to let the VM ping itself again when this
> >>>>>>>> problem happens, but has not found why it happens:
> >>>>>>>> echo 1 | sudo tee /sys/class/net/br1000/brif/<virtual-interface-name>/hairpin_mode
> >>>>>>>>
> >>>>>>>> I filed a ticket to report this problem:
> >>>>>>>> https://bugs.launchpad.net/nova/+bug/1040255
> >>>>>>>>
> >>>>>>>> Hopefully someone can find why this happens and solve it.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Sam
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Jul 20, 2012 at 3:50 PM, Gabriel Hurley
> >>>>>>>> <Gabriel.Hurley@xxxxxxxxxx> wrote:
> >>>>>>>>>
> >>>>>>>>> I ran into some similar issues with the _enable_hairpin() call. The
> >>>>>>>>> call is allowed to fail silently and (in my case) was failing. I couldn’t
> >>>>>>>>> for the life of me figure out why, though, and since I’m really not a
> >>>>>>>>> networking person I didn’t trace it along too far.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Just thought I’d share my similar pain.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -          Gabriel
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> From:
> >>>>>>>>> openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx
> >>>>>>>>>
> >>>>>>>>> [mailto:openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx]
> >>>>>>>>> On Behalf Of Sam Su
> >>>>>>>>> Sent: Thursday, July 19, 2012 11:50 AM
> >>>>>>>>> To: Brian Haley
> >>>>>>>>> Cc: openstack
> >>>>>>>>> Subject: Re: [Openstack] VM can't ping self floating IP after a
> >>>>>>>>> snapshot is taken
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thank you for your support.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I checked the file nova/virt/libvirt/connection.py; the statement
> >>>>>>>>> self._enable_hairpin(instance) has already been added to the function
> >>>>>>>>> _hard_reboot().
> >>>>>>>>>
> >>>>>>>>> It looks like there is some difference between taking a snapshot and
> >>>>>>>>> rebooting an instance. I tried to figure out how to fix this bug but
> >>>>>>>>> failed.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It would be much appreciated if anyone could give some hints.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Sam
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, Jul 19, 2012 at 8:37 AM, Brian Haley <brian.haley@xxxxxx>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> On 07/17/2012 05:56 PM, Sam Su wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> This always happens in the Essex release. After I take a snapshot of
> >>>>>>>>>> my VM (I tried Ubuntu 12.04 and CentOS 5.8), the VM can't ping its own
> >>>>>>>>>> floating IP; before I take a snapshot, though, the VM can ping its own
> >>>>>>>>>> floating IP.
> >>>>>>>>>>
> >>>>>>>>>> This looks closely related to
> >>>>>>>>>> https://bugs.launchpad.net/nova/+bug/933640, but is
> >>>>>>>>>> still a little different. In 933640, it sounds like the VM can't ping
> >>>>>>>>>> its own floating IP regardless of whether we take a snapshot or not.
> >>>>>>>>>>
> >>>>>>>>>> Any suggestions for an easy fix? And what is the root cause of the
> >>>>>>>>>> problem?
> >>>>>>>>>
> >>>>>>>>> It might be because there's a missing _enable_hairpin() call in the
> >>>>>>>>> reboot() function.  Try something like this...
> >>>>>>>>>
> >>>>>>>>> nova/virt/libvirt/connection.py, _hard_reboot():
> >>>>>>>>>
> >>>>>>>>>             self._create_new_domain(xml)
> >>>>>>>>> +            self._enable_hairpin(instance)
> >>>>>>>>>             self.firewall_driver.apply_instance_filter(instance, network_info)
> >>>>>>>>>
> >>>>>>>>> At least that's what I remember doing myself recently when testing after
> >>>>>>>>> a reboot, don't know about snapshot.
> >>>>>>>>>
> >>>>>>>>> Folsom has changed enough that something different would need to be done
> >>>>>>>>> there.
> >>>>>>>>>
> >>>>>>>>> -Brian
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> Mailing list: https://launchpad.net/~openstack
> >>>>>>>> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> >>>>>>>> Unsubscribe : https://launchpad.net/~openstack
> >>>>>>>> More help   : https://help.launchpad.net/ListHelp
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
>
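
For reference, the manual workaround quoted in this thread (writing 1 to
/sys/class/net/<bridge>/brif/<virtual-interface-name>/hairpin_mode) can be
sketched in Python roughly as follows. This is only an illustration, not
nova's actual _enable_hairpin() code: the function name enable_hairpin, its
parameters, and the returned list of failures are assumptions. The failure
list is there because, as Gabriel notes above, the real call is allowed to
fail silently.

```python
import os


def enable_hairpin(interfaces, bridge, sysfs_net="/sys/class/net"):
    """Write "1" to hairpin_mode for each interface attached to `bridge`.

    Hypothetical helper for illustration only. Returns the interfaces
    whose hairpin_mode file could not be written, so failures are
    visible to the caller instead of being silently ignored.
    """
    failed = []
    for iface in interfaces:
        path = os.path.join(sysfs_net, bridge, "brif", iface, "hairpin_mode")
        try:
            with open(path, "w") as f:
                f.write("1")
        except OSError:
            # Missing port, wrong bridge, or insufficient privileges.
            failed.append(iface)
    return failed
```

On a compute node this would need to run as root, for example
enable_hairpin(["vnet0"], "br1000"), where vnet0 is a placeholder
interface name.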

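The structural point of Evan's suggestion above (enable hairpin inside the
one function that creates the domain, instead of in each caller) can be
shown with a toy class. Nothing here is nova code: ToyDriver and its method
names are made up for the sketch, and it assumes that the reboot and
snapshot paths recreate the domain through the same function, as the thread
suggests.

```python
class ToyDriver:
    """Toy stand-in for a libvirt driver; names are hypothetical."""

    def __init__(self):
        self.hairpin_calls = []

    def _enable_hairpin(self, name):
        # The real code would set hairpin_mode on the instance's vifs.
        self.hairpin_calls.append(name)

    def _create_domain(self, name):
        domain = {"name": name}
        # Called here, so every path that (re)creates a domain is covered.
        self._enable_hairpin(name)
        return domain

    def spawn(self, name):
        return self._create_domain(name)

    def hard_reboot(self, name):
        return self._create_domain(name)

    def resume_after_snapshot(self, name):
        return self._create_domain(name)
```

With the call placed in _create_domain(), none of the three callers needs
its own _enable_hairpin() line, and a newly added caller cannot forget it.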