
openstack team mailing list archive

Re: instances losing IP address while running, due to No DHCPOFFER

 

I vaguely recall Vish mentioning a dnsmasq bug that caused a somewhat
similar problem (it had to do with lease renewal problems on IP
aliases, or something like that).

This issue was apparently particularly pronounced with Windows VMs.
 -nld

On Thu, Jun 14, 2012 at 6:02 PM, Christian Parpart <trapni@xxxxxxxxx> wrote:
> Hey,
>
> thanks for your reply. Unfortunately, neither nova-network nor dnsmasq
> was restarted; both processes seem to have been up for about 2 and 3 days.
>
> However, why does dhcp_lease_time default to 120s? Not overriding it
> causes the clients to re-acquire their DHCP lease every 42 seconds
> (at least on my nodes), which is completely ridiculous.
> OTOH, I took a look at the sources (linux_net.py) and found out why
> max_lease_time is set to 2048: that is the size of my network.
> So why is the max lease time the size of my network?
> I've written a tiny patch to allow overriding this value in nova.conf and
> will submit it to Launchpad soon. I hope it'll be accepted and then also
> applied to Essex, since this is a very straightforward, helpful few-liner.
>
> Nevertheless, that does not explain why I have now had 2 (well, 3 actually)
> instances that no longer get any DHCP replies/offers after some hours/days.
>
> The one host that caused issues today (a few hours ago) I fixed by
> hard-rebooting the instance. However, just about 40 minutes later it
> forgot its IP again, so one might say that it maybe did not get any reply
> from the DHCP server (dnsmasq) almost right after it got a lease on
> instance boot.
>
> So long,
> Christian.
>
> On Thu, Jun 14, 2012 at 10:55 PM, Nathanael Burton
> <nathanael.i.burton@xxxxxxxxx> wrote:
>>
>> Has nova-network been restarted? There was an issue where nova-network was
>> signalling dnsmasq which would cause dnsmasq to stop responding to requests
>> yet appear to be running fine.
>>
>> You can see if killing dnsmasq, restarting nova-network, and rebooting an
>> instance allows it to get a dhcp address again ...
>>
>> Nate
>>
>> On Jun 14, 2012 4:46 PM, "Christian Parpart" <trapni@xxxxxxxxx> wrote:
>>>
>>> Hey all,
>>>
>>> I'm really sad to say this: now that we have had quite a few instances
>>> in production for about 5 days at least, I have encountered the second
>>> instance losing its IP address due to "No DHCPOFFER" (according to
>>> syslog in the instance).
>>>
>>> I checked the logs on the central nova-network and gateway node and found
>>> that dnsmasq still replies to requests from all the other instances. It
>>> even got the request from the instance in question and even sent an OFFER,
>>> as far as I can tell by now (I'm investigating / posting logs ASAP). But
>>> while it seems that dnsmasq sends an offer, the instance says it didn't
>>> receive one - wtf?
>>>
>>> Please tell me what I can do to actually *fix* this issue, since this is
>>> really fatal.
>>>
>>> One option I see (as a workaround) is to let a newly created instance
>>> retrieve its IP via DHCP, but then reconfigure /etc/network/interfaces to
>>> continue with a static networking setup. However, I'd rather just get the
>>> DHCP thingy fixed.
>>>
>>> I'm very open to any kind of helpful comments :)
>>>
>>> So long,
>>> Christian.
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~openstack
>>> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~openstack
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>

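For reference on the lease-time question raised in the thread: RFC 2131 gives default client timers of T1 = 0.5 x lease (renew) and T2 = 0.875 x lease (rebind), and suggests clients add some random fuzz around those values, so the interval actually observed on a node may differ. A minimal sketch of that arithmetic follows; the function name `dhcp_timers` is hypothetical and is not part of nova or dnsmasq.

```python
def dhcp_timers(lease_seconds):
    """Return the RFC 2131 default (T1, T2) timers in seconds.

    T1 is when a client first tries to renew with its original server,
    T2 is when it falls back to broadcasting a rebind request.
    Real clients may randomize around these values.
    """
    t1 = 0.5 * lease_seconds    # default renewal time
    t2 = 0.875 * lease_seconds  # default rebinding time
    return t1, t2


if __name__ == "__main__":
    # 120 is the dhcp_lease_time default mentioned in the thread;
    # the other values are only illustrative.
    for lease in (120, 3600, 86400):
        t1, t2 = dhcp_timers(lease)
        print(f"lease={lease}s  renew after {t1:.0f}s  rebind after {t2:.0f}s")
```

With a 120s lease, every instance is expected to contact dnsmasq about once a minute, which is why a longer lease noticeably reduces DHCP traffic on a network with many instances.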
