openstack team mailing list archive
Message #13222
Re: instances losing IP address while running, due to No DHCPOFFER
There's a flag, 'dhcp_lease_time' (in seconds), that can be set in nova.conf.
DHCP clients typically renew every (dhcp_lease_time/2) seconds, but this
varies by client. Additionally, some DHCP clients are not persistent,
meaning that if there's ever a network hiccup and they don't get a DHCP ACK,
they give up and stop checking in, thus losing their lease and falling off
the network.
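
As a rough illustration of the renewal timing described above: RFC 2131 gives default renewal (T1) and rebinding (T2) timers of 0.5 and 0.875 times the lease duration. The sketch below only does that arithmetic. The function name `dhcp_timers` is made up for illustration, and real clients may add jitter or use timer values supplied by the server, so observed intervals can differ.

```python
from typing import Tuple


def dhcp_timers(lease_time: int) -> Tuple[float, float]:
    """Return the default (T1, T2) timers in seconds for a lease.

    T1 is when a client normally starts renewing with its server,
    T2 is when it falls back to rebinding with any server.
    Defaults per RFC 2131: T1 = 0.5 * lease, T2 = 0.875 * lease.
    """
    t1 = 0.5 * lease_time
    t2 = 0.875 * lease_time
    return t1, t2


# 120 is the default dhcp_lease_time mentioned in this thread;
# 3600 is just an example of a longer lease.
for lease in (120, 3600):
    t1, t2 = dhcp_timers(lease)
    print(f"lease={lease}s renew after ~{t1:.0f}s, rebind after ~{t2:.0f}s")
```

With a very short lease, a single missed renewal leaves little time before the lease expires, which is one reason a longer dhcp_lease_time can make instances more tolerant of brief network hiccups.
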
On RHEL/CentOS/Fedora this is fixed by setting PERSISTENT_DHCLIENT=1 in
your ifcfg-eth0 file. Not sure about Ubuntu.
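
A minimal sketch of applying that setting, assuming the usual KEY=value layout of ifcfg files. The helper `set_ifcfg_option` and the sample file contents are hypothetical and only for illustration; on a real system you would normally edit /etc/sysconfig/network-scripts/ifcfg-eth0 by hand or through your configuration management.

```python
def set_ifcfg_option(text: str, key: str, value: str) -> str:
    """Return ifcfg-style text with KEY=value set.

    An existing KEY= line is replaced in place; otherwise the
    setting is appended at the end.
    """
    out = []
    found = False
    for line in text.splitlines():
        if line.strip().startswith(key + "="):
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Hypothetical contents of an ifcfg-eth0 file.
sample = "DEVICE=eth0\nBOOTPROTO=dhcp\nONBOOT=yes\n"
updated = set_ifcfg_option(sample, "PERSISTENT_DHCLIENT", "1")
print(updated, end="")
```
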
Nate
On Jun 14, 2012 7:02 PM, "Christian Parpart" <trapni@xxxxxxxxx> wrote:
> Hey,
>
> thanks for your reply. Unfortunately, neither nova-network nor dnsmasq was
> restarted; both processes seem to have been up for about 2 and 3 days.
>
> However, why does dhcp_lease_time default to 120s? Without overriding it,
> the clients actually re-acquire a DHCP lease every 42 seconds (at least on
> my nodes), which is completely ridiculous.
> OTOH, I took a look at the sources (linux_net.py) and found out why
> max_lease_time is set to 2048: that is the size of my network.
> So why is the max lease time the size of my network?
> I've written a tiny patch to allow overriding this value in nova.conf and
> will submit it to Launchpad soon. I hope it'll be accepted and then also
> applied to Essex, since it is a very straightforward, helpful few-line
> change.
>
> Nevertheless, that does not explain why I have now had 2 (well, 3 actually)
> instances that stopped getting DHCP replies/offers after some hours or days.
>
> I fixed the one host that caused issues today (a few hours ago) by hard
> rebooting the instance. However, just about 40 minutes later it lost its
> IP again, so it may have stopped getting replies from the DHCP server
> (dnsmasq) almost right after it got a lease at instance boot.
>
> So long,
> Christian.
>
> On Thu, Jun 14, 2012 at 10:55 PM, Nathanael Burton <
> nathanael.i.burton@xxxxxxxxx> wrote:
>
>> Has nova-network been restarted? There was an issue where nova-network
>> was signalling dnsmasq which would cause dnsmasq to stop responding to
>> requests yet appear to be running fine.
>>
>> You can see if killing dnsmasq, restarting nova-network, and rebooting an
>> instance allows it to get a dhcp address again ...
>>
>> Nate
>> On Jun 14, 2012 4:46 PM, "Christian Parpart" <trapni@xxxxxxxxx> wrote:
>>
>>> Hey all,
>>>
>>> I'm really sad to say this: now that we have had quite a few instances in
>>> production for at least 5 days, I have encountered the second instance
>>> losing its IP address due to "No DHCPOFFER" (according to syslog in the
>>> instance).
>>>
>>> I checked the logs on the central nova-network and gateway node and found
>>> that dnsmasq still replies to requests from all the other instances. As
>>> far as I can tell by now (I'm investigating / posting logs asap), it even
>>> got the request from the instance in question and sent an OFFER. But
>>> while dnsmasq seems to send an offer, the instance says it didn't receive
>>> one - wtf?
>>>
>>> Please tell me what I can do to actually *fix* this issue, since it is
>>> a very serious problem for us.
>>>
>>> One workaround I can see is to let newly created instances retrieve
>>> their IP via DHCP, but then reconfigure /etc/network/interfaces to
>>> continue with a static networking setup. However, I'd rather just get
>>> the DHCP issue fixed.
>>>
>>> I'm very open to any kind of helpful comments :)
>>>
>>> So long,
>>> Christian.