yahoo-eng-team team mailing list archive
Message #60643
[Bug 1650611] Re: dhcp agent reporting state as down during the initial sync
Reviewed: https://review.openstack.org/413010
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f15851b98974dc16606da195cf3ecee577cd0ef8
Submitter: Jenkins
Branch: master
commit f15851b98974dc16606da195cf3ecee577cd0ef8
Author: Bertrand Lallau <bertrand.lallau@xxxxxxxxxxxxxxx>
Date: Tue Dec 20 10:53:41 2016 +0100
DHCP: enhance DHCPAgent startup procedure
During the DhcpAgent startup procedure, the following network
initialization steps are currently performed twice:
* killing old dnsmasq processes
* setting up and configuring all TAP interfaces
* building all dnsmasq config files (lease and host files)
* launching dnsmasq processes
The second iteration just cleans up and redoes exactly the same work.
This is inefficient and dramatically increases DHCP agent startup time
(to nearly twice what is needed).
The initialization method sync_state() is called twice:
* once during init_host()
* once more during _report_state(), which calls run()
The sync_state() call must stay in init_host() because of bug #1420042.
Since sync_state() is always called during startup in init_host()
and is then called periodically by periodic_resync() for
reconciliation, it can safely be removed from the run() method.
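The resulting startup flow can be sketched as below. This is a toy
illustration only, not the Neutron code: SketchDhcpAgent, start() and the
sync_calls counter are hypothetical names, and the method bodies are
placeholders for the real work.

```python
class SketchDhcpAgent:
    """Toy stand-in for the DHCP agent; not the real Neutron class."""

    def __init__(self):
        self.sync_calls = 0
        self.resync_started = False

    def sync_state(self):
        # The real method kills old dnsmasq processes, configures TAP
        # interfaces, writes config files and launches dnsmasq.
        # Here we only count how often it is invoked.
        self.sync_calls += 1

    def periodic_resync(self):
        # Placeholder: the real agent starts a background reconciliation
        # loop that calls sync_state() again when a resync is needed.
        self.resync_started = True

    def init_host(self):
        # The single full sync at startup stays here (bug #1420042).
        self.sync_state()

    def run(self):
        # After the change: no sync_state() call, only the periodic loop.
        self.periodic_resync()


def start(agent):
    """Mimic the startup order: init_host() first, then run()."""
    agent.init_host()
    agent.run()
    return agent.sync_calls
```

With this ordering the expensive sync happens once at startup, and any
later sync is left to the periodic reconciliation.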
Change-Id: Id6433598d5c833d2e86be605089d42feee57c257
Closes-bug: #1651368
Closes-Bug: #1650611
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1650611
Title:
dhcp agent reporting state as down during the initial sync
Status in neutron:
Fix Released
Bug description:
When the DHCP agent is started, neutron agent-list reports its state as
dead until the initial sync is complete.
This can lead to unwanted alarms in monitoring systems, especially in
large environments where the initial sync may take hours. During this
time, systemctl shows that the agent is actually alive while neutron
agent-list reports it as down.
Technical details:
If I'm right, this line [0] is the exact point where the initial sync
takes place, right after the first state report (with start_flag=True)
is sent to the server. Because the sync runs in the same thread, the
agent cannot send a second state report until the sync is done.
Running the sync in a separate thread would let the heartbeat task
continue sending state reports to the server, but I don't know whether
this has any unwanted side effects.
[0] https://github.com/openstack/neutron/blob/master/neutron/agent/dhcp/agent.py#L751
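A minimal sketch of the separate-thread idea, assuming plain Python
threads rather than whatever concurrency model the agent actually uses.
run_with_background_sync and its parameters are hypothetical names; sync
and report_state stand in for the initial sync and the state report. It
does not address the possible side effects mentioned above.

```python
import threading


def run_with_background_sync(sync, report_state, interval, max_reports):
    """Run sync() in a worker thread while the caller keeps reporting.

    Returns the number of state reports sent while the sync was running.
    """
    worker = threading.Thread(target=sync, daemon=True)
    worker.start()

    reports = 0
    while worker.is_alive() and reports < max_reports:
        # The heartbeat is no longer blocked by the long-running sync.
        report_state()
        reports += 1
        # Wait for the next interval, or return early if the sync ends.
        worker.join(timeout=interval)

    # Make sure the sync has finished before returning.
    worker.join()
    return reports
```

For example, a sync that takes 0.2 seconds with a 0.02 second interval
would produce several reports instead of one report followed by a long
silence.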
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1650611/+subscriptions