fuel-dev team mailing list archive

Re: Nodes discovering mechanism in nailgun (nailgun agent)

 

Vladimir,

Thank you for the extended answer; this is interesting information.

--
Best regards,
Oleg Gelbukh


On Thu, Mar 20, 2014 at 2:25 PM, Vladimir Kozhukalov <vkozhukalov@xxxxxxxxxxxx> wrote:

> Oleg,
>
> As far as I know, nobody on the Ironic core team thinks that discovery
> is out of scope for Ironic. Just to be sure, we discussed this topic
> yesterday with R. Prykhodchenko tete-a-tete and then with Devananda
> and others in #openstack-ironic. This blueprint
> https://blueprints.launchpad.net/ironic/+spec/discovery-ramdisk, which is
> about discovery, has been postponed only to keep compatibility with the
> nova baremetal driver. It has not been canceled.
>
> And yes, it is expected that users can have their own CMDB, and they must
> be able to add their nodes to Ironic via ir-api, but this does not
> restrict Ironic's ability to discover nodes. Right now we (A. Gordeev and
> I) are working together with some Rackspace guys on a generic pluggable
> Ironic agent which is supposed to be able to do a variety of tasks such as
> OS provisioning, node discovery, firmware updates, RAID configuration, etc.
> This agent is supposed to have a REST API and to expose hardware info via
> this API. Discovery will follow this flow:
> 0) the node boots via PXE, and the heartbeat URL (where the node sends its
> "I am here and alive" requests) is passed via a kernel parameter,
> 1) the agent starts and sends an "I am here and alive" request,
> 2) the conductor sends a hardware info request to the agent REST API.
>
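The three steps of the flow above can be sketched in-process as follows. This is only a minimal illustration under stated assumptions: the kernel parameter name `heartbeat_url`, the `Agent` and `Conductor` classes, and the addresses are all hypothetical, and direct method calls stand in for the HTTP requests.

```python
# Hypothetical kernel parameter name; the thread only says that the
# heartbeat URL is passed via a kernel parameter, without naming it.
HEARTBEAT_PARAM = "heartbeat_url"


def parse_heartbeat_url(cmdline):
    """Step 0: extract the heartbeat URL from a kernel command line."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == HEARTBEAT_PARAM:
            return value
    return None


class Agent:
    """Stand-in for the agent running on a freshly booted node."""

    def __init__(self, address, hardware):
        self.address = address
        self.hardware = hardware

    def heartbeat(self, conductor):
        # Step 1: tell the conductor "I am here and alive".
        conductor.on_heartbeat(self)

    def get_hardware_info(self):
        # Stand-in for the agent REST API call that exposes hardware info.
        return dict(self.hardware)


class Conductor:
    """Stand-in for the conductor; method calls replace HTTP requests."""

    def __init__(self):
        self.inventory = {}

    def on_heartbeat(self, agent):
        # Step 2: after a heartbeat, request hardware info from the agent.
        self.inventory[agent.address] = agent.get_hardware_info()


cmdline = "ro quiet heartbeat_url=http://10.20.0.2:8000/heartbeat"
url = parse_heartbeat_url(cmdline)
conductor = Conductor()
agent = Agent("10.20.0.5", {"cpus": 8, "memory_mb": 16384})
if url is not None:
    agent.heartbeat(conductor)
print(conductor.inventory)
```

In this sketch the node never pushes its hardware data; it only announces itself, and the conductor pulls the details when it chooses to.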
> Nick,
>
> It is not intended (although it is possible) that thousands of nodes will
> send HTTP requests to the nailgun API. The list of discovered nodes and
> their state could be obtained from the Ironic API. That, of course, does
> not remove the need for nailgun performance improvements.
>
> Vladimir Kozhukalov
>
>
> On Thu, Mar 20, 2014 at 12:27 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>
>> I agree that it's important to improve nailgun performance and use the
>> uWSGI server, but that will not solve the problem of thousands of nodes
>> trying to register in nailgun: we have to create a lot of objects (nodes,
>> interfaces, etc.). For that we need a separate service which can
>> retrieve data from the nodes and send just several nodes at a time to
>> nailgun for registration.
>>
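The idea of an intermediate service that forwards only several nodes at a time can be sketched as a simple batching loop. This is a minimal sketch, not the proposed service: `register_in_batches`, the `register` callback, and the batch size are hypothetical names chosen for illustration.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def register_in_batches(nodes, register, batch_size=10):
    """Pass discovered nodes to `register` a few at a time.

    `register` stands in for whatever call would create the nodes in
    nailgun; here it is any callable that accepts a list of nodes.
    """
    sent = 0
    for chunk in batches(nodes, batch_size):
        register(chunk)
        sent += len(chunk)
    return sent


calls = []
discovered = [{"mac": "52:54:00:00:00:%02x" % i} for i in range(25)]
total = register_in_batches(discovered, calls.append, batch_size=10)
print(total, [len(c) for c in calls])  # 25 [10, 10, 5]
```

A real service would also need pacing or back-pressure between batches; the sketch only shows the grouping.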
>>
>> On Wed, Mar 19, 2014 at 4:06 PM, Nikolay Markov <nmarkov@xxxxxxxxxxxx> wrote:
>>
>>> Hello all,
>>>
>>> The problem with the current approach to discovery is the low performance
>>> of the Nailgun app and its database interaction, which is not really
>>> efficient. If we use the same code for registering new nodes in the
>>> Nailgun DB and for keepalive, we will still experience performance
>>> issues.
>>>
>>> I would start with these two steps without making any serious changes:
>>>
>>> 1) Moving Nailgun from the built-in Python server to Nginx+uWSGI (its
>>> performance is being tested with help from Igor Shishkin right now,
>>> and uWSGI shows really good improvement).
>>> 2) Refactoring and optimizing DB queries using joinedloads and indexes,
>>> and profiling code execution. Almost every possible fix here will be a
>>> huge improvement in RPS, because right now we're overloading the DB with
>>> queries and some places really need code optimization.
>>>
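The effect of joined loading and indexes in step 2 can be illustrated in plain SQL. This sketch uses the standard-library `sqlite3` module rather than an ORM, and the `nodes` and `interfaces` tables are hypothetical, not Nailgun's actual schema. It contrasts one query per node with a single JOIN backed by an index on the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE interfaces (
        id INTEGER PRIMARY KEY,
        node_id INTEGER REFERENCES nodes(id),
        name TEXT
    );
    -- Index on the foreign key used by the lookups and the join below.
    CREATE INDEX idx_interfaces_node_id ON interfaces(node_id);
""")
conn.executemany("INSERT INTO nodes VALUES (?, ?)",
                 [(1, "node-1"), (2, "node-2")])
conn.executemany("INSERT INTO interfaces (node_id, name) VALUES (?, ?)",
                 [(1, "eth0"), (1, "eth1"), (2, "eth0")])


def load_n_plus_one(conn):
    """One query for the nodes plus one query per node (N+1 queries)."""
    result = {}
    nodes = conn.execute("SELECT id, name FROM nodes ORDER BY id").fetchall()
    for node_id, name in nodes:
        rows = conn.execute(
            "SELECT name FROM interfaces WHERE node_id = ? ORDER BY name",
            (node_id,)).fetchall()
        result[name] = [row[0] for row in rows]
    return result


def load_joined(conn):
    """A single query that fetches nodes together with their interfaces."""
    result = {}
    rows = conn.execute("""
        SELECT n.name, i.name
        FROM nodes AS n
        LEFT JOIN interfaces AS i ON i.node_id = n.id
        ORDER BY n.id, i.name
    """)
    for node_name, iface_name in rows:
        ifaces = result.setdefault(node_name, [])
        if iface_name is not None:
            ifaces.append(iface_name)
    return result


print(load_joined(conn))
```

Both functions return the same mapping; the joined version issues one query regardless of how many nodes exist, which is the kind of reduction in query count the step above is after.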
>>> On Wed, Mar 19, 2014 at 2:20 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>>> > Hi,
>>> >
>>> > Let me describe the main points of the document:
>>> > 1. create a middleware service between nailgun and the nodes (for
>>> > discovery and online/offline status monitoring)
>>> > 2. remove from the agents the ability to make requests directly to
>>> > nailgun; instead we want to request data from nodes when we need it
>>> >
>>> > This approach is very similar to what Vladimir described. But in my
>>> > doc I described the solution with mcollective, because we already have
>>> > it and it works. In fact, any other transport could be used.
>>> >
>>> > I have several questions about the Ironic solution:
>>> > 1. when (roughly speaking) will the agent in Ironic be ready?
>>> > 2. do we want to build this system on mcollective and then replace it
>>> > with the HTTP-based solution from Ironic?
>>> > 3. how are you going to update data in nailgun if an interface or disk
>>> > is added to or removed from a node?
>>> >
>>> > Thanks,
>>> >
>>> >
>>> > On Wed, Mar 19, 2014 at 12:07 PM, Oleg Gelbukh <ogelbukh@xxxxxxxxxxxx>
>>> > wrote:
>>> >>
>>> >> Vladimir,
>>> >>
>>> >> I might be wrong, but I heard directly from Devananda that Ironic
>>> >> doesn't plan to have discovery as a part of its scope. Things might
>>> >> have changed since then (it was at the HK summit), but the general idea
>>> >> was that Ironic won't serve as a hosts directory or CMDB, and nodes
>>> >> will be enrolled into it from some external source.
>>> >>
>>> >> However, I think it is natural that discovery capabilities should be
>>> >> supported by a unified agent used by Ironic and hypothetical Discovery
>>> >> service (e.g. Nailgun).
>>> >>
>>> >> --
>>> >> Best regards,
>>> >> Oleg
>>> >>
>>> >>
>>> >> On Wed, Mar 19, 2014 at 11:49 AM, Vladimir Kozhukalov
>>> >> <vkozhukalov@xxxxxxxxxxxx> wrote:
>>> >>>
>>> >>> My suggestion is to stop inventing a discovery mechanism of our own.
>>> >>> OpenStack is supposed to use Ironic for provisioning, discovery,
>>> >>> firmware updates, RAID configuration, and power management. In the
>>> >>> Ironic project there is a blueprint for a utility ramdisk (it is
>>> >>> similar to the Fuel bootstrap):
>>> >>> https://blueprints.launchpad.net/ironic/+spec/utility-ramdisk. Our
>>> >>> current activities in replacing Cobbler with Ironic include
>>> >>> contributing to the Python Ironic agent:
>>> >>> https://wiki.openstack.org/wiki/Ironic-python-agent. We discussed the
>>> >>> general architecture of this agent and agreed that it should expose a
>>> >>> REST API and that every piece of its functionality needs to be
>>> >>> implemented as a pluggable driver.
>>> >>>
>>> >>> The discovery flow could be implemented as a series of HTTP requests
>>> >>> to these agents running on nodes. Discovery will be just one part of
>>> >>> the full functionality of these agents. The list of IP addresses to
>>> >>> which we need to send discovery requests could be taken from the list
>>> >>> of addresses leased by the DHCP server.
>>> >>>
>>> >>>
>>> >>>
>>> >>> Vladimir Kozhukalov
>>> >>>
>>> >>>
>>> >>> On Tue, Mar 18, 2014 at 4:25 PM, Mike Scherbakov
>>> >>> <mscherbakov@xxxxxxxxxxxx> wrote:
>>> >>>>
>>> >>>> Looks like it's still an open question.
>>> >>>> Andrew, can you please respond to Eugene's question in the doc?
>>> >>>>
>>> >>>> My personal opinion: refactor the current approach so that it's more
>>> >>>> performant (reduce the amount of data), as that will be required
>>> >>>> anyway. See how it works. If we still have issues, go further, perhaps
>>> >>>> with a re-implementation that polls servers instead, whether using tiny
>>> >>>> REST services on nodes or AMQP or anything else.
>>> >>>>
>>> >>>> Basically, let's eliminate issues step by step.
>>> >>>> Thanks,
>>> >>>>
>>> >>>>
>>> >>>> On Mon, Mar 3, 2014 at 12:46 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>>> >>>>>
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> We had a discussion about the nailgun agent, which some of us want
>>> >>>>> to rewrite in Python. I don't think that we need to rewrite the
>>> >>>>> nailgun agent one-to-one to solve a single problem.
>>> >>>>> I tried to describe the problems we have and how we can solve
>>> >>>>> them. [0]
>>> >>>>>
>>> >>>>> Comments are welcome.
>>> >>>>>
>>> >>>>> [0]
>>> >>>>> https://docs.google.com/a/mirantis.com/document/d/1zqV58LZBLQ-0gllb_i3MyIKIMj-Qx8ELJohjcWs459s/edit#
>>> >>>>>
>>> >>>>> --
>>> >>>>> Mailing list: https://launchpad.net/~fuel-dev
>>> >>>>> Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
>>> >>>>> Unsubscribe : https://launchpad.net/~fuel-dev
>>> >>>>> More help   : https://help.launchpad.net/ListHelp
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Mike Scherbakov
>>> >>>> #mihgen
>>> >>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Nick Markov
>>>
>>>
>>
>>
>>
>>
>
>
>
