fuel-dev team mailing list archive

Re: Nodes discovering mechanism in nailgun (nailgun agent)

 

Oleg,

As far as I know, nobody on the Ironic core team thinks that discovery is
out of scope for Ironic. Just to be sure, we discussed this topic
yesterday, first one-on-one with R. Prykhodchenko and then with Devananda
and others in #openstack-ironic. The blueprint
https://blueprints.launchpad.net/ironic/+spec/discovery-ramdisk, which
covers discovery, has been postponed only to keep compatibility with the
nova baremetal driver. It has not been canceled.

And yes, users are expected to be able to have their own CMDB and to add
their nodes to Ironic via ir-api, but this does not restrict Ironic's
ability to discover nodes. Right now we (A. Gordeev and I) are working
together with some Rackspace guys on a generic pluggable Ironic agent,
which is supposed to be able to do a variety of tasks such as OS
provisioning, node discovery, firmware updates, RAID configuration, etc.
This agent is supposed to have a REST API and to expose hardware info via
this API. Discovery will follow this flow:
0) the node boots via PXE, and the heartbeat URL (where the node sends its
"I'm here and alive" requests) is passed via a kernel parameter,
1) the agent starts and sends an "I'm here and alive" request,
2) the conductor sends a hardware info request to the agent's REST API.
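
A minimal sketch of the three steps above in Python, simulated in-process
so it runs without a network. The kernel parameter name `heartbeat_url`,
port 9999, the class names and the hardware fields are all assumptions
made for illustration; none of them are taken from the actual agent.

```python
def heartbeat_url_from_cmdline(cmdline, key="heartbeat_url"):
    """Step 0: pull the heartbeat URL out of a kernel command line string."""
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name == key:
            return value
    return None


class Agent:
    """Hypothetical stand-in for the agent running on a booted node."""

    def __init__(self, address, hardware):
        self.address = address
        self.hardware = hardware

    def heartbeat_payload(self):
        # Step 1: the "I'm here and alive" message only tells the
        # conductor where the agent's own API can be reached.
        return {"agent_url": "http://%s:9999" % self.address}

    def get_hardware_info(self):
        # What the agent's REST API would return in step 2.
        return dict(self.hardware)


class Conductor:
    """Hypothetical stand-in for the conductor side."""

    def __init__(self):
        self.nodes = {}

    def on_heartbeat(self, payload, fetch_hardware):
        # Step 2: after a heartbeat, call back to the agent for hardware info.
        url = payload["agent_url"]
        self.nodes[url] = fetch_hardware(url)
        return self.nodes[url]


# Simulated run: a dict lookup replaces the real HTTP call to the agent.
cmdline = "ro quiet heartbeat_url=http://10.20.0.2:6385/heartbeat"
heartbeat_url = heartbeat_url_from_cmdline(cmdline)

agent = Agent("10.20.0.10", {"cpus": 4, "memory_mb": 8192})
agents_by_url = {agent.heartbeat_payload()["agent_url"]: agent}

conductor = Conductor()
conductor.on_heartbeat(
    agent.heartbeat_payload(),
    lambda url: agents_by_url[url].get_hardware_info(),
)
```

The point of the sketch is the direction of the calls: the node only
announces itself, and the conductor decides when to pull the hardware info.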

Nick,

Thousands of nodes are not expected (although it is possible) to send HTTP
requests to the nailgun API. The list of discovered nodes and their state
could be obtained from the Ironic API. But that, of course, does not remove
the need for nailgun performance improvements.

Vladimir Kozhukalov


On Thu, Mar 20, 2014 at 12:27 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:

> I agree that it's important to improve nailgun performance and use a uWSGI
> server, but that will not solve the problem when thousands of nodes try to
> register in nailgun: we have to create a lot of objects (nodes, interfaces,
> etc.). For that we need a separate service which will be able to
> retrieve data from nodes and send just several nodes at a time to
> nailgun for registration.
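
The "several nodes at a time" idea quoted above can be sketched as a small
buffer that sits between the nodes and nailgun. This is only an
illustration: `RegistrationBuffer`, its `register` callback and the batch
size are hypothetical names, not part of nailgun or of the proposed service.

```python
from collections import deque


class RegistrationBuffer:
    """Collects discovered nodes and hands them on in small batches."""

    def __init__(self, register, batch_size=5):
        self._register = register      # callable that registers one batch
        self._batch_size = batch_size
        self._pending = deque()

    def add(self, node):
        self._pending.append(node)

    def pending(self):
        return len(self._pending)

    def flush_once(self):
        """Send at most one batch; return how many nodes were sent."""
        batch = []
        while self._pending and len(batch) < self._batch_size:
            batch.append(self._pending.popleft())
        if batch:
            self._register(batch)
        return len(batch)


# Simulated run: seven nodes show up at once, batches of three go out.
sent_batches = []
buffer = RegistrationBuffer(sent_batches.append, batch_size=3)
for i in range(7):
    buffer.add({"mac": "52:54:00:00:00:%02x" % i})

while buffer.flush_once():
    pass

batch_sizes = [len(batch) for batch in sent_batches]
```

In a real service `flush_once` would be driven by a timer or by nailgun's
responses, so that the registration rate stays bounded however many nodes
boot simultaneously.
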
>
>
> On Wed, Mar 19, 2014 at 4:06 PM, Nikolay Markov <nmarkov@xxxxxxxxxxxx> wrote:
>
>> Hello all,
>>
>> The problem with the current approach to discovery is the low performance
>> of the Nailgun app and its database interaction, which is not really
>> efficient. If we use the same code for registering new nodes in the
>> Nailgun DB and for keepalive, we will still experience some issues with
>> its performance.
>>
>> I would start with these two steps without making any serious changes:
>>
>> 1) Moving Nailgun from the built-in Python server to Nginx+uWSGI (its
>> performance is being tested with help from Igor Shishkin right now,
>> and uWSGI shows a really good improvement).
>> 2) Refactoring and optimizing DB queries using joinedloads and indexes,
>> and profiling code execution. Almost every fix possible here will be a
>> huge improvement in RPS, because right now we're overloading the DB with
>> queries and some places really need code optimization.
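
To illustrate why joined loading and an index help, here is a small
self-contained comparison. It deliberately uses the standard-library
`sqlite3` module instead of SQLAlchemy, and the `nodes` / `interfaces`
tables are made up for the example; they are not Nailgun's schema. The
single JOIN query plays the role that a joined eager load plays in an ORM.

```python
import sqlite3


def make_db(node_count=3, ifaces_per_node=2):
    """Build an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, mac TEXT);
        CREATE TABLE interfaces (
            id INTEGER PRIMARY KEY, node_id INTEGER, name TEXT);
        CREATE INDEX ix_interfaces_node_id ON interfaces (node_id);
    """)
    for n in range(1, node_count + 1):
        conn.execute("INSERT INTO nodes (id, mac) VALUES (?, ?)",
                     (n, "mac-%d" % n))
        for i in range(ifaces_per_node):
            conn.execute(
                "INSERT INTO interfaces (node_id, name) VALUES (?, ?)",
                (n, "eth%d" % i))
    return conn


def load_lazy(conn):
    """One query for the nodes plus one per node: 1 + N queries."""
    queries = 1
    result = {}
    for (node_id,) in conn.execute(
            "SELECT id FROM nodes ORDER BY id").fetchall():
        rows = conn.execute(
            "SELECT name FROM interfaces WHERE node_id = ? ORDER BY name",
            (node_id,)).fetchall()
        queries += 1
        result[node_id] = [name for (name,) in rows]
    return result, queries


def load_joined(conn):
    """The same data fetched with a single LEFT JOIN query."""
    result = {}
    rows = conn.execute(
        "SELECT n.id, i.name FROM nodes n "
        "LEFT JOIN interfaces i ON i.node_id = n.id "
        "ORDER BY n.id, i.name")
    for node_id, name in rows:
        result.setdefault(node_id, [])
        if name is not None:
            result[node_id].append(name)
    return result, 1


conn = make_db()
lazy_result, lazy_queries = load_lazy(conn)
joined_result, joined_queries = load_joined(conn)
```

Both functions return the same mapping, but the lazy variant issues one
extra query per node, which is the kind of load that grows with the number
of nodes being registered.
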
>>
>> On Wed, Mar 19, 2014 at 2:20 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>> > Hi,
>> >
>> > Let me describe the main points of the document:
>> > 1. create a middleware service between nailgun and nodes (for
>> > discovery and online/offline status monitoring)
>> > 2. remove the agents' ability to make requests directly to nailgun;
>> > instead we want to request data from nodes when we need it
>> >
>> > This approach is very similar to what Vladimir described. But in my doc
>> > I described the solution with mcollective, because we already have it
>> > and it works. In fact, any other transport could be used.
>> >
>> > I have several questions about the Ironic solution:
>> > 1. when (roughly speaking) will the agent in Ironic be ready?
>> > 2. do we want to build this system on mcollective and then replace it
>> > with the HTTP-based solution from Ironic?
>> > 3. how are you going to update data in nailgun if an interface or disk
>> > is added to or removed from a node?
>> >
>> > Thanks,
>> >
>> >
>> > On Wed, Mar 19, 2014 at 12:07 PM, Oleg Gelbukh <ogelbukh@xxxxxxxxxxxx>
>> > wrote:
>> >>
>> >> Vladimir,
>> >>
>> >> I might be wrong, but I heard directly from Devananda that Ironic
>> >> doesn't plan to have Discovery as a part of its scope. Things might
>> >> have changed since then (it was at the HK summit), but the general idea
>> >> was that Ironic won't serve as a hosts directory or CMDB, and nodes
>> >> will be enrolled into it from some external source.
>> >>
>> >> However, I think it is natural that discovery capabilities should be
>> >> supported by a unified agent used by Ironic and a hypothetical
>> >> Discovery service (e.g. Nailgun).
>> >>
>> >> --
>> >> Best regards,
>> >> Oleg
>> >>
>> >>
>> >> On Wed, Mar 19, 2014 at 11:49 AM, Vladimir Kozhukalov
>> >> <vkozhukalov@xxxxxxxxxxxx> wrote:
>> >>>
>> >>> My suggestion is to stop inventing a discovery mechanism on our own.
>> >>> OpenStack is supposed to use Ironic for provisioning, discovery,
>> >>> firmware updates, RAID configuration, and power management. In the
>> >>> Ironic project there is a blueprint for a utility ramdisk (it is
>> >>> similar to Fuel bootstrap):
>> >>> https://blueprints.launchpad.net/ironic/+spec/utility-ramdisk. Our
>> >>> current activities in replacing Cobbler with Ironic include
>> >>> contributing to the Python Ironic agent:
>> >>> https://wiki.openstack.org/wiki/Ironic-python-agent. We discussed the
>> >>> general architecture of this agent and agreed that it should expose a
>> >>> REST API and that every piece of its functionality needs to be
>> >>> implemented as a pluggable driver.
>> >>>
>> >>> The discovery flow could be implemented as a series of HTTP requests
>> >>> to these agents running on nodes. Discovery will be just one part of
>> >>> the full functionality of these agents. The list of IP addresses to
>> >>> send discovery requests to could be taken from the list of addresses
>> >>> leased by the DHCP server.
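
As a rough sketch of the lease-driven approach quoted above: turn the DHCP
lease list into a list of agent URLs to poll. The thread does not say which
DHCP server is used, so the sketch assumes a dnsmasq-style leases file
(expiry time, MAC, IP, hostname, client id on each line). The agent port
and the `/hardware` path are likewise hypothetical.

```python
def leased_addresses(lease_lines):
    """Return the IP address from each well-formed lease line."""
    addresses = []
    for line in lease_lines:
        fields = line.split()
        # Assumed layout: expiry, MAC, IP, hostname, client id.
        if len(fields) >= 3:
            addresses.append(fields[2])
    return addresses


def discovery_urls(addresses, port=9999, path="/hardware"):
    """Build the (hypothetical) agent URL to query for each address."""
    return ["http://%s:%d%s" % (addr, port, path) for addr in addresses]


# Example input in the assumed format; the empty line is skipped.
leases = [
    "1395312000 52:54:00:aa:bb:01 10.20.0.10 node-1 *",
    "1395312060 52:54:00:aa:bb:02 10.20.0.11 * *",
    "",
]
urls = discovery_urls(leased_addresses(leases))
```

Each resulting URL would then get one HTTP request from the discovery side,
so the nodes never need to contact nailgun on their own.
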
>> >>>
>> >>>
>> >>>
>> >>> Vladimir Kozhukalov
>> >>>
>> >>>
>> >>> On Tue, Mar 18, 2014 at 4:25 PM, Mike Scherbakov
>> >>> <mscherbakov@xxxxxxxxxxxx> wrote:
>> >>>>
>> >>>> Looks like it's still an open question.
>> >>>> Andrew, can you please respond to Eugene's question in the doc?
>> >>>>
>> >>>> My personal opinion: refactor the current approach so that it's more
>> >>>> performant (reduce the amount of data), as that will be required
>> >>>> anyway. See how it works. If we still have issues, go further,
>> >>>> perhaps with a re-implementation that polls the servers instead,
>> >>>> whether using tiny REST services on nodes, AMQP, or anything else.
>> >>>>
>> >>>> Basically, let's eliminate issues step by step.
>> >>>> Thanks,
>> >>>>
>> >>>>
>> >>>> On Mon, Mar 3, 2014 at 12:46 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> We had a discussion about the nailgun agent, which some of us want
>> >>>>> to rewrite in Python. I don't think that we need to rewrite the
>> >>>>> nailgun agent one-to-one to solve a single problem.
>> >>>>> I tried to describe the problems we have and how we can solve them.
>> >>>>> [0]
>> >>>>>
>> >>>>> Comments are welcome.
>> >>>>>
>> >>>>> [0]
>> >>>>>
>> >>>>> https://docs.google.com/a/mirantis.com/document/d/1zqV58LZBLQ-0gllb_i3MyIKIMj-Qx8ELJohjcWs459s/edit#
>> >>>>>
>> >>>>> --
>> >>>>> Mailing list: https://launchpad.net/~fuel-dev
>> >>>>> Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
>> >>>>> Unsubscribe : https://launchpad.net/~fuel-dev
>> >>>>> More help   : https://help.launchpad.net/ListHelp
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Mike Scherbakov
>> >>>> #mihgen
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Best regards,
>> Nick Markov
>>
>
>
>
