Re: [openstack-dev] Discussion about where to put database for bare-metal provisioning (review 10726)
Hi Vish,
Is this discussion about a long-term goal, or about the Folsom release?
We still believe that the bare-metal database is needed,
because there is no automated way for bare-metal nodes to report their
capabilities to their bare-metal nova-compute node.
Thanks,
David
>
> I am interested in finding a solution that enables bare-metal and
> virtualized requests to be serviced through the same scheduler where
> the compute_nodes table has a full view of schedulable resources. This
> would seem to simplify the end-to-end flow while opening up some
> additional use cases (e.g. dynamic allocation of a node from
> bare-metal to hypervisor and back).
>
> One approach would be to have a proxy running a single nova-compute
> daemon fronting the bare-metal nodes. That nova-compute daemon would
> report up many HostState objects (one per bare-metal node) that become
> entries in the compute_nodes table and are accessible through the
> scheduler's HostManager object.
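> Roughly what that reporting could look like (just a sketch; the field
> names and the node-listing helper are made up, not actual Folsom code):
>
>     # In the bare-metal driver: one capability dict per managed node.
>     # Each dict would become a HostState entry / compute_nodes row.
>     def get_host_stats(self, refresh=False):
>         stats = []
>         for node in self._list_baremetal_nodes():  # placeholder helper
>             stats.append({
>                 'hypervisor_hostname': node['address'],  # e.g. IPMI address
>                 'hypervisor_type': 'NONE',
>                 'vcpus': node['cpus'],
>                 'memory_mb': node['memory_mb'],
>                 'local_gb': node['local_gb'],
>                 'cpu_info': node['cpu_info'],
>             })
>         return stats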
>
>
> The HostState object would set the cpu_info, vcpus, memory_mb and
> local_gb values to be used for scheduling, with the hypervisor_host
> field holding the bare-metal machine address (e.g. for IPMI-based
> commands) and hypervisor_type = NONE. The bare-metal flavors are
> created with an extra_spec of hypervisor_type = NONE, and the
> corresponding compute_capabilities_filter would reduce the available
> hosts to those bare-metal nodes. The scheduler would need to
> understand that hypervisor_type = NONE means you need an exact-fit
> (or best-fit) host rather than weighting them (perhaps through the
> multi-scheduler). The scheduler would cast the message out to
> <topic>.<service-hostname> (the code today uses the HostState
> hostname), with the compute driver having to understand whether it
> must be serviced elsewhere (this does not break any existing
> implementations, since today it is 1 to 1).
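> The filtering side would then just pair the flavor's extra_specs with
> what each node reports; a minimal sketch of the idea (not the exact
> compute_capabilities_filter code, names are illustrative):
>
>     # Bare-metal flavor carries the matching extra_spec:
>     flavor_extra_specs = {'hypervisor_type': 'NONE'}
>
>     def host_passes(host_capabilities, extra_specs):
>         # A host survives only if every extra_spec matches its reported
>         # capabilities, so virtualized hosts drop out for bare-metal
>         # flavors and bare-metal nodes drop out for normal flavors.
>         return all(host_capabilities.get(k) == v
>                    for k, v in extra_specs.items())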
>
>
> Does this solution seem workable? Anything I missed?
>
> The bare-metal driver is already proxying for the other nodes, so it
> sounds like we need a few things to make this happen:
>
>
> a) modify driver.get_host_stats to be able to return a list of host
> stats instead of just one. Report the whole list back to the
> scheduler. We could modify the receiving end to accept a list as well
> or just make multiple calls to
> self.update_service_capabilities(capabilities)
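>
> Something along these lines on the manager side, assuming we keep the
> receiving end mostly as-is (sketch only):
>
>     # In the compute manager's periodic capability report:
>     capabilities = self.driver.get_host_stats(refresh=True)
>     if isinstance(capabilities, list):
>         # one call per bare-metal node
>         for node_caps in capabilities:
>             self.update_service_capabilities(node_caps)
>     else:
>         self.update_service_capabilities(capabilities)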
>
>
> b) make a few minor changes to the scheduler to make sure filtering
> still works. Note that the changes here may be very helpful:
>
>
> https://review.openstack.org/10327
>
>
> c) we have to make sure that instances launched on those nodes take up
> the entire host state somehow. We could probably do this by making
> sure that the instance_type ram, vcpus, local_gb, etc. match what the
> node has, but we may want a new boolean field "used" if those aren't
> sufficient.
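>
> For example, an exact-fit check in the filter/host-state layer could
> look roughly like this (field names are illustrative, not the real
> HostState attributes):
>
>     def baremetal_host_passes(host_state, instance_type):
>         # Normal hypervisors keep the usual free-resource behaviour.
>         if host_state.capabilities.get('hypervisor_type') != 'NONE':
>             return True
>         # A bare-metal node is all-or-nothing: refuse it once anything
>         # is running there, and otherwise require an exact match.
>         if host_state.num_instances > 0:
>             return False
>         return (instance_type['memory_mb'] == host_state.free_ram_mb and
>                 instance_type['vcpus'] == host_state.vcpus_total and
>                 instance_type['root_gb'] == host_state.free_disk_gb)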
>
>
> This approach seems pretty good. We could potentially get rid of the
> shared bare_metal_node table. I guess the only other concern is how
> you populate the capabilities that the bare-metal nodes are reporting.
> I guess an API extension that RPCs to the bare-metal nova-compute to
> add the node would work. Maybe someday this could be autogenerated by
> the bare-metal host looking in its ARP table for DHCP requests! :)
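>
> A very rough sketch of what such an extension could do (assuming the
> rpc.cast / rpc.queue_get_for helpers; the 'add_baremetal_node' method
> and the payload fields are made up):
>
>     from nova import rpc
>
>     def create(self, req, body):
>         """POST a node's details and hand it to its proxy nova-compute."""
>         context = req.environ['nova.context']
>         node = body['node']  # address, cpus, memory_mb, local_gb, ...
>         topic = rpc.queue_get_for(context, 'compute', node['service_host'])
>         rpc.cast(context, topic,
>                  {'method': 'add_baremetal_node', 'args': {'node': node}})
>         return {'node': node}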
>
>
> Vish
>