openstack team mailing list archive: Message #16026
Re: [openstack-dev] Discussion about where to put database for bare-metal provisioning (review 10726)
To elaborate, something like the below. I'm not absolutely sure you need to be able to set service_name and host, but this gives you the option to do so if needed.
diff --git a/nova/manager.py b/nova/manager.py
index c6711aa..c0f4669 100644
--- a/nova/manager.py
+++ b/nova/manager.py
@@ -217,6 +217,8 @@ class SchedulerDependentManager(Manager):

     def update_service_capabilities(self, capabilities):
         """Remember these capabilities to send on next periodic update."""
+        if not isinstance(capabilities, list):
+            capabilities = [capabilities]
         self.last_capabilities = capabilities

     @periodic_task
@@ -224,5 +226,8 @@ class SchedulerDependentManager(Manager):
         """Pass data back to the scheduler at a periodic interval."""
         if self.last_capabilities:
             LOG.debug(_('Notifying Schedulers of capabilities ...'))
-            self.scheduler_rpcapi.update_service_capabilities(context,
-                    self.service_name, self.host, self.last_capabilities)
+            for capability_item in self.last_capabilities:
+                name = capability_item.get('service_name', self.service_name)
+                host = capability_item.get('host', self.host)
+                self.scheduler_rpcapi.update_service_capabilities(context,
+                        name, host, capability_item)
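To illustrate the patched behavior, here is a runnable sketch of the manager above with a stub RPC API that just records calls. The stub class, constructor arguments, and node data are illustrative, not from nova; only the two patched methods mirror the diff:

```python
# Sketch of the patched manager: update_service_capabilities accepts a
# single dict or a list, and the periodic publish sends each item
# separately, honoring per-item 'service_name'/'host' overrides.

class StubSchedulerRPCAPI:
    """Records calls instead of sending RPC messages (test double)."""
    def __init__(self):
        self.calls = []

    def update_service_capabilities(self, context, service_name, host, caps):
        self.calls.append((service_name, host, caps))

class SchedulerDependentManager:
    def __init__(self, service_name, host, rpcapi):
        self.service_name = service_name
        self.host = host
        self.scheduler_rpcapi = rpcapi
        self.last_capabilities = None

    def update_service_capabilities(self, capabilities):
        # The patch: normalize a single dict into a one-element list.
        if not isinstance(capabilities, list):
            capabilities = [capabilities]
        self.last_capabilities = capabilities

    def publish_service_capabilities(self, context):
        # The patch: one RPC call per capability item.
        if self.last_capabilities:
            for item in self.last_capabilities:
                name = item.get('service_name', self.service_name)
                host = item.get('host', self.host)
                self.scheduler_rpcapi.update_service_capabilities(
                    context, name, host, item)

rpcapi = StubSchedulerRPCAPI()
mgr = SchedulerDependentManager('compute', 'proxy-host', rpcapi)
mgr.update_service_capabilities([
    {'host': 'bm-node-1', 'memory_mb': 4096},
    {'host': 'bm-node-2', 'memory_mb': 8192},
])
mgr.publish_service_capabilities(context=None)
# Each bare-metal node is reported under its own host name.
```

Because each item carries its own 'host', a single proxy nova-compute can report many bare-metal nodes without them overwriting each other.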
On Aug 21, 2012, at 1:28 PM, David Kang <dkang@xxxxxxx> wrote:
>
> Hi Vish,
>
> We are trying to change our code according to your comment.
> I want to ask a question.
>
>>>> a) modify driver.get_host_stats to be able to return a list of host
>>>> stats instead of just one. Report the whole list back to the
>>>> scheduler. We could modify the receiving end to accept a list as
>>>> well
>>>> or just make multiple calls to
>>>> self.update_service_capabilities(capabilities)
>
> Modifying driver.get_host_stats to return a list of host stats is easy.
> Making multiple calls to self.update_service_capabilities(capabilities) doesn't seem to work,
> because 'capabilities' is overwritten each time.
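David's observation can be reproduced with a toy version of the unpatched method; the class name is made up, but the overwrite behavior matches the pre-patch code shown above:

```python
# Toy version of the unpatched update_service_capabilities: each call
# replaces last_capabilities wholesale, so only the final node's stats
# survive until the next periodic publish.

class UnpatchedManager:
    def __init__(self):
        self.last_capabilities = None

    def update_service_capabilities(self, capabilities):
        self.last_capabilities = capabilities  # overwrite, not append

mgr = UnpatchedManager()
for node_caps in ({'host': 'bm-node-1'}, {'host': 'bm-node-2'}):
    mgr.update_service_capabilities(node_caps)
# Only bm-node-2's capabilities remain; bm-node-1's were lost.
```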
>
> Modifying the receiving end to accept a list seems easy.
> However, since 'capabilities' is assumed to be a dictionary by all the other scheduler routines,
> it looks like we would have to change all of them to handle 'capabilities' as a list of dictionaries.
>
> If my understanding is correct, this would affect many parts of the scheduler.
> Is that what you recommended?
>
> Thanks,
> David
>
>
> ----- Original Message -----
>> This was an immediate goal: the bare-metal nova-compute node could
>> keep an internal database, but report capabilities through nova in the
>> common way with the changes below. Then the scheduler wouldn't need
>> access to the bare-metal database at all.
>>
>> On Aug 15, 2012, at 4:23 PM, David Kang <dkang@xxxxxxx> wrote:
>>
>>>
>>> Hi Vish,
>>>
>>> Is this discussion for long-term goal or for this Folsom release?
>>>
>>> We still believe that the bare-metal database is needed,
>>> because there is no automated way for bare-metal nodes to report
>>> their capabilities to their bare-metal nova-compute node.
>>>
>>> Thanks,
>>> David
>>>
>>>>
>>>> I am interested in finding a solution that enables bare-metal and
>>>> virtualized requests to be serviced through the same scheduler
>>>> where
>>>> the compute_nodes table has a full view of schedulable resources.
>>>> This
>>>> would seem to simplify the end-to-end flow while opening up some
>>>> additional use cases (e.g. dynamic allocation of a node from
>>>> bare-metal to hypervisor and back).
>>>>
>>>> One approach would be to have a proxy running a single nova-compute
>>>> daemon fronting the bare-metal nodes. That nova-compute daemon would
>>>> report up many HostState objects (one per bare-metal node) to become
>>>> entries in the compute_nodes table, accessible through the
>>>> scheduler's HostManager object.
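The proxy idea above could look roughly like this: one driver fronting several physical nodes and returning a list of per-node host stats. The class name, node inventory, and dict keys are illustrative (get_host_stats returning a list is the proposed change, not current behavior):

```python
# Sketch of a proxy bare-metal driver that fronts several physical nodes
# and reports one host-stats dict per node. The inventory is hard-coded
# for illustration; a real driver would query its own node database.

class BareMetalProxyDriver:
    def __init__(self, nodes):
        self.nodes = nodes  # list of per-node dicts

    def get_host_stats(self, refresh=False):
        # Proposed change: return a list of stats, one entry per node.
        return [
            {
                'host': node['address'],
                'vcpus': node['cpus'],
                'memory_mb': node['memory_mb'],
                'local_gb': node['local_gb'],
                'hypervisor_type': None,  # marks a bare-metal node
            }
            for node in self.nodes
        ]

driver = BareMetalProxyDriver([
    {'address': '10.0.0.11', 'cpus': 8, 'memory_mb': 16384, 'local_gb': 250},
    {'address': '10.0.0.12', 'cpus': 16, 'memory_mb': 32768, 'local_gb': 500},
])
stats = driver.get_host_stats()
# Two entries, each destined to become its own compute_nodes row.
```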
>>>>
>>>>
>>>>
>>>>
>>>> The HostState object would set cpu_info, vcpus, memory_mb and
>>>> local_gb
>>>> values to be used for scheduling with the hypervisor_host field
>>>> holding the bare-metal machine address (e.g. for IPMI based
>>>> commands)
>>>> and hypervisor_type = NONE. The bare-metal Flavors are created with
>>>> an
>>>> extra_spec of hypervisor_type = NONE and the corresponding
>>>> compute_capabilities_filter would reduce the available hosts to
>>>> those
>>>> bare_metal nodes. The scheduler would need to understand that
>>>> hypervisor_type = NONE means you need an exact fit (or best-fit)
>>>> host
>>>> vs weighting them (perhaps through the multi-scheduler). The
>>>> scheduler
>>>> would cast out the message to the <topic>.<service-hostname> (code
>>>> today uses the HostState hostname), with the compute driver having
>>>> to
>>>> understand if it must be serviced elsewhere (but does not break any
>>>> existing implementations since it is 1 to 1).
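The exact-fit scheduling described above could be sketched as a simple host-passes check. This is a hypothetical helper, not the actual compute_capabilities_filter; it only illustrates "exact fit for bare metal, best fit for hypervisors":

```python
# Hypothetical filter check: a bare-metal host (hypervisor_type None)
# passes only if its resources match the requested flavor exactly;
# ordinary hypervisor hosts pass on a >= fit as usual.

def host_passes(host_state, instance_type):
    bare_metal = host_state.get('hypervisor_type') is None
    for key in ('vcpus', 'memory_mb', 'local_gb'):
        if bare_metal:
            if host_state[key] != instance_type[key]:
                return False  # bare metal must fit exactly
        elif host_state[key] < instance_type[key]:
            return False  # hypervisors just need enough room
    return True

flavor = {'vcpus': 8, 'memory_mb': 16384, 'local_gb': 250}
bm_exact = {'hypervisor_type': None, 'vcpus': 8, 'memory_mb': 16384, 'local_gb': 250}
bm_bigger = {'hypervisor_type': None, 'vcpus': 16, 'memory_mb': 32768, 'local_gb': 500}
kvm_host = {'hypervisor_type': 'QEMU', 'vcpus': 16, 'memory_mb': 32768, 'local_gb': 500}
```

An oversized bare-metal node is rejected here to avoid wasting it on a small flavor; a weigher could instead prefer the smallest sufficient node, as the multi-scheduler remark suggests.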
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Does this solution seem workable? Anything I missed?
>>>>
>>>> The bare metal driver already is proxying for the other nodes so it
>>>> sounds like we need a couple of things to make this happen:
>>>>
>>>>
>>>> a) modify driver.get_host_stats to be able to return a list of host
>>>> stats instead of just one. Report the whole list back to the
>>>> scheduler. We could modify the receiving end to accept a list as
>>>> well
>>>> or just make multiple calls to
>>>> self.update_service_capabilities(capabilities)
>>>>
>>>>
>>>> b) make a few minor changes to the scheduler to make sure filtering
>>>> still works. Note the changes here may be very helpful:
>>>>
>>>>
>>>> https://review.openstack.org/10327
>>>>
>>>>
>>>> c) we have to make sure that instances launched on those nodes take
>>>> up the entire host state somehow. We could probably do this by making
>>>> sure that the instance_type ram, cpu, and disk values match what the
>>>> node has, but we may want a new boolean field "used" if those aren't
>>>> sufficient.
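Point (c) could be handled by zeroing the node's schedulable resources when an instance lands on it; a sketch, where the 'used' flag is the hypothetical boolean field mentioned above:

```python
# Sketch of consuming an entire bare-metal host on instance launch.
# 'used' is the hypothetical boolean field proposed in point (c).

def consume_bare_metal_host(host_state):
    host_state['used'] = True
    # Zero the schedulable resources so no further instance can fit.
    host_state['vcpus'] = 0
    host_state['memory_mb'] = 0
    host_state['local_gb'] = 0
    return host_state

node = {'host': '10.0.0.11', 'vcpus': 8, 'memory_mb': 16384, 'local_gb': 250}
consume_bare_metal_host(node)
# The node now reports no free resources and is flagged as used.
```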
>>>>
>>>>
>>>> This approach seems pretty good. We could potentially get rid of
>>>> the shared bare_metal_node table. I guess the only other concern is
>>>> how you populate the capabilities that the bare metal nodes are
>>>> reporting. I guess an api extension that rpcs to a baremetal node
>>>> to add the node would work. Maybe someday this could be
>>>> autogenerated by the bare metal host looking in its arp table for
>>>> dhcp requests! :)
>>>>
>>>>
>>>> Vish
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev@xxxxxxxxxxxxxxxxxxx
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev