savanna-all team mailing list archive

Re: Cluster scaling discussion

 

Sounds good. So, for cluster scaling,
assuming the Hadoop cluster already exists:
Adding a new compute node should be a separate operation by itself. The user
says "add compute node" and picks the flavor. The service configuration of
the new compute node should mirror that of the other compute nodes in the
cluster. There should be no restrictions on where the compute node can be
placed, but the user should be allowed to pick the physical hypervisor node
on which the new compute node VM is spun up. Bear in mind that this process
will vary a bit with YARN, since with YARN you will be adding new node
managers (the equivalent of task trackers).
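
A rough sketch of the add-compute-node operation described above, in Python. The dict layout and the names `new_compute_node`, `service_config`, `flavor_id` and `host` are assumptions made up for illustration, not Savanna's actual data model.

```python
import copy


def new_compute_node(cluster_nodes, flavor_id, host=None):
    """Build a spec for one extra compute node (hypothetical structure).

    The service config is copied from an existing node that runs a
    tasktracker; only the flavor and, optionally, the hypervisor host
    are chosen by the user.
    """
    templates = [n for n in cluster_nodes if "tasktracker" in n["processes"]]
    if not templates:
        raise ValueError("cluster has no compute node to copy config from")
    template = templates[0]
    return {
        "processes": ["tasktracker"],
        # Deep copy so later edits to the new node do not touch the template.
        "service_config": copy.deepcopy(template["service_config"]),
        "flavor_id": flavor_id,
        # None means "no placement restriction"; otherwise the user's pick.
        "host": host,
    }
```

Leaving `host` unset corresponds to the "no restrictions" default; passing it corresponds to the user picking a hypervisor node.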

Adding a new datanode should be a separate operation as well. It will require
cluster rebalancing. The user should be allowed to specify whether they also
want a task tracker (yes by default) on the datanode VM. Two datanode VMs
that belong to the same cluster should not be provisioned on the same
hypervisor node. The user should be allowed to pick the flavor and storage
type for the node, but the rest of the config can mirror the other data and
compute nodes in the cluster.
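
To make the placement rule above concrete, here is a minimal Python sketch. The function names and the node dicts (`host`, `processes`) are hypothetical and only illustrate the rule "no two datanodes of one cluster on the same hypervisor"; they are not taken from Savanna.

```python
def datanode_processes(include_tasktracker=True):
    """Processes for a new datanode VM; the task tracker is on by default."""
    processes = ["datanode"]
    if include_tasktracker:
        processes.append("tasktracker")
    return processes


def pick_datanode_host(hypervisors, cluster_nodes):
    """Return the first hypervisor not already hosting a datanode of this cluster."""
    occupied = {node["host"] for node in cluster_nodes
                if "datanode" in node["processes"]}
    for host in hypervisors:
        if host not in occupied:
            return host
    raise RuntimeError("every hypervisor already hosts a datanode of this cluster")
```

Compute-only nodes do not block a hypervisor in this sketch, since the restriction is only stated for datanode VMs.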

Himanshu
On Jun 28, 2013 6:05 AM, "Nadezhda Privalova" <nprivalova@xxxxxxxxxxxx>
wrote:

> Himanshu,
>
> I agree that adding/removing services is an important feature, but for now
> I'd like to discuss adding new instances to a cluster, not cluster editing.
>
> Nadya
>
>
> On Fri, Jun 28, 2013 at 4:48 PM, Himanshu Bari <hbari@xxxxxxxxxxxxxxx> wrote:
>
>> When using Savanna to provision dev/test clusters, customers need a way to
>> modify an existing cluster by adding or removing services running in that
>> cluster. These should not be limited to data or compute nodes. The
>> framework needs to be generic, so that the user can pick an existing Hadoop
>> cluster and add a new node group with the required services.
>>
>> Himanshu
>> On Jun 28, 2013 5:37 AM, "Nadezhda Privalova" <nprivalova@xxxxxxxxxxxx>
>> wrote:
>>
>>> Hi all,
>>>
>>> Here are some of our thoughts and ideas about the cluster scaling feature
>>> in Savanna. All of the following concerns only the datanode and
>>> tasktracker processes. We are not planning to support any other
>>> processes. If you do not agree with this, please share your thoughts.
>>>
>>>
>>> We are currently considering two scenarios:
>>>
>>> I. User scales the cluster's existing node groups. This is a rather simple
>>> feature, because we can just copy all configs from the existing instances
>>> of the node group. From Hadoop's perspective there is only one additional
>>> step besides "start tasktracker/datanode": the cluster needs to be
>>> rebalanced before the tasktracker is started.
>>>     In this scenario the user obviously also needs the ability to delete
>>> instances from a node group, and this is where I have a concern. Datanode
>>> decommissioning takes a long time. So if the user wants to delete one
>>> instance from the "datanode" node group (which runs only the 'datanode'
>>> process) and add one instance to the "tasktracker" node group, the time
>>> required may be unacceptable. So I suppose that datanode decommissioning
>>> should be a separate process, not part of cluster scaling. What do you
>>> think about it?
>>>
>>> II. User adds a new node group to the cluster. Here Savanna repeats the
>>> flow from cluster creation. We cannot copy the configs here and need to
>>> create all the *.xml config files for Hadoop.
>>>
>>> As for REST, we propose a request like the following:
>>>
>>>
>>>
>>>
>>> {
>>>     "resize_node_groups": [
>>>         {
>>>             "name": "storage",
>>>             "count": 10
>>>         },
>>>         {
>>>             "name": "worker",
>>>             "count": -1        <----- deletion
>>>         }
>>>     ],
>>>     "add_node_groups": [
>>>         {
>>>             "node_group_tmpl_id": "520ee6a2-c8f5-4c9b-86c4-fd273860ff8e",
>>>             "name": "new-node-group-name",
>>>             "node_processes": ["datanode", "jobtracker"],
>>>             "flavor_id": 42
>>>         }
>>>     ]
>>> }
>>>
>>>
>>>
>>> So we propose to put all of this into a single PUT call on the cluster.
>>>
>>> Please share your thoughts on this, because we are planning to implement scaling in the current phase.
>>>
>>>
>>>
>>>
>>>
>>> Best regards,
>>> Nadya
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Mailing list: https://launchpad.net/~savanna-all
>>> Post to     : savanna-all@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~savanna-all
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>
