
savanna-all team mailing list archive

Re: Cluster scaling discussion

 

When using Savanna to provision dev/test clusters, customers need a way to
modify an existing cluster by adding or removing services running in that
cluster. These should not be limited to data or compute nodes. The
framework needs to be generic so that the user can pick an existing Hadoop
cluster and add a new node group with the required services.

Himanshu
On Jun 28, 2013 5:37 AM, "Nadezhda Privalova" <nprivalova@xxxxxxxxxxxx>
wrote:

> Hi all,
>
> Here are some of our thoughts and ideas about the cluster scaling feature
> in Savanna. All of the following applies only to datanode and tasktracker
> processes. We are not planning to support any other processes. If you
> disagree, please share your thoughts.
>
>
> We are currently considering two scenarios:
>
> I. User scales an existing cluster's node groups. This is a rather simple
> feature because we can just copy all configs from the existing instances of
> the node group. From Hadoop's perspective there is only one additional step
> besides "start tasktracker/datanode": the cluster needs to be rebalanced
> before the tasktracker is started.
>     In this scenario it is obviously necessary to be able to delete
> instances from a node group, and here I'm concerned. Datanode
> decommissioning takes a long time. So if a user wants to delete one
> instance from the "datanode" node group (this node group has only the
> 'datanode' process) and add one instance to the "tasktracker" node group,
> the required time may be unacceptable. So I suppose that datanode
> decommission should be a separate process, not part of cluster scaling.
> What do you think?
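
A minimal sketch of the separation suggested above, assuming a resize
request is a list of name/count pairs in which a negative count removes
instances. The function name split_resize and the processes_by_group
mapping are hypothetical and not part of Savanna.

```python
def split_resize(resize_node_groups, processes_by_group):
    """Split resize items into immediate actions and deferred decommissions.

    resize_node_groups: list of {"name": str, "count": int} items, where the
        count is assumed to be a signed delta (negative means removal).
    processes_by_group: hypothetical mapping of node group name to the list
        of Hadoop processes running in that group.
    """
    immediate, deferred = [], []
    for item in resize_node_groups:
        name, count = item["name"], item["count"]
        # Removing a datanode requires a slow decommission, so it is
        # queued separately instead of blocking the rest of the scaling.
        removes_datanode = count < 0 and "datanode" in processes_by_group[name]
        (deferred if removes_datanode else immediate).append((name, count))
    return immediate, deferred
```

With this split, adding a tasktracker instance does not have to wait for a
datanode decommission requested in the same call.
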
>
> II. User adds a new node group to the cluster. Here Savanna repeats the
> flow from cluster creation. We cannot copy the configs in this case and
> need to create all *.xml config files for Hadoop.
>
> As for REST, we propose the following request:
>
>
> {
>     "resize_node_groups": [
>         {
>             "name": "storage",
>             "count": 10
>         },
>         {
>             "name": "worker",
>             "count": -1        <-----deletion
>         }
>     ],
>     "add_node_groups": [
>         {
>             "node_group_tmpl_id": "520ee6a2-c8f5-4c9b-86c4-fd273860ff8e",
>             "name": "new-node-group-name",
>             "node_processes": ["datanode", "jobtracker"],
>             "flavor_id": 42
>         }
>     ]
> }
>
> So we propose to put all of this into a single PUT call on the cluster.
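
For illustration, here is a rough sketch of how a client might build such
a request body and how the resize part could be applied to current
instance counts. It assumes "count" is a signed delta, which the "-1"
deletion example suggests but the proposal does not state. The helper
names build_scaling_request and apply_resize are hypothetical.

```python
import json


def build_scaling_request(resize=None, add=None):
    """Build the proposed PUT body from a {name: delta} dict and new groups."""
    body = {}
    if resize:
        body["resize_node_groups"] = [
            {"name": name, "count": delta} for name, delta in resize.items()
        ]
    if add:
        body["add_node_groups"] = list(add)
    return body


def apply_resize(current_counts, body):
    """Return new per-group instance counts (assumes count is a delta)."""
    new_counts = dict(current_counts)
    for item in body.get("resize_node_groups", []):
        name = item["name"]
        if name not in new_counts:
            raise KeyError("unknown node group: %s" % name)
        updated = new_counts[name] + item["count"]
        if updated < 0:
            raise ValueError("cannot remove more instances than exist: %s" % name)
        new_counts[name] = updated
    return new_counts


body = build_scaling_request(resize={"storage": 10, "worker": -1})
payload = json.dumps(body)  # what would be sent in the PUT call
```

If "count" is instead meant as the target size of the node group,
apply_resize would assign the value rather than add it.
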
>
> Please share your thoughts about it, because we're planning to implement scaling in the current phase.
>
> Best regards,
> Nadya
>
>
>
>
> --
> Mailing list: https://launchpad.net/~savanna-all
> Post to     : savanna-all@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~savanna-all
> More help   : https://help.launchpad.net/ListHelp
>
>
