
savanna-all team mailing list archive

Re: Some questions regarding configuration

 

*We have the following workflow. The user creates a cluster specification
containing the Hadoop and VM specifications. In the first step, Savanna calls
the plugin to verify that the specification is correct. In the next step, it
passes the user input to the plugin, and the plugin returns a description of
the VMs for the cluster. Based on this description, Savanna creates the VMs
and passes their descriptions to the plugin. The difference is that the
plugin can add some parameters to the VM configuration. In the VM
descriptions for cluster creation, we will add parameters for the node type.*
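
The workflow above can be sketched roughly as follows. This is only an illustration under assumptions: `ExamplePlugin`, `get_vm_specs()` and `launch_cluster()` are hypothetical names, the dictionary keys are guesses based on the attributes discussed in this thread, and VM creation is faked rather than delegated to a real provisioning component.

```python
class ExamplePlugin:
    """Stand-in for a Hadoop plugin; not the real Savanna plugin interface."""

    def validate_cluster(self, cluster_description):
        # Step 1: verify the user's specification.
        if not cluster_description.get("vm_groups"):
            raise ValueError("cluster needs at least one vm_group")

    def get_vm_specs(self, cluster_description):
        # Step 2 (hypothetical method name): describe the VMs to create.
        # Each description carries its node type, and the plugin may add
        # parameters of its own.
        specs = []
        for group in cluster_description["vm_groups"]:
            for _ in range(group["count"]):
                specs.append({
                    "node_type": group["node_type"],
                    "flavor": group["flavor"],
                    "params": {"added_by_plugin": True},
                })
        return specs

    def configure_cluster(self, cluster_description, vms):
        # Step 4: the plugin receives the created VMs and can tell which
        # node type each one was provisioned for.
        return {vm["name"]: vm["node_type"] for vm in vms}


def launch_cluster(plugin, cluster_description):
    plugin.validate_cluster(cluster_description)
    specs = plugin.get_vm_specs(cluster_description)
    # Step 3: Savanna would create real VMs here; this sketch only names them.
    vms = [
        dict(spec, name=f"{cluster_description['cluster_name']}-{i}")
        for i, spec in enumerate(specs)
    ]
    return plugin.configure_cluster(cluster_description, vms)
```

Because every VM description keeps its `node_type`, the plugin can pick the right VM for each role without any extra lookup.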


On Wed, May 8, 2013 at 7:45 PM, Jon Maron <jmaron@xxxxxxxxxxxxxxx> wrote:

> Great!  We look forward to the documentation updates.  It would probably
> be beneficial if you could update the mockups to reflect these new concepts
> (e.g. I think a user should have a view that allows the mapping of node
> group to flavor, template, and number of instances)
>
> We should also be able to provide our take on the advanced configuration
> option in the next day or so.
>
> I have another question regarding cluster configuration:
>
> When a user selects the node group, flavor, template and number of
> instances and clicks on "Launch Cluster", the controller first invokes the
> VM provisioning element to create the VM instances.  The information
> returned is provided in the vm_specs object.  However, there doesn't appear
> to be any indication of the node group association in the vm_specs.  In
> other words, there is nothing in the vm_specs or its associated attributes
> that indicates the node group (or node type) so that the cluster provider
> can subsequently select the appropriate VM to provision with the correct
> node type.  Am I missing some correlation?  It seems to me that the server
> instances should possibly specify the node_group they were provisioned for?
>
> -- Jon
>
> On May 8, 2013, at 10:54 AM, Alexander Kuznetsov <akuznetsov@xxxxxxxxxxxx>
> wrote:
>
> Jon,
>
> You should familiarize yourself with the concept of templates (
> https://blueprints.launchpad.net/savanna/+spec/hierarchical-templates).
> They simplify the process of cluster configuration. In the general case,
> the user chooses templates and starts a Hadoop cluster from them with one
> click. All configuration parameters for the cluster will be contained in
> them. Your concerns about the rapidly growing number of node types are
> correct, and your interface describing the interaction between the plugin
> and Savanna is great.
>
>
> I can suggest the following structure for node templates:
>
> {
>    flavor: m1.tiny,
>    node_group: slaves,
>    components: [
>         {
>             name: "data node",
>             config: configuration_parameter_map,
>         },
>         {
>             name: "task tracker",
>             config: configuration_parameter_map,
>         }
>    ]
> }
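
For illustration, the suggested structure could be written as a plain Python dictionary and checked before use. This is a sketch only: the empty dictionaries stand in for `configuration_parameter_map`, and `REQUIRED_KEYS` and `validate_node_template()` are hypothetical helpers, not part of Savanna.

```python
# Node template following the structure suggested above.
node_template = {
    "flavor": "m1.tiny",
    "node_group": "slaves",
    "components": [
        {"name": "data node", "config": {}},     # {} is a placeholder config map
        {"name": "task tracker", "config": {}},  # {} is a placeholder config map
    ],
}

REQUIRED_KEYS = {"flavor", "node_group", "components"}  # assumed to be mandatory


def validate_node_template(template):
    """Check the template's shape and return its component names."""
    missing = REQUIRED_KEYS - set(template)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for component in template["components"]:
        if "name" not in component:
            raise ValueError("every component needs a name")
    return [component["name"] for component in template["components"]]
```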
>
>
> The methods get_supported_node_groups(), get_supported_components
> (node_group), and get_configs(component) should be implemented on the
> Hadoop provider side. The method create_node_group(name, components[]) is
> not needed: it will be covered by changes to the template mechanism and
> will live on the Savanna side.
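
A minimal sketch of what the provider-side interface might look like, assuming the three methods named above. The class names, return types, and the sample groups, components and config keys are all assumptions made for illustration; the thread does not specify them.

```python
from abc import ABC, abstractmethod


class HadoopProvider(ABC):
    """Hypothetical base class holding the three provider-side methods."""

    @abstractmethod
    def get_supported_node_groups(self):
        """Return the names of the node groups the provider supports."""

    @abstractmethod
    def get_supported_components(self, node_group):
        """Return the components that can run in the given node group."""

    @abstractmethod
    def get_configs(self, component):
        """Return the config items available for the given component."""


class ToyProvider(HadoopProvider):
    # Illustrative data only; a real provider would supply its own values.
    _COMPONENTS = {
        "master": ["job tracker", "name node"],
        "slaves": ["data node", "task tracker"],
    }
    _CONFIGS = {"data node": ["example.data.dir"]}

    def get_supported_node_groups(self):
        return list(self._COMPONENTS)

    def get_supported_components(self, node_group):
        return list(self._COMPONENTS[node_group])

    def get_configs(self, component):
        return list(self._CONFIGS.get(component, []))
```

There is no `create_node_group()` here, in line with the note that grouping components is left to the template mechanism on the Savanna side.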
>
> We will update the documentation on Monday.
>
>
>
> Alexander Kuznetsov.
>
>
> On Wed, May 8, 2013 at 1:59 AM, Jon Maron <jmaron@xxxxxxxxxxxxxxx> wrote:
>
>> Hi,
>>
>>  As we understand it, the current configuration approach has the
>> following key APIs:
>>
>>  1)  Plugins return the set of node types via the
>> get_supported_node_types() call.  The return value is a list of strings
>> that describe the current set of node types supported by this plugin.
>>  2)  Plugins return the set of configuration items they support via the
>> get_configs() API call.  The configuration items currently appear to be
>> mapped to components(?).
>>  3)  The cluster descriptions used during validation (validate_cluster())
>> and cluster launch (configure_cluster(), start_cluster()) take the
>> following arguments:
>> - cluster_description:
>>    - cluster_name
>>    - cluster_configs
>>    - hadoop_version
>>    - vm_groups
>>
>>  where vm_groups is a list of vm_group instances.  vm_group has the
>> following attributes:
>>    - node_type
>>    - flavor
>>    - configs
>>    - count
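
As a rough picture of the arguments listed above, the two objects could be modelled as Python dataclasses. The field types and the `total_instances()` helper are assumptions; only the attribute names come from this message.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VMGroup:
    node_type: str   # e.g. "jt+nn"
    flavor: str
    count: int
    configs: Dict[str, str] = field(default_factory=dict)  # assumed key/value map


@dataclass
class ClusterDescription:
    cluster_name: str
    hadoop_version: str
    cluster_configs: Dict[str, str] = field(default_factory=dict)  # assumed type
    vm_groups: List[VMGroup] = field(default_factory=list)

    def total_instances(self) -> int:
        # Hypothetical helper: number of VMs the description asks for.
        return sum(group.count for group in self.vm_groups)
```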
>>
>>  So, the sequence of configuration-associated steps (and the issues/questions
>> we see with each) during cluster provisioning is (assuming no pre-existing
>> node template):
>>
>> - user selects a cluster name
>>  - user selects a plugin
>> - controller queries for supported hadoop versions from selected plugin
>> (get_versions())
>> - user selects a cluster template
>>    - this appears to be a new concept - I can't find mentions of it
>> elsewhere?  Is that where the cluster level config items are to be
>> edited/created (alluded to in "cluster_configs" above)?
>> - controller calls get_supported_node_types()
>>    - There doesn't seem to be a provision for creating new node types.
>>  Rather, the plugin returns the set of supported node types.  How do we
>> account for new services or tailored combinations of services?
>>  - which node types are displayed to the user?  Given the large number
>> of services available in a hadoop deployment, the list of services and the
>> possible combinations can be rather large
>> - For a selected node type, the set of applicable config items is
>> displayed in a "create node template" dialog
>>  - *how are the proper config items selected?*  The configs appear to
>> have a "component" attribute for each config item, but components are only
>> currently encoded into the node type description. The current node type is
>> just a string (e.g "jt+nn").  Discerning the set of components by parsing
>> the description seems error-prone and possibly confusing.  We believe there
>> is a need for an actual structure:
>>
>>  node_type:
>>    description
>>    components[]
>>    role
>>
>>  This would allow for:
>>  - descriptions that are more apt for the given plugin (e.g. "master",
>> "master with monitoring")
>>   - an ability to map the config items to the set of components
>> available from a given node type
>>   - an ability to discern the role of a given node (currently the "mgmt"
>> vs "slave" vs "master" decision seems to be based on the controller's
>> ability to parse up the set of components on a node?)
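
To make the proposed structure concrete, here is a small sketch. The `NodeType` class mirrors the three attributes above; `configs_for()` and the shape of a config item (a dictionary with a "component" key) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeType:
    description: str       # human-readable, e.g. "master with monitoring"
    components: List[str]  # explicit list instead of parsing "jt+nn"
    role: str              # e.g. "mgmt", "master" or "slave"


def configs_for(node_type, config_items):
    """Keep only config items whose component runs on this node type."""
    return [
        item for item in config_items
        if item["component"] in node_type.components
    ]
```

With an explicit `components` list, selecting the config items to show in the "create node template" dialog becomes a simple filter, and the role no longer has to be inferred from a description string.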
>>
>> - There is an implied ability to configure host/node level config
>> overrides based on the "configs" attribute of a vm_group.  At what point
>> are those entered?
>>
>> General Concerns:
>>
>> - node types - given the large number of services/components, creating a
>> set of node types that handles all possible valid combinations seems
>> daunting.  For example, assuming we have 5 components that can be validly
>> deployed to a single node, we would have to define up to 2^5 - 1 = 31 node
>> types to account for all possible deployment combinations, wouldn't we?
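
To get a feel for the growth: assuming the order of components on a node does not matter and every non-empty subset of components is a valid node type (both are assumptions), n components yield 2^n - 1 candidate node types. The component names below are placeholders.

```python
from itertools import combinations


def count_node_types(components):
    """Count the non-empty subsets of components, one per candidate node type."""
    n = len(components)
    return sum(
        1
        for size in range(1, n + 1)
        for _ in combinations(components, size)
    )


components = ["name node", "job tracker", "data node", "task tracker", "monitoring"]
print(count_node_types(components))  # 31, i.e. 2**5 - 1
```

Each additional component roughly doubles the count, which is why enumerating node types by hand does not scale.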
>>
>> Perhaps it would be more appropriate to:
>>
>>  1)  Define a set of agreed upon node groups (e.g. "master", "slave",
>> "monitored slave", etc) across all plugins (get_supported_node_groups())
>>  2)  Allow plugins to return a set of components per role (e.g. for
>> "master" return "job tracker", "name node" etc) (get_supported_components
>> (node_group))
>>  3)  Allow users to designate the set of components they want to
>> associate to each node group (create_node_group(name, components[]))
>>  4)  Query the plugin for the set of config items they make available
>> per component. (get_configs(component))
>>
>>   We look forward to your responses.
>>
>> -- Jon