savanna-all team mailing list archive

Re: advanced hadoop configuration

 

Hi Ruslan,

Thanks for the questions.
As you note, in the advanced tab the user will specify the number of VMs and the associated flavors. This information, along with the VM image, is used to provision the requested VMs. After provisioning the VMs, the controller will call hadoop_provider.configure_cluster(cluster_description). The cluster description will include the Hadoop plugin-specific configuration along with the VM information. From this point, it is the responsibility of the Hadoop plugin to properly provision the Hadoop cluster given the set of provisioned VMs and the plugin-specific Hadoop configuration. This includes mapping the appropriate service components to each VM. The Hadoop plugin will return some Hadoop cluster topology information from the configure_cluster call to the controller.

The important thing to note is that the controller will have nothing to do with provisioning the Hadoop cluster itself. The Hadoop provider will be solely responsible for provisioning a Hadoop cluster on top of a given VM topology.
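
As a rough illustration of that hand-off, here is a minimal Python sketch. Only configure_cluster(cluster_description) comes from this thread; provision_vms, launch_cluster, EchoPlugin, the dict layout of the cluster description, and the shape of the returned topology are all hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    flavor: str


def provision_vms(vm_groups):
    """Hypothetical stand-in for VM provisioning.

    vm_groups is a list of (flavor, count) pairs, as entered on the advanced tab.
    """
    vms = []
    for flavor, count in vm_groups:
        for index in range(count):
            vms.append(VM(name=f"{flavor}-{index}", flavor=flavor))
    return vms


class EchoPlugin:
    """Hypothetical plugin; a real one would parse its own config format."""

    def configure_cluster(self, cluster_description):
        # The plugin alone decides which services run on which VM.
        # Here we only report the hosts back as a placeholder "topology".
        return {"hosts": [vm.name for vm in cluster_description["vms"]]}


def launch_cluster(plugin, vm_groups, advanced_configuration):
    """Controller side: provision VMs, then delegate everything Hadoop-related."""
    vms = provision_vms(vm_groups)
    cluster_description = {
        "vms": vms,
        # Passed through untouched; the controller never inspects it.
        "advanced_configuration": advanced_configuration,
    }
    return plugin.configure_cluster(cluster_description)


topology = launch_cluster(
    EchoPlugin(),
    [("m1.large", 1), ("m1.medium", 2)],
    "<plugin-specific configuration>",
)
```

The point of the sketch is the boundary: the controller knows about flavors and counts, and the plugin knows about Hadoop roles.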

Thanks,
John





On 5/10/13 4:06 PM, Ruslan Kamaldinov wrote:
Jon, John,

Could you please shed more light on how these Advanced configs could be processed by the Savanna controller?

There is an example stack configuration in the "Hadoop Blueprint Specification". It has a "cardinality" field, which in our case is the number of VMs per specific service (data-node, for example).

Let's imagine a user passes such a config to Savanna and defines two VM Groups (https://wiki.openstack.org/wiki/File:Savanna_Create_Cluster_Mockup_-_Advanced_Tab.png).

What happens then? How will the Savanna controller be able to create VMs with service-specific properties? How will it be possible to use different data node placement options?

How will Savanna be able to store cluster information in templates (see https://blueprints.launchpad.net/savanna/+spec/hierarchical-templates)?


Thanks,
Ruslan


On Fri, May 10, 2013 at 6:56 PM, Jon Maron <jmaron@xxxxxxxxxxxxxxx> wrote:

    Hi,

      We have uploaded some mockups that illustrate the "Advanced"
    configuration mechanism we've been proposing:

    http://wiki.openstack.org/wiki/File:Savanna_Create_Cluster_Mockup_-_Standard_Tab.png

    http://wiki.openstack.org/wiki/File:Savanna_Create_Cluster_Mockup_-_Advanced_Tab.png

      The advanced mechanism essentially leverages existing APIs
    (configure_cluster(), create_cluster()) but the cluster
    description parameter passed to those methods includes the user
    selected configuration file that is specific to the Hadoop
    provider rather than the standard list of configuration items.

    -- Jon

    On May 8, 2013, at 6:40 PM, John Speidel <jspeidel@xxxxxxxxxxxxxxx>
    wrote:

    Here are more details on the advanced Hadoop configuration that
    we discussed the other day.


    Savanna Advanced Hadoop Configuration

    In addition to the proposed “config items” based Hadoop
    configuration, it will be necessary to provide an advanced
    configuration mechanism in Savanna. This mechanism should allow
    for very fine-grained and extensive configuration of a Hadoop
    cluster provisioned by Savanna. It is expected that a user would
    use the simple node group based configuration for cases where
    little configuration is required and the advanced configuration
    where more control is desired. The advanced cluster configuration
    would be specific to a Hadoop plugin and its content opaque to
    the Savanna controller.

    For reference, here is a link to the Hadoop Blueprint
    Specification <https://issues.apache.org/jira/browse/AMBARI-1783>
    proposed by the Ambari project.


          Advanced Hadoop Configuration Use Cases

    · A user has an existing on-premise or non-virtualized cluster and
    wants to clone the cluster (topology and configuration, not data)
    in a virtualized environment using Savanna.

    In this case, the user will export a configuration for the
    existing cluster using tooling specific to the provider or
    management product. This configuration can then be used to create
    a new cluster using Savanna.

    · A user wants to provision a new cluster in a virtualized
    environment using Savanna and needs very fine-grained control of
    the Hadoop cluster configuration. This could include configuration
    of host-level roles, configuration of a large number of
    properties across many optional services, and potentially even
    Hadoop stack configuration related to packages and repository
    locations.


          Changes to UI Workflow

    To allow a user to specify an advanced configuration, some UI
    changes are necessary.

    The create cluster screen would need an “advanced Hadoop
    configuration” tab or button. In the initial implementation, the
    advanced configuration screen would allow a user to specify the
    location of a plugin-specific configuration file (select file
    dialog). This configuration file would contain all necessary
    Hadoop-related configuration. In future releases, we may want a
    link to provider-specific tooling, which could be used to
    create/edit provider configurations.

    The UI would still need to allow a user to specify VM details
    such as flavor, count, etc., but the user wouldn’t specify node
    groups or configuration for the VMs. Instead, host/role mapping
    would be specified in the provider-specific configuration file.


          Changes to Hadoop Plugin SPI

    The addition of “Advanced Hadoop Configuration” using plugin
    specific configuration will result in small changes to the
    proposed Hadoop Plugin SPI.

    cluster_description: The cluster description object would need to
    be updated to contain an advanced_configuration field in addition
    to cluster_configs. When a user provides an advanced
    configuration, it would be available in advanced_configuration
    and cluster_configs would be empty.
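
One way to picture the updated object is the Python sketch below. The field names cluster_configs and advanced_configuration come from the text above; the dataclass layout, the types, the validation in __post_init__, and the is_advanced helper are assumptions added for illustration and are not part of the proposed SPI.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClusterDescription:
    # Standard "config items" path; left empty when an advanced
    # configuration is supplied.
    cluster_configs: dict = field(default_factory=dict)
    # Raw plugin-specific configuration (assumed here to be the file
    # contents as a string); opaque to the controller.
    advanced_configuration: Optional[str] = None

    def __post_init__(self):
        # Assumed rule: the two configuration paths are mutually exclusive.
        if self.advanced_configuration is not None and self.cluster_configs:
            raise ValueError(
                "cluster_configs must be empty when advanced_configuration is set"
            )

    @property
    def is_advanced(self) -> bool:
        """True when the plugin should read advanced_configuration."""
        return self.advanced_configuration is not None


simple = ClusterDescription(cluster_configs={"dfs.replication": "3"})
advanced = ClusterDescription(
    advanced_configuration="<contents of plugin-specific file>"
)
```

A plugin could then branch on is_advanced to decide which of the two fields to read.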

    configure_cluster(..)

    Because the provider-specific configuration is opaque to Savanna,
    it might be necessary for the plugin to return some cluster
    topology information from this method for rendering purposes. The
    specifics of this information would depend on what cluster
    information is required by Savanna.


-- Mailing list: https://launchpad.net/~savanna-all
    <https://launchpad.net/%7Esavanna-all>
    Post to     : savanna-all@xxxxxxxxxxxxxxxxxxx
    <mailto:savanna-all@xxxxxxxxxxxxxxxxxxx>
    Unsubscribe : https://launchpad.net/~savanna-all
    <https://launchpad.net/%7Esavanna-all>
    More help   : https://help.launchpad.net/ListHelp





