savanna-all team mailing list archive

Re: Configuration approach recommendations

To: Jon Maron <jmaron@xxxxxxxxxxxxxxx>
From: Dmitry Mescheryakov <dmescheryakov@xxxxxxxxxxxx>
Date: Fri, 24 May 2013 13:32:22 +0400
Cc: "savanna-all@xxxxxxxxxxxxxxxxxxx" <savanna-all@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <2FA1119A-841C-4C85-B92F-987352CC3777@hortonworks.com>
Jon, after a discussion we decided to postpone the idea of adding processes
to the config classification. We still think that it might be a valuable
feature, but we don't have time to implement it in the Vanilla Hadoop
plugin. Taking that into account, it seems we are pretty well aligned. Here is
the summary of proposed changes:

Redefine get_node_processes() in the following way:
    Returns all supported services and node processes for a given Hadoop
    version. Each node process belongs to a single service and that
    relationship is reflected in the returned dict object. See example for
    details.
    Returns: dictionary having entries (service -> list of processes)
    Example return value: {"mapreduce": ["tasktracker", "jobtracker"],
                           "hdfs": ["datanode", "namenode"]}
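
As a rough illustration of the redefined call, here is a minimal sketch. Only the method name and the shape of the return value come from the proposal; the class name VanillaPluginSketch, the hadoop_version parameter, and the hard-coded table are assumptions made for the example and are not the actual plugin code.

```python
class VanillaPluginSketch:
    """Hypothetical plugin stub (assumption, not real Savanna code)."""

    # Assumed static table: Hadoop version -> {service: [process, ...]}.
    _PROCESSES = {
        "1.1.1": {
            "mapreduce": ["tasktracker", "jobtracker"],
            "hdfs": ["datanode", "namenode"],
        },
    }

    def get_node_processes(self, hadoop_version):
        """Return {service: [process, ...]} for the given Hadoop version."""
        try:
            services = self._PROCESSES[hadoop_version]
        except KeyError:
            raise ValueError("unsupported Hadoop version: %s" % hadoop_version)
        # Hand out copies so callers cannot mutate the plugin's own table.
        return {service: list(procs) for service, procs in services.items()}


plugin = VanillaPluginSketch()
node_processes = plugin.get_node_processes("1.1.1")
```

Because each process appears under exactly one service, the controller can derive the process-to-service relationship from this dict alone.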

Redefine 'config' object in the following way:
config:
    name
    description
    type
    default_value
    is_optional
    applicable_target
    scope

Where
    applicable_target = <service name> | “general”
    scope = “node” | “cluster”
    <service name> = “service:mapreduce” | “service:hdfs” | …
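
To make the proposed shape concrete, here is a minimal sketch of the 'config' object with the two new fields validated. The field names come from the list above; the use of a dataclass, the validation rules, and the sample values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any

VALID_SCOPES = ("node", "cluster")
SERVICE_PREFIX = "service:"


@dataclass(frozen=True)
class Config:
    """Sketch of the proposed 'config' object (illustrative only)."""
    name: str
    description: str
    type: str
    default_value: Any
    is_optional: bool
    applicable_target: str  # "service:<name>" or "general"
    scope: str              # "node" or "cluster"

    def __post_init__(self):
        if self.scope not in VALID_SCOPES:
            raise ValueError("scope must be one of %r" % (VALID_SCOPES,))
        target = self.applicable_target
        is_service = (target.startswith(SERVICE_PREFIX)
                      and len(target) > len(SERVICE_PREFIX))
        if target != "general" and not is_service:
            raise ValueError("bad applicable_target: %r" % target)


# Sample values are made up for the example.
replication = Config(
    name="dfs.replication",
    description="Default block replication",
    type="integer",
    default_value=3,
    is_optional=True,
    applicable_target="service:hdfs",
    scope="cluster",
)
```

With this grammar a "process:..." target is rejected, which matches the decision to postpone per-process classification.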

Do you agree?

Thanks,

Dmitry




2013/5/23 Jon Maron <jmaron@xxxxxxxxxxxxxxx>

> We like this suggestion - it provides service mappings for properties and
> allows the controller to make the appropriate groupings.  But we are still
> concerned about the focus on component based editing and the resultant
> templates.
>
> First and foremost - we discussed this approach with our internal Hadoop
> developers.  They indicate that there is no current resource (documentation
> etc) that really details the mapping of configuration properties to their
> associated components/processes.  This sort of mapping could ultimately be
> provided, but it would be a fairly significant effort and is not
> currently planned.
>
> In addition, let me illustrate a fundamental problem with component based
> editing:
>
> The following is a possible node group template created via component
> based UI panels (in this case, a panel that showed the properties available
> for a name node, followed by a UI that showed the properties available for
> a secondary name node):
>
> {
>        id: "aee4-strf-o14s-fd34",
>        flavor: "4",
>        image: "ah91-aij1-u78x-iunm",
>        name: "HDFS master",
>        description: "a template for big nodes ...",
>        plugin: "apache-hadoop",
>        hadoop_version: "1.1.1",
>        node_processes: ["name node", "secondary name node"],
>        node_configs:
>            {
>                "name node":
>                    {
>                        fs.checkpoint.dir: /hadoop/hdfs/one,
>                        fs.checkpoint.period: 21600,
>                        ...
>                    },
>                "secondary name node":
>                    {
>                        fs.checkpoint.dir: /hadoop/hdfs/two,
>                        fs.checkpoint.period: 28600,
>                        ...
>                    },
>                "OS settings":
>                    {
>                          …
>                    }
>            }
>    }
>
> In this instance, the user was presented with the option of configuring
> the node group via a series of component based UIs and created this node
> group template.  The properties you see above are actually associated with
> the service and will end up in the same configuration file on each given
> node within the node group.  So how do we, as a plugin, decide which
> property value wins?  The somewhat artificial attempt to provide properties
> on a component basis can potentially lead to these sorts of issues in
> multiple instances.
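
The ambiguity described above can be shown with a short sketch. It takes the per-component node_configs from the template and reports every property that two components set to different values, even though both values would land in the same file. The function name find_conflicts and the plain-dict representation are assumptions for illustration; the property names and values are taken from the template.

```python
def find_conflicts(node_configs):
    """Return {property: {component: value}} for every property that
    different components set to different values."""
    seen = {}
    for component, props in node_configs.items():
        for key, value in props.items():
            seen.setdefault(key, {})[component] = value
    return {
        key: by_component
        for key, by_component in seen.items()
        if len(set(by_component.values())) > 1
    }


# Values copied from the node group template; the elided entries are omitted.
node_configs = {
    "name node": {
        "fs.checkpoint.dir": "/hadoop/hdfs/one",
        "fs.checkpoint.period": 21600,
    },
    "secondary name node": {
        "fs.checkpoint.dir": "/hadoop/hdfs/two",
        "fs.checkpoint.period": 28600,
    },
}

conflicts = find_conflicts(node_configs)
```

Both properties are reported as conflicting, and nothing in the template tells the plugin which value to write.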
>
> We think it would be much better to align the templates with the way
> properties are configured in Hadoop - on a cluster or node group basis,
> not on a per component basis.  This does not mean that the UI cannot
> present the data in a way that is more palatable to the user - you can do
> some paging and filtering to make the UI more usable.  But ultimately, the
> entries the user makes should be aligned with the way properties are
> persisted in Hadoop to avoid issues like the one above and to present the
> user with a more "Hadoop-like" interface.
>
> I've posted some simple mock ups that show how you could present the user
> with property editing facilities that are aligned with services while
> allowing for an understanding of the components those properties affect.
>  They are posted here:
>
> https://wiki.openstack.org/wiki/File:Node_Group_Editor.png
> https://wiki.openstack.org/wiki/File:Service_Configuration_Editor.png
>
> Ultimately, I think we can work toward achieving the UI and plugin
> requirements:
>
> - UI:  provide a manageable interface for configuration setting.  As long
> as you have an appropriate amount of metadata concerning the configuration
> properties, you should be able to create a usable interface, grouping the
> properties in a logical and usable fashion.
>
> - Plugin:  the properties are associated with a particular
> service or with a GENERAL grouping, and are scoped to the node group or to
> the cluster.
>
> Note that the scope (group or cluster) is dependent on where it was
> specified by the user.  All properties can be supported at both levels
> (cluster and node), so there is no need to have a "scope" attribute in the
> config object.  Rather, the association of a property to a cluster object
> or node group indicates its scope.
>
> On May 23, 2013, at 6:47 AM, Dmitry Mescheryakov <
> dmescheryakov@xxxxxxxxxxxx> wrote:
>
> Jon,
>
> We discussed that and we agree to implement this approach in the current
> phase. Regarding your modification proposals:
>
> 1) We want to keep applicable_target as a single-value field. How about if
> we instead redefine plugin.get_node_processes() in the following way:
> get_node_processes()
>     Returns all supported services and node processes for a given Hadoop
> version. Each node process belongs to a single service and that
> relationship is reflected in the returned dict object. See example for
> details.
>     Returns: dictionary having entries (service -> list of processes)
>     Example return value: {"mapreduce": ["tasktracker", "jobtracker"],
> "hdfs": ["datanode", "namenode"]}
>
> In that case, if the plugin just specifies
> applicable_target="process:jobtracker", that will be enough for the controller to
> identify the process and service the config belongs to. If the plugin specifies
> applicable_target="service:mapreduce", the controller will understand that
> this is a service-wide parameter and will know which processes are affected.
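
The lookup described in point 1 can be sketched as follows. The function name resolve_target and its return shape are assumptions for illustration; the target strings and the service-to-processes dict follow the message above.

```python
def resolve_target(applicable_target, node_processes):
    """Map an applicable_target string to (service, affected processes).

    node_processes is the {service: [process, ...]} dict returned by
    get_node_processes(). "general" resolves to (None, []).
    """
    if applicable_target == "general":
        return None, []
    kind, sep, name = applicable_target.partition(":")
    if not sep or not name:
        raise ValueError("malformed target: %r" % applicable_target)
    if kind == "service":
        if name not in node_processes:
            raise ValueError("unknown service: %r" % name)
        return name, list(node_processes[name])
    if kind == "process":
        # Each process belongs to a single service, so the first hit wins.
        for service, processes in node_processes.items():
            if name in processes:
                return service, [name]
        raise ValueError("unknown process: %r" % name)
    raise ValueError("unknown target kind: %r" % kind)


node_processes = {
    "mapreduce": ["tasktracker", "jobtracker"],
    "hdfs": ["datanode", "namenode"],
}
by_process = resolve_target("process:jobtracker", node_processes)
by_service = resolve_target("service:mapreduce", node_processes)
```

This is why applicable_target can stay a single-value field: the service of a process, and the processes of a service, are both recoverable from the dict.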
>
> 2) It is the plugin that returns the list of supported configs in the get_configs()
> call. The 'scope' field in the 'config' object indicates to the controller
> where the config needs to be presented to the user: either on the cluster level or
> on the node group level. Right?
>
> Thanks,
>
> Dmitry
>
>
>
>
> 2013/5/22 Jon Maron <jmaron@xxxxxxxxxxxxxxx>
>
>>
>> On May 22, 2013, at 9:52 AM, Dmitry Mescheryakov <
>> dmescheryakov@xxxxxxxxxxxx> wrote:
>>
>> *
>> Hello Jon,
>>
>> We considered using services instead of processes and found that they
>> also have a disadvantage. The problem with the services approach is in the
>> UI. Consider the following example:
>>
>> A user configures a node group to run MapReduce workers, i.e. TaskTracker
>> processes. It does not matter whether he creates a Node Group Template or edits
>> a Node Group during cluster creation, since the UI will be similar in both
>> cases. Since properties are categorized by services, the user is asked to fill
>> in all the parameters for that service (MapReduce), including those for
>> the JobTracker such as “JobTracker heap size”. As a result the user has to dig
>> through many irrelevant options during configuration, which is very bad. I
>> think we should not blindly copy file-based configuration even if users are used
>> to it. We are making a web console and we should use the advantages it
>> provides over editing files. I.e. if we can filter out irrelevant options
>> for the user, then why not do it? That does not change the configuration flow much,
>> but at the same time it is much more convenient for the user.
>> *
>>
>>
>> In the case above, it is true that we would return all service related
>> configuration properties.  However, we do specify the default values, and
>> we may be able to specify components as well for the purposes of enhancing
>> the UI.  The user would only have to modify the values for which the
>> default does not make sense.  The bottom line is that the user is
>> presented with a single set of properties for a service because doing
>> otherwise introduces race conditions and uncertainty with respect to which
>> property value is actually used.  Let me illustrate again with the name
>> node and secondary name node which likely are both deployed on the same
>> instance.  In that case the property choices for each process/component are
>> obviously the same.  If a user decided to vary the values for the same
>> properties based on the two separate component property selection panels,
>> which property value are we expected to actually use?  Remember, Hadoop
>> does not configure components.  These values end up in node or service
>> based configuration files, so the selections from both the name node and
>> secondary name node ultimately end up in the same configuration file.
>>  Which value are we expected to select?  Again, this is not a
>> contrived example.  More generally, component based configuration is simply
>> not a configuration scheme with which Hadoop users are familiar.  We are
>> lucky enough to have multiple developers of Hadoop working in the company,
>> and every single one we've spoken to has questioned the component based
>> configuration of Savanna.
>>
>> *
>> As for Templates, our intention is to provide all users with the ability to
>> create their own templates, both for Cluster and Node Group. In fact we see
>> no reason to prevent a user from having his own templates. We just _think_ that
>> users will prefer administrators or more experienced users to prepare
>> templates for them, because these people should be better at Hadoop cluster
>> tweaking.
>>
>> Right now the main concern we have is timing. We already spent much time
>> designing phase 2 and I believe that our initial design evolved into
>> something much better. But we think that it is about time to freeze specs
>> we have for phase 2 and start implementing them. At the same time we can
>> have a background conversation on how we can improve the design in phase 3.
>> We believe that it will not be hard to change this part specifically.
>>
>> The solution for the problem we see is to unify processes and services
>> categorization. Config object can have the following 2-dimensional
>> “coordinates”:
>>
>>    - applicable_target = <process name> | <service name> | “general”
>>    - scope = “node” | “cluster”
>>
>> where
>> <process name> = “process:tasktracker” | “process:jobtracker” |
>> “process:datanode” | …
>> <service name> = “service:mapreduce” | “service:hdfs” | …
>>
>> Here is a table of example parameters for various combinations of
>> target/scope:
>>
>>          | Cluster                           | Node
>> ---------+-----------------------------------+--------------------------------------
>> Process  | Don’t use this combination,       | JobTracker heap size,
>>          | use Service/Cluster instead       | mapred.tasktracker.map.tasks.maximum
>> Service  | dfs.replication,                  | ?
>>          | mapred.output.compression.type    |
>> General  | user SSL key for cluster machines | OS parameters like ulimits
>>
>>
>>
>> Again, as I said, we propose to do this only after we complete the Pluggable
>> Provisioning Mechanism and make sure it is working. Right now we suggest
>> implementing our old proposal with processes, just to avoid further
>> changes in design in this phase.
>> *
>>
>>
>> This approach is acceptable with some modification:
>>
>> 1)  applicable target should be modified to a list of targets, allowing
>> for the specification of service/component or general, e.g.
>>
>> applicable_targets = ["service:mapreduce", "process:jobtracker"]
>>
>> or
>>
>> applicable_targets = ["general"]
>>
>> 2)  The scope attribute is probably unnecessary since the scope is
>> implied by where the user_input is specified (It's not required for the
>> returned config objects).  When attached to a node_group, the scope is
>> "node".  When attached to the cluster object directly, the scope is
>> "cluster".
>>
>> Also, I don't think this can wait till the next phase and I don't think
>> it affects your development much for the following reasons:
>>
>> - the current form of the config and user_input objects remains
>> unchanged.  We are just changing some of the values
>> - for the time being we can try to return the associated process as part
>> of the applicable targets list so that you can at least provide a hint on
>> the service configuration page (and I don't believe the UIs work has begun
>> anyhow)
>>
>> But from our perspective this will greatly enhance our capability of
>> implementing the plugin since the bottom line is that we need to associate
>> properties with services/nodes.  Attempting the artificial grouping of the
>> properties per component/process complicates our ability to properly
>> structure the configuration of the Hadoop cluster.
>>
>> *
>>
>> Thanks,
>>
>> Dmitry
>> *
>>
>>
>> 2013/5/22 Jon Maron <jmaron@xxxxxxxxxxxxxxx>
>>
>>> We still have some concerns regarding the current configuration
>>> approach.  I'd like to highlight two major issues:
>>>
>>> 1)  Component level configuration - Configuring at the component level
>>> is contrary to the Hadoop configuration approach which is structured around
>>> host level configuration (as opposed to component level).  Trying to
>>> configure at the component level runs contrary to that configuration
>>> philosophy and would likely be questioned by Hadoop users.  In addition,
>>> this approach can be rather error prone.  For example, consider a
>>> deployment in which the name node and secondary name node are hosted on the
>>> same server (a not uncommon approach).  Both components obviously share a
>>> great deal of configuration properties.  If a user is leveraging the UI and
>>> is configuring these items at the component level he/she will:
>>>
>>> - repeat the configuration for each component, a process that may be
>>> rather frustrating
>>> - have no idea which of the settings will actually be leveraged
>>> since there is essentially a race condition here - the properties
>>> potentially end up in the same configuration file on the given node group,
>>> so which of the properties actually wins?
>>>
>>> 2)  There doesn't appear to be a facility for making changes that span
>>> node groups.  The cluster templates are essentially immutable - they are
>>> created by an admin and are not seen as modifiable via the UI by users (at
>>> least as far as we can tell).  The other configuration alternative
>>> currently is to configure the specific node groups.  So, for example, how
>>> do I approach the task of modifying 3 HDFS properties across all 10 node
>>> groups I've defined?  It seems that with the current approach I will have
>>> to repeatedly make the same modifications to each node group in turn?
>>>
>>> We believe both of these issues can be remedied by modifying the
>>> configuration approach to be more service centric rather than component
>>> centric.  While the cluster template still provides for global, immutable
>>> settings, allowing users to configure at the service level will allow for
>>> global changes across node groups.  We still want to address the node group
>>> level configuration (host level overrides), so perhaps the config structure
>>> could be redesigned as follows:
>>>
>>> config
>>> Describes a single config parameter.
>>>     name
>>>     description
>>>     type
>>>         Type could be string, integer, enum, array of [int, string]
>>>     default_value
>>>     is_optional
>>>     service - a service name or "GENERAL"
>>>
>>> To be clear about what the service attribute values mean:
>>>
>>> *service name* - this property value is a service-based property that
>>> can be applied to the service across a cluster or to a specific
>>> node group (i.e. the property can be applied to all instances of the
>>> service across the cluster or to a specific set of hosts in a node group).
>>>  The scope is determined by where the user selected the property value (the
>>> node group interface or the cluster interface) and specified in the
>>> user_input "scope" attribute (see below)
>>> *"GENERAL"* - this property value is not specific to a hadoop service.
>>>  It can be specified at the cluster or node group level as well.
>>>
>>> In the UI, the interfaces can provide the setting of all values.  The UI
>>> can categorize the properties based on service etc to present to the user.
>>>  If the user is in a node group configuration panel, the configuration
>>> settings will be scoped to the node group.  If they are in a cluster
>>> template or the like, the property value should be scoped to the entire
>>> cluster.
>>>
>>> The user_input object remains unchanged.  User input values assigned to
>>> the cluster_configs attribute of the Cluster object are cluster scope
>>> properties (GENERAL or service based).  User input values associated to the
>>> embedded node groups (node_configs attribute within a particular node_group
>>> in node_groups list  of the cluster object) are associated to the specific
>>> node group (GENERAL or service based).
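
The idea that scope follows from where a value is attached can be sketched as below. The attribute names cluster_configs, node_groups and node_configs come from the message; the plain-dict layout, the {service-or-GENERAL: {property: value}} keying, the "name" key on node groups, and the rule that node group values override cluster values are all assumptions for illustration.

```python
import copy


def effective_configs(cluster, node_group_name):
    """Merge cluster-scope values with one node group's overrides.

    Scope is implied by placement: values under cluster_configs apply to
    the whole cluster, values under a node group's node_configs apply to
    that group only (assumed here to win over the cluster value).
    """
    merged = copy.deepcopy(cluster.get("cluster_configs", {}))
    for group in cluster.get("node_groups", []):
        if group["name"] == node_group_name:
            for target, props in group.get("node_configs", {}).items():
                merged.setdefault(target, {}).update(props)
            return merged
    raise KeyError(node_group_name)


# Made-up cluster object for the example.
cluster = {
    "cluster_configs": {"hdfs": {"dfs.replication": 3}},
    "node_groups": [
        {"name": "workers", "node_configs": {"hdfs": {"dfs.replication": 2}}},
        {"name": "master", "node_configs": {}},
    ],
}

workers = effective_configs(cluster, "workers")
master = effective_configs(cluster, "master")
```

Under this layout a change to an HDFS property at the cluster level reaches every node group that does not override it, which addresses the "3 properties across 10 node groups" concern.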
>>>
>>> Again, we feel that this aligns the interface much more closely with the
>>> way users interact with Hadoop.  The attempt to align configuration with
>>> specific service components is somewhat contrived and introduces an
>>> impedance mismatch that users will probably reject.
>>>
>>>
>>> --
>>> Mailing list: https://launchpad.net/~savanna-all
>>> Post to     : savanna-all@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~savanna-all
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>>
>>
>
>
Follow ups

Re: Configuration approach recommendations
From: Jon Maron, 2013-05-24
References

Configuration approach recommendations
From: Jon Maron, 2013-05-21
Re: Configuration approach recommendations
From: Dmitry Mescheryakov, 2013-05-22
Re: Configuration approach recommendations
From: Jon Maron, 2013-05-22
Re: Configuration approach recommendations
From: Dmitry Mescheryakov, 2013-05-23
Re: Configuration approach recommendations
From: Jon Maron, 2013-05-23