savanna-all team mailing list archive

Re: Configuration approach recommendations

To: Sergey Lukjanov <slukjanov@xxxxxxxxxxxx>
From: Dmitry Mescheryakov <dmescheryakov@xxxxxxxxxxxx>
Date: Mon, 27 May 2013 14:05:35 +0400
Cc: "savanna-all@xxxxxxxxxxxxxxxxxxx" <savanna-all@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <57D7F035-5A7C-4B58-981D-AAC5037EA7AF@mirantis.com>
Jon, I've corrected the following docs:
https://wiki.openstack.org/wiki/Savanna/PluggableProvisioning/PluginAPI
https://wiki.openstack.org/wiki/Savanna/Templates


2013/5/25 Sergey Lukjanov <slukjanov@xxxxxxxxxxxx>

> Hi Jon,
>
> Sure, I think we'll update all affected blueprints on Monday - Tuesday.
>
> Sincerely yours,
> Sergey Lukjanov
> Software Engineer
> Mirantis Inc.
> GTalk: me@xxxxxxxxxxx
> Skype: lukjanovsv
>
> On May 24, 2013, at 17:52, Jon Maron <jmaron@xxxxxxxxxxxxxxx> wrote:
>
>
> On May 24, 2013, at 5:32 AM, Dmitry Mescheryakov <
> dmescheryakov@xxxxxxxxxxxx> wrote:
>
> Jon, after a discussion we decided to postpone the idea of adding
> processes to the config classification. We still think that it might be a
> valuable feature, but we don't have time to implement it in the Vanilla
> Hadoop plugin. Taking that into account, it seems we are pretty well aligned.
> Here is the summary of proposed changes:
>
> Redefine get_node_processes() in the following way:
>     Returns all supported services and node processes for a given Hadoop
> version. Each node process belongs to a single service and that
> relationship is reflected in the returned dict object. See example for
> details.
>     Returns: dictionary having entries (service -> list of processes)
>     Example return value: {"mapreduce": ["tasktracker", "jobtracker"],
> "hdfs": ["datanode", "namenode"]}
>
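The redefinition above can be sketched in Python roughly as follows. This is only an illustration: the hadoop_version parameter and the service_of() helper are assumptions made for the example and are not part of the proposed plugin API; only get_node_processes() and its example return value come from the message.

```python
def get_node_processes(hadoop_version):
    """Return a dict mapping each service to its list of node processes.

    The hadoop_version argument is an assumed parameter; this sketch ignores
    it and always returns the example value quoted in the thread.
    """
    return {
        "mapreduce": ["tasktracker", "jobtracker"],
        "hdfs": ["datanode", "namenode"],
    }


def service_of(process, processes_by_service):
    """Hypothetical helper: find the single service that owns a process."""
    owners = [service for service, processes in processes_by_service.items()
              if process in processes]
    if len(owners) != 1:
        raise ValueError("process %r must belong to exactly one service" % process)
    return owners[0]


processes_by_service = get_node_processes("1.1.1")
owner = service_of("jobtracker", processes_by_service)
```

Because each node process belongs to exactly one service, the reverse lookup is unambiguous, which is what lets the controller group processes under services.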
> Redefine 'config' object in the following way:
> config:
>     name
>     description
>     type
>     default_value
>     is_optional
>     *applicable_target*
>     *scope*
>
> Where
>     applicable_target = <service name> | “general”
>     scope = “node” | “cluster”
>     <service name> = “service:mapreduce” | “service:hdfs” | …
>
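One possible way to model the 'config' object described above is a small Python dataclass. This is a hedged sketch, not the Savanna implementation: the class name Config, the is_valid_target() helper and the validation rules are assumptions; only the field names and the allowed values for applicable_target and scope come from the message.

```python
from dataclasses import dataclass
from typing import Any

VALID_SCOPES = ("node", "cluster")


def is_valid_target(target):
    """Accept "general" or "service:<name>" with a non-empty name."""
    if target == "general":
        return True
    prefix, sep, name = target.partition(":")
    return prefix == "service" and sep == ":" and name != ""


@dataclass
class Config:
    """Hypothetical container for the fields listed in the proposal."""
    name: str
    description: str
    type: str
    default_value: Any
    is_optional: bool
    applicable_target: str  # "general" or "service:<service name>"
    scope: str              # "node" or "cluster"

    def __post_init__(self):
        # Reject values outside the grammar given in the message.
        if not is_valid_target(self.applicable_target):
            raise ValueError("bad applicable_target: %r" % self.applicable_target)
        if self.scope not in VALID_SCOPES:
            raise ValueError("bad scope: %r" % self.scope)


replication = Config(
    name="dfs.replication",
    description="Default block replication",
    type="integer",
    default_value=3,
    is_optional=True,
    applicable_target="service:hdfs",
    scope="cluster",
)
```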
> Do you agree?
>
>
> Yes - this aligns with our thinking :)
>
> Just so we have a complete picture:  these modifications should affect the
> structure of the node group and cluster templates - the configuration
> should also reflect this service based approach.  Would you mind updating
> the templates wiki so we can see the nature of the modifications and make
> sure we're aligned there as well?
>
> Thanks!
>
>
>
> Thanks,
>
> Dmitry
>
>
>
>
> 2013/5/23 Jon Maron <jmaron@xxxxxxxxxxxxxxx>
>
>> We like this suggestion - it provides service mappings for properties and
>> allows the controller to make the appropriate groupings.  But we are still
>> concerned about the focus on component based editing and the resultant
>> templates.
>>
>> First and foremost - we discussed this approach with our internal Hadoop
>> developers.  They indicate that there is no current resource (documentation
>> etc) that really details the mapping of configuration properties to their
>> associated components/processes.  This sort of mapping could ultimately be
>> provided, but it would be a fairly significant effort and is not
>> currently planned.
>>
>> In addition, let me illustrate a fundamental problem with component based
>> editing:
>>
>> The following is a possible node group template created via component
>> based UI panels (in this case, a panel that showed the properties available
>> for a name node, followed by a UI that showed the properties available for
>> a secondary name node):
>>
>> {
>>     "id": "aee4-strf-o14s-fd34",
>>     "flavor": "4",
>>     "image": "ah91-aij1-u78x-iunm",
>>     "name": "HDFS master",
>>     "description": "a template for big nodes ...",
>>     "plugin": "apache-hadoop",
>>     "hadoop_version": "1.1.1",
>>     "node_processes": ["name node", "secondary name node"],
>>     "node_configs": {
>>         "name node": {
>>             "fs.checkpoint.dir": "/hadoop/hdfs/one",
>>             "fs.checkpoint.period": 21600,
>>             ...
>>         },
>>         "secondary name node": {
>>             "fs.checkpoint.dir": "/hadoop/hdfs/two",
>>             "fs.checkpoint.period": 28600,
>>             ...
>>         },
>>         "OS settings": {
>>             …
>>         }
>>     }
>> }
>>
>> In this instance, the user was presented with the option of configuring
>> the node group via a series of component based UIs and created this node
>> group template.  The properties you see above are actually associated with
>> the service and will end up in the same configuration file on each given
>> node within the node group.  So how do we, as a plugin, decide which
>> property value wins?  The somewhat artificial attempt to provide properties
>> on a component basis can lead to this sort of issue in many
>> cases.
>>
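The conflict described above can be made concrete with a short Python sketch. The find_conflicts() function is hypothetical and written only for this illustration; the property names and values are taken from the template in the message, with the elided entries left out.

```python
def find_conflicts(node_configs):
    """Return {property: {component: value}} for every property that two or
    more components set to different values."""
    seen = {}
    for component, properties in node_configs.items():
        for name, value in properties.items():
            seen.setdefault(name, {})[component] = value
    return {name: by_component for name, by_component in seen.items()
            if len(set(by_component.values())) > 1}


node_configs = {
    "name node": {
        "fs.checkpoint.dir": "/hadoop/hdfs/one",
        "fs.checkpoint.period": 21600,
    },
    "secondary name node": {
        "fs.checkpoint.dir": "/hadoop/hdfs/two",
        "fs.checkpoint.period": 28600,
    },
}

conflicts = find_conflicts(node_configs)
for name in sorted(conflicts):
    print(name, conflicts[name])
```

Both properties are reported as conflicting. Since both components write to the same configuration file on the node, a plugin has no principled way to choose between the two values.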
>> We think it would be much better to align the templates with the way
>> properties are configured in Hadoop - on a cluster or node group basis,
>> not on a per component basis.  This does not mean that the UI can not
>> present the data in a way that is more palatable to the user - you can do
>> some paging and filtering to make the UI more usable.  But ultimately, the
>> entries the user makes should be aligned with the way properties are
>> persisted in Hadoop to avoid issues like the one above and to present the
>> user with a more "Hadoop-like" interface.
>>
>> I've posted some simple mock ups that show how you could present the user
>> with property editing facilities that are aligned with services while
>> allowing for an understanding of the components those properties affect.
>>  They are posted here:
>>
>> https://wiki.openstack.org/wiki/File:Node_Group_Editor.png
>> https://wiki.openstack.org/wiki/File:Service_Configuration_Editor.png
>>
>> Ultimately, I think we can work toward achieving the UI and plugin
>> requirements:
>>
>> - UI:  provide a manageable interface for configuration setting.  As
>> long as you have an appropriate amount of metadata concerning the
>> configuration properties, you should be able to create a usable interface,
>> grouping the properties in a logical and usable fashion.
>>
>> - Plugin:  the properties are associated with a particular
>> service or with a GENERAL grouping, and are scoped to the node group or to
>> the cluster.
>>
>> Note that the scope (group or cluster) is dependent on where it was
>> specified by the user.  All properties can be supported at both levels
>> (cluster and node), so there is no need to have a "scope" attribute in the
>> config object.  Rather, the association of a property to a cluster object
>> or node group indicates its scope.
>>
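The idea that scope follows from where a value is attached, rather than from a "scope" attribute, might look like this in Python. It is a sketch under assumptions: the attribute names cluster_configs, node_groups and node_configs appear elsewhere in this thread, but the nested dict shape (target -> property -> value), the "workers" group name and the scoped_inputs() function are invented for the example.

```python
def scoped_inputs(cluster):
    """Yield (scope, node_group, target, name, value) for every user input.

    The scope is never stored; it is derived from the place in the cluster
    object where the value was found.
    """
    for target, properties in cluster.get("cluster_configs", {}).items():
        for name, value in properties.items():
            yield ("cluster", None, target, name, value)
    for group in cluster.get("node_groups", []):
        for target, properties in group.get("node_configs", {}).items():
            for name, value in properties.items():
                yield ("node", group["name"], target, name, value)


cluster = {
    "cluster_configs": {"service:hdfs": {"dfs.replication": 3}},
    "node_groups": [
        {
            "name": "workers",
            "node_configs": {
                "service:mapreduce": {"mapred.tasktracker.map.tasks.maximum": 4},
            },
        },
    ],
}

inputs = list(scoped_inputs(cluster))
```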
>> On May 23, 2013, at 6:47 AM, Dmitry Mescheryakov <
>> dmescheryakov@xxxxxxxxxxxx> wrote:
>>
>> Jon,
>>
>> We discussed that and we agree to implement this approach in current
>> phase. Regarding your modification proposals:
>>
>> 1) We want to keep applicable_target as a single-value field. How about
>> if we instead redefine plugin.get_node_processes() in the following way:
>> get_node_processes()
>>     Returns all supported services and node processes for a given Hadoop
>> version. Each node process belongs to a single service and that
>> relationship is reflected in the returned dict object. See example for
>> details.
>>     Returns: dictionary having entries (service -> list of processes)
>>     Example return value: {"mapreduce": ["tasktracker", "jobtracker"],
>> "hdfs": ["datanode", "namenode"]}
>>
>> In that case, if the plugin just specifies
>> applicable_target="process:jobtracker", that will be enough for the controller to
>> identify the process and service the config belongs to. If the plugin specifies
>> applicable_target="service:mapreduce", the controller will understand that
>> this is a service-wide parameter and will know which processes are affected.
>>
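How a controller could resolve a single-valued applicable_target against the service/process mapping can be sketched as below. The resolve_target() function and its return shape are assumptions made for illustration; the target strings and the mapping are the ones quoted in the message.

```python
def resolve_target(applicable_target, processes_by_service):
    """Return (service, affected_processes) for a target string.

    "general" maps to (None, []); "service:<name>" affects every process of
    that service; "process:<name>" affects only that process, and its
    service is looked up in the mapping.
    """
    if applicable_target == "general":
        return (None, [])
    kind, _, name = applicable_target.partition(":")
    if kind == "service" and name in processes_by_service:
        return (name, list(processes_by_service[name]))
    if kind == "process":
        for service, processes in processes_by_service.items():
            if name in processes:
                return (service, [name])
    raise ValueError("unknown applicable_target: %r" % applicable_target)


processes_by_service = {
    "mapreduce": ["tasktracker", "jobtracker"],
    "hdfs": ["datanode", "namenode"],
}

by_process = resolve_target("process:jobtracker", processes_by_service)
by_service = resolve_target("service:mapreduce", processes_by_service)
```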
>> 2) It is the plugin that returns the list of supported configs in the get_configs()
>> call. The 'scope' field in the 'config' object indicates to the controller
>> where the config needs to be presented to the user: either at the cluster level or
>> at the node group level. Right?
>>
>> Thanks,
>>
>> Dmitry
>>
>>
>>
>>
>> 2013/5/22 Jon Maron <jmaron@xxxxxxxxxxxxxxx>
>>
>>>
>>> On May 22, 2013, at 9:52 AM, Dmitry Mescheryakov <
>>> dmescheryakov@xxxxxxxxxxxx> wrote:
>>>
>>> Hello Jon,
>>>
>>> We considered using services instead of processes and found that they
>>> also have a disadvantage. The problem with the services approach is in the
>>> UI. Consider the following example:
>>>
>>> A user configures a node group to run MapReduce workers, i.e. TaskTracker
>>> processes. It does not matter whether he creates a Node Group Template or edits
>>> Node Group during cluster creation, since the UI will be similar in both
>>> cases. Since properties are categorized by services, user is asked to fill
>>> in all the parameters for that service (MapReduce) including those for
>>> JobTracker like “JobTracker heap size”, etc. As a result user has to dig
>>> through many irrelevant options during configuration, which is very bad. I
>>> think we should not blindly copy files configuration even if users got used
>>> to it. We are making a web console and we should use the advantages it
>>> provides over editing files. I.e. if we can filter out irrelevant options
>>> for user, then why not do it? That does not change configuration flow much,
>>> but at the same time it is much more convenient for a user.
>>>
>>>
>>> In the case above, it is true that we would return all service related
>>> configuration properties.  However, we do specify the default values, and
>>> we may be able to specify components as well for the purposes of enhancing
>>> the UI.  The user would only have to modify the values for which the
>>> default does not make sense.  The bottom line is that the user is
>>> presented with a single set of properties for a service because doing
>>> otherwise introduces race conditions and uncertainty with respect to which
>>> property value is actually used.  Let me illustrate again with the name
>>> node and secondary name node which likely are both deployed on the same
>>> instance.  In that case the property choices for each process/component are
>>> obviously the same.  If a user decided to vary the values for the same
>>> properties based on the two separate component property selection panels,
>>> which property value are we expected to actually use?  Remember, Hadoop
>>> does not configure components.  These values end up in node or service
>>> based configuration files, so the selections from both the name node and
>>> secondary name node ultimately end up in the same configuration file.
>>> Which value are we expected to select?  Again, this is not a
>>> contrived example.  More generally, component based configuration is simply
>>> not a configuration scheme with which Hadoop users are familiar.  We are
>>> lucky enough to have multiple developers of Hadoop working in the company,
>>> and every single one we've spoken to has questioned the component based
>>> configuration of Savanna.
>>>
>>> As for Templates, our intention is to provide all users with ability to
>>> create their own templates, both for Cluster and Node Group. In fact we see
>>> no reason to prevent a user from having his own templates. We just _think_ that
>>> users will prefer administrators or more experienced users to prepare
>>> templates for them, because these people should be better at Hadoop cluster
>>> tweaking.
>>>
>>> Right now the main concern we have is timing. We already spent much time
>>> designing phase 2 and I believe that our initial design evolved into
>>> something much better. But we think that it is about time to freeze specs
>>> we have for phase 2 and start implementing them. At the same time we can
>>> have a background conversation on how we can improve the design in phase 3.
>>> We believe that it will not be hard to change this part specifically.
>>>
>>> The solution for the problem we see is to unify processes and services
>>> categorization. Config object can have the following 2-dimensional
>>> “coordinates”:
>>>
>>>    - applicable_target = <process name> | <service name> | “general”
>>>    - scope = “node” | “cluster”
>>>
>>> where
>>> <process name> = “process:tasktracker” | “process:jobtracker” |
>>> “process:datanode” | …
>>> <service name> = “service:mapreduce” | “service:hdfs” | …
>>>
>>> Here is a table of example parameters for various combinations of
>>> target/scope:
>>>
>>> Target  | Cluster scope                     | Node scope
>>> --------+-----------------------------------+--------------------------------------
>>> Process | Don’t use this combination,       | JobTracker heap size,
>>>         | use Service/Cluster instead       | mapred.tasktracker.map.tasks.maximum
>>> Service | dfs.replication,                  | ?
>>>         | mapred.output.compression.type    |
>>> General | user SSL key for cluster machines | OS parameters like ulimits
>>>
>>>
>>> Again, as I said, we propose to do this only after we complete the Pluggable
>>> Provisioning Mechanism and make sure it is working. Right now we suggest
>>> implementing our old proposal with processes, just to avoid further
>>> changes in design in this phase.
>>>
>>>
>>> This approach is acceptable with some modification:
>>>
>>> 1)  applicable target should be modified to a list of targets, allowing
>>> for the specification of service/component or general, e.g
>>>
>>> applicable_targets = ["service:mapreduce", "process:job tracker"]
>>>
>>> or
>>>
>>> applicable_targets = ["general"]
>>>
>>> 2)  The scope attribute is probably unnecessary since the scope is
>>> implied by where the user_input is specified (It's not required for the
>>> returned config objects).  When attached to a node_group, the scope is
>>> "node".  When attached to the cluster object directly, the scope is
>>> "cluster".
>>>
>>> Also, I don't think this can wait till the next phase and I don't think
>>> it affects your development much for the following reasons:
>>>
>>> - the current form of the config and user_input objects remains
>>> unchanged.  We are just changing some of the values
>>> - for the time being we can try to return the associated process as part
>>> of the applicable targets list so that you can at least provide a hint on
>>> the service configuration page (and I don't believe the UIs work has begun
>>> anyhow)
>>>
>>> But from our perspective this will greatly enhance our capability of
>>> implementing the plugin since the bottom line is that we need to associate
>>> properties with services/nodes.  Attempting the artificial grouping of the
>>> properties per component/process complicates our ability to properly
>>> structure the configuration of the Hadoop cluster.
>>>
>>>
>>> Thanks,
>>>
>>> Dmitry
>>>
>>>
>>> 2013/5/22 Jon Maron <jmaron@xxxxxxxxxxxxxxx>
>>>
>>>> We still have some concerns regarding the current configuration
>>>> approach.  I'd like to highlight two major issues:
>>>>
>>>> 1)  Component level configuration - Configuring at the component level
>>>> is contrary to the Hadoop configuration approach which is structured around
>>>> host level configuration (as opposed to component level).  Trying to
>>>> configure at the component level runs contrary to that configuration
>>>> philosophy and would likely be questioned by Hadoop users.  In addition,
>>>> this approach can be rather error prone.  For example, consider a
>>>> deployment in which the name node and secondary name node are hosted on the
>>>> same server (a not uncommon approach).  Both components obviously share a
>>>> great deal of configuration properties.  If a user is leveraging the UI and
>>>> is configuring these items at the component level he/she will:
>>>>
>>>> - repeat the configuration for each component, a process that may be
>>>> rather frustrating
>>>> - have no idea which of the settings will actually be leveraged
>>>> since there is essentially a race condition here - the properties
>>>> potentially end up in the same configuration file on the given node group,
>>>> so which of the properties actually wins?
>>>>
>>>> 2)  There doesn't appear to be a facility for making changes that span
>>>> node groups.  The cluster templates are essentially immutable - they are
>>>> created by an admin and are not seen as modifiable via the UI by users (at
>>>> least as far as we can tell).  The other configuration alternative
>>>> currently is to configure the specific node groups.  So, for example, how
>>>> do I approach the task of modifying 3 HDFS properties across all 10 node
>>>> groups I've defined?  It seems that with the current approach I will have
>>>> to repeatedly make the same modifications to each node group in turn?
>>>>
>>>> We believe both of these issues can be remedied by modifying the
>>>> configuration approach to be more service centric rather than component
>>>> centric.  While the cluster template still provides for global, immutable
>>>> settings, allowing users to configure at the service level will allow for
>>>> global changes across node groups.  We still want to address the node group
>>>> level configuration (host level overrides), so perhaps the config structure
>>>> could be redesigned as follows:
>>>>
>>>> *config*
>>>> Describes a single config parameter.
>>>>     name
>>>>     description
>>>>     type
>>>>         Type could be string, integer, enum, array of [int, string]
>>>>     default_value
>>>>     is_optional
>>>>     *service - a service name or "GENERAL"*
>>>>
>>>> To be clear about what the service attribute values mean:
>>>>
>>>> *service name* - this property value is a service-based property that
>>>> can be applied to the service across a cluster or to a specific
>>>> node group (i.e. the property can be applied to all instances of the
>>>> service across the cluster or to a specific set of hosts in a node group).
>>>> The scope is determined by where the user selected the property value (the
>>>> node group interface or the cluster interface) and specified in the
>>>> user_input "scope" attribute (see below).
>>>> *"GENERAL"* - this property value is not specific to a Hadoop service.
>>>> It can be specified at the cluster or node group level as well.
>>>>
>>>> In the UI, the interfaces can provide the setting of all values.  The
>>>> UI can categorize the properties based on service etc to present to the
>>>> user.  If the user is in a node group configuration panel, the
>>>> configuration settings will be scoped to the node group.  If they are in a
>>>> cluster template or the like, the property value should be scoped to the
>>>> entire cluster.
>>>>
>>>> The user_input object remains unchanged.  User input values assigned to
>>>> the cluster_configs attribute of the Cluster object are cluster scope
>>>> properties (GENERAL or service based).  User input values associated to the
>>>> embedded node groups (node_configs attribute within a particular node_group
>>>> in node_groups list  of the cluster object) are associated to the specific
>>>> node group (GENERAL or service based).
>>>>
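The relationship between cluster-scope and node-group-scope user input could be sketched as follows. This is an assumption-laden illustration: the thread names the cluster_configs, node_groups and node_configs attributes and mentions host level overrides, but the dict layout, the rule that node group values override cluster values, the group names and the effective_configs() function are all hypothetical.

```python
import copy


def effective_configs(cluster, group_name):
    """Return the configuration seen by one node group: cluster-scope values
    overlaid with that group's own values (assumed override rule)."""
    merged = copy.deepcopy(cluster.get("cluster_configs", {}))
    for group in cluster.get("node_groups", []):
        if group["name"] == group_name:
            for target, properties in group.get("node_configs", {}).items():
                merged.setdefault(target, {}).update(properties)
            return merged
    raise KeyError(group_name)


cluster = {
    "cluster_configs": {"service:hdfs": {"dfs.replication": 3}},
    "node_groups": [
        {"name": "workers", "node_configs": {}},
        {"name": "archive",
         "node_configs": {"service:hdfs": {"dfs.replication": 2}}},
    ],
}

workers = effective_configs(cluster, "workers")
archive = effective_configs(cluster, "archive")
```

Under this reading, a change made once in cluster_configs reaches every node group that does not override it, which addresses the concern about repeating the same edit in each of many node groups.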
>>>> Again, we feel that this aligns the interface much more closely with
>>>> the way users interact with Hadoop.  The attempt to align configuration
>>>> with specific service components is somewhat contrived and introduces an
>>>> impedance mismatch that users will probably reject.
>>>>
>>>>
>>>> --
>>>> Mailing list: https://launchpad.net/~savanna-all
>>>> Post to     : savanna-all@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~savanna-all
>>>> More help   : https://help.launchpad.net/ListHelp
>>>>
>>>>
>>>
>>>
>>
>>
>
Follow ups

Re: Configuration approach recommendations
From: Jon Maron, 2013-05-28
References

Configuration approach recommendations
From: Jon Maron, 2013-05-21
Re: Configuration approach recommendations
From: Dmitry Mescheryakov, 2013-05-22
Re: Configuration approach recommendations
From: Jon Maron, 2013-05-22
Re: Configuration approach recommendations
From: Dmitry Mescheryakov, 2013-05-23
Re: Configuration approach recommendations
From: Jon Maron, 2013-05-23
Re: Configuration approach recommendations
From: Dmitry Mescheryakov, 2013-05-24
Re: Configuration approach recommendations
From: Jon Maron, 2013-05-24
Re: Configuration approach recommendations
From: Sergey Lukjanov, 2013-05-24