savanna-all team mailing list archive

Re: Questions concerning config object

To: "'Jon Maron'" <jmaron@xxxxxxxxxxxxxxx>, <savanna-all@xxxxxxxxxxxxxxxxxxx>
From: "Alexander Ignatov" <aignatov@xxxxxxxxxxxx>
Date: Thu, 16 May 2013 18:26:18 +0400
In-reply-to: <B0B847E1-6A8E-4B72-BACF-825BE810C6BE@hortonworks.com>
Thread-index: AQKeGdFK1XxkD+Kf/N3cWrFRwC8OepdoTj2w

Jon,

 

Grouping configs by node process is more convenient from the user's
perspective.


A user is more likely to want to override a configuration property for a
particular process (DataNode, TaskTracker, etc.) than for some .xml, .sh, or
other type of file.

By the way, the Ambari and Cloudera UIs show the same picture: configs are
grouped by process.

Each plugin can store its own information about configuration parameters and
their destination files.
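
To illustrate the idea, here is a minimal sketch, assuming a plugin keeps a
private lookup table from (process, property) to destination file. The table,
the function names, and the example entries are hypothetical and are not
taken from Savanna or from any existing plugin.

```python
# Hypothetical sketch: the mapping and function names are illustrative
# assumptions, not part of Savanna or any plugin.

# Plugin-private lookup: (node process, property name) -> destination file.
PLUGIN_FILE_MAP = {
    ("DataNode", "dfs.data.dir"): "hdfs-site.xml",
    ("TaskTracker", "mapred.local.dir"): "mapred-site.xml",
    ("TaskTracker", "HADOOP_TASKTRACKER_OPTS"): "hadoop-env.sh",
}


def destination_file(process, name):
    """Return the file the plugin would write this property to."""
    return PLUGIN_FILE_MAP[(process, name)]


def group_overrides_by_file(overrides):
    """Turn user overrides keyed by process into per-file property dicts."""
    files = {}
    for process, props in overrides.items():
        for name, value in props.items():
            target_file = destination_file(process, name)
            files.setdefault(target_file, {})[name] = value
    return files
```

With this split, the user only ever sees process names; the file layout stays
an internal detail of the plugin.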

 

It seems 'applicable_node_processes' is not the right name, because this
attribute of the Config object applies to more than Hadoop node processes.


It can be a general property of the whole cluster, as you mentioned in your
second concern. It can also describe a node OS property such as ulimits, ssh
configs, etc.

I think we should rename 'applicable_node_processes' to 'target' or just
'destination', where the destination could be a node process, a node OS
specific property, or a general cluster property.
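
A rough sketch of what the renamed attribute might look like, assuming the
name 'target' is chosen. The Target class, the three kind strings, and the
example values are assumptions made for illustration only; nothing here is
existing Savanna code.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Assumed target kinds, one per case named in this message.
TARGET_KINDS = ("node_process", "node_os", "cluster")


@dataclass(frozen=True)
class Target:
    kind: str                   # one of TARGET_KINDS
    name: Optional[str] = None  # e.g. "DataNode"; None for cluster-wide

    def __post_init__(self):
        if self.kind not in TARGET_KINDS:
            raise ValueError("unknown target kind: %r" % self.kind)


@dataclass(frozen=True)
class Config:
    name: str
    description: str
    type: str
    default_value: Any
    is_optional: bool
    target: Target  # hypothetical replacement for 'applicable_node_processes'


# Illustrative entries only; the default values are made up.
EXAMPLES = [
    Config("dfs.data.dir", "DataNode storage directories", "string",
           "/hadoop/dfs/data", True, Target("node_process", "DataNode")),
    Config("nofile", "Open-file limit on the node", "int",
           32768, True, Target("node_os", "ulimits")),
    Config("hadoop.tmp.dir", "Base for temporary directories", "string",
           "/tmp/hadoop", True, Target("cluster")),
]
```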

 

Regards,

Alexander Ignatov

 

From: Savanna-all
[mailto:savanna-all-bounces+aignatov=mirantis.com@xxxxxxxxxxxxxxxxxxx] On
Behalf Of Jon Maron
Sent: Thursday, May 16, 2013 1:24 AM
To: savanna-all@xxxxxxxxxxxxxxxxxxx
Subject: [Savanna-all] Questions concerning config object

 

The current Savanna documentation proposes a Config object with the
following attributes:

 

            name

            description

            type

            default_value

            is_optional

            applicable_node_processes 

 

  Node processes seem to correlate to Hadoop components.  

 

 I see a number of problems with this proposal:

 

 1)  The proposal makes an assumption that properties are grouped by node
processes/components.  Although some properties are clearly dedicated to
certain processes, it appears that for the most part properties are
associated with, and grouped by, specific site configuration files.  As a
matter of fact, there is some effort in Ambari around decoupling services
and configuration.

 2)  There are some general properties that aren't necessarily dedicated to
a specific process but are rather more general in nature.  In those cases it
seems that an indicator specifying which configuration file the property
resides in is more appropriate.

 

 It just seems like the categorization by node process (or component) is
somewhat artificial in the Hadoop environment.  Rather, it seems like it'd
be more natural to have the following structure:

 

            name

            description

            type

            default_value

            is_optional

            destination_file
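
  As a minimal sketch of this structure (the namedtuple, the helper
function, and the example entries are illustrative assumptions, not Savanna
code), grouping by configuration file could look like:

```python
from collections import defaultdict, namedtuple

# Field names follow the list above; everything else is hypothetical.
Config = namedtuple("Config", [
    "name", "description", "type",
    "default_value", "is_optional", "destination_file",
])


def group_by_file(configs):
    """Map each destination file to the property names stored in it."""
    grouped = defaultdict(list)
    for config in configs:
        grouped[config.destination_file].append(config.name)
    return dict(grouped)


# Illustrative entries only; the default values are made up.
EXAMPLES = [
    Config("fs.default.name", "Default file system URI", "string",
           "hdfs://namenode:8020", False, "core-site.xml"),
    Config("dfs.replication", "Block replication factor", "int",
           3, True, "hdfs-site.xml"),
    Config("dfs.data.dir", "DataNode storage directories", "string",
           "/hadoop/dfs/data", True, "hdfs-site.xml"),
]
```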

            

  I welcome your thoughts on the matter.  Thanks!

 

-- Jon

Follow ups

Re: Questions concerning config object
From: Jon Maron, 2013-05-16

References

Questions concerning config object
From: Jon Maron, 2013-05-15