
savanna-all team mailing list archive

Re: Questions concerning config object

 

On May 16, 2013, at 10:26 AM, "Alexander Ignatov" <aignatov@xxxxxxxxxxxx> wrote:

> Jon,
>  
> Grouping configs by node process is more convenient from the user's perspective.
> A user is more likely to want to override a conf property for a certain process (DataNode, TaskTracker, etc.) than for some .xml, .sh, or other type of file.
> By the way, the Ambari and Cloudera UIs present the same picture: they group by process.

That is true.  However, Ambari has discovered that grouping config in this manner made things pretty complicated and resulted in a huge number of web service invocations as the number of hosts in a cluster increased.  They are therefore actively moving away from such a design.

> Each plugin can store its own information about configuration parameters and their file destinations.
>  
> It seems 'applicable_node_processes' is not the correct name, because this attribute of the Config object does not apply only to Hadoop node processes.
> It can be a general property of the whole cluster, as you mentioned in your second concern. This attribute can also describe a node OS property such as ulimits, ssh configs, etc.
> I think we should rename 'applicable_node_processes' to 'target' or just 'destination', where the destination could be a node process, a node OS-specific property, or a general cluster property.

I gather that you are proposing that this setting is essentially opaque to the controller, since:

1)  The config object is an object returned by the plugin and displayed by the UI via some interface
2)  The user should not modify this setting
3)  This setting is subsequently interpreted by the plugin during cluster configuration and creation.

If that is correct, can each plugin use this "target" attribute in whatever manner it deems appropriate to its implementation?  In other words, couldn't we simply specify the target XML file if we deemed that an appropriate use of the attribute?
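
To make the question concrete, here is a minimal sketch of how a plugin might treat "target" as an opaque value and interpret it as a destination file.  Everything in it is an assumption for illustration only: the Config class shape simply mirrors the attribute list from the documentation with "target" swapped in, group_by_target is a hypothetical helper, and the sample properties and file names are examples, not anything Savanna defines.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Config:
    """Hypothetical Config object; 'target' replaces applicable_node_processes."""
    name: str
    description: str
    type: str
    default_value: Any
    is_optional: bool
    target: str  # opaque to the controller; only the plugin interprets it


def group_by_target(configs: List[Config],
                    overrides: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical plugin-side helper: resolve each value (user override,
    else default) and bucket the results by the plugin-defined target."""
    grouped: Dict[str, Dict[str, Any]] = {}
    for cfg in configs:
        value = overrides.get(cfg.name, cfg.default_value)
        grouped.setdefault(cfg.target, {})[cfg.name] = value
    return grouped


# Illustrative data: this plugin chooses to use file names as targets.
configs = [
    Config("dfs.replication", "Default block replication",
           "int", 3, True, "hdfs-site.xml"),
    Config("mapred.reduce.tasks", "Default number of reduce tasks per job",
           "int", 1, True, "mapred-site.xml"),
    Config("fs.default.name", "Name of the default file system",
           "string", "file:///", False, "core-site.xml"),
]

# The user overrides one value by name and never touches 'target'.
files = group_by_target(configs, {"dfs.replication": 2})
```

In this sketch the controller and UI only pass the Config objects and the user's overrides through; a different plugin could put process names or OS-level categories in the same field without the controller needing to know.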

>  
> Regards,
> Alexander Ignatov
>  
> From: Savanna-all [mailto:savanna-all-bounces+aignatov=mirantis.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jon Maron
> Sent: Thursday, May 16, 2013 1:24 AM
> To: savanna-all@xxxxxxxxxxxxxxxxxxx
> Subject: [Savanna-all] Questions concerning config object
>  
> The current Savanna documentation proposes a Config object with the following attributes:
>  
>             name
>             description
>             type
>             default_value
>             is_optional
>             applicable_node_processes 
>  
>   Node processes seem to correlate to Hadoop components.  
>  
>  I see a number of problems with this proposal:
>  
>  1)  The proposal makes an assumption that properties are grouped by node processes/components.  Although some properties are clearly dedicated to certain processes, it appears that for the most part properties are associated with, and grouped by, specific site configuration files.  As a matter of fact, there is some effort in Ambari around decoupling services and configuration.
>  2)  There are some general properties that aren't necessarily dedicated to a specific process but are rather more general in nature.  In those cases it seems that an indicator specifying which configuration file the property resides in is more appropriate.
>  
>  It just seems like the categorization by node process (or component) is somewhat artificial in the Hadoop environment.  Rather, it seems like it would be more natural to have the following structure:
>  
>             name
>             description
>             type
>             default_value
>             is_optional
>             destination_file
>            
>   I welcome your thoughts on the matter.  Thanks!
>  
> -- Jon

