
savanna-all team mailing list archive

Configuration approach recommendations


We still have some concerns regarding the current configuration approach.  I'd like to highlight two major issues:

1)  Component level configuration - Configuring at the component level runs contrary to the Hadoop configuration approach, which is structured around host level configuration, and would likely be questioned by Hadoop users.  In addition, this approach can be rather error prone.  For example, consider a deployment in which the name node and secondary name node are hosted on the same server (a not uncommon approach).  Both components obviously share a great many configuration properties.  If a user is leveraging the UI and is configuring these items at the component level, he/she will:

	- have to repeat the configuration for each component, a process that may be rather frustrating
	- have no idea which of the settings will actually take effect, since there is essentially a race condition here: the properties potentially end up in the same configuration file on the given node group, so which of the properties actually wins?
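
To make the second point concrete, here is a minimal sketch of how the conflict could play out.  It assumes (this is an assumption, not a description of the actual Savanna code) that component-level settings are merged into a single per-host property set, with later components overwriting earlier ones.  The function name and the merge order are hypothetical; dfs.replication is just a convenient shared HDFS property.

```python
# Hypothetical illustration: component-level settings merged into one
# per-host property set. The merge strategy is an assumption.
def merge_component_configs(component_configs, order):
    """Merge per-component properties; later components overwrite earlier ones."""
    merged = {}
    for component in order:
        merged.update(component_configs[component])
    return merged


# The user entered the same property twice, once per component, with
# different values.
component_configs = {
    "namenode": {"dfs.replication": "3"},
    "secondarynamenode": {"dfs.replication": "2"},
}

# The value that "wins" depends only on the order of the merge.
namenode_first = merge_component_configs(
    component_configs, ["namenode", "secondarynamenode"])
secondary_first = merge_component_configs(
    component_configs, ["secondarynamenode", "namenode"])

print(namenode_first["dfs.replication"])   # 2
print(secondary_first["dfs.replication"])  # 3
```

Nothing in the UI tells the user which of the two orderings applies, which is the heart of the problem.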

2)  There doesn't appear to be a facility for making changes that span node groups.  The cluster templates are essentially immutable - they are created by an admin and are not seen as modifiable via the UI by users (at least as far as we can tell).  The only other configuration alternative currently is to configure the specific node groups.  So, for example, how do I approach the task of modifying 3 HDFS properties across all 10 node groups I've defined?  It seems that with the current approach I will have to make the same modifications to each node group in turn.
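
As a rough sketch of what that repetition amounts to, the snippet below applies 3 HDFS properties to 10 node groups one group at a time and counts the individual edits.  The node group layout (a list of dicts with a node_configs entry keyed by service) and the property values are assumptions chosen for illustration only.

```python
# Three HDFS properties the user wants to change (values are illustrative).
hdfs_changes = {
    "dfs.replication": "2",
    "dfs.blocksize": "134217728",
    "dfs.namenode.handler.count": "20",
}

# Ten node groups, each with its own (initially empty) configuration.
# This layout is an assumed stand-in for the real node group objects.
node_groups = [{"name": f"group-{i}", "node_configs": {}} for i in range(1, 11)]

# Without a cluster-wide facility, every property is set on every group.
edits = 0
for group in node_groups:
    hdfs = group["node_configs"].setdefault("HDFS", {})
    for prop, value in hdfs_changes.items():
        hdfs[prop] = value
        edits += 1

print(edits)  # 30
```

That is 30 manual edits in the UI for what is conceptually 3 changes.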

We believe both of these issues can be remedied by making the configuration approach service centric rather than component centric.  While the cluster template still provides for global, immutable settings, allowing users to configure at the service level will allow for global changes across node groups.  We still want to address the node group level configuration (host level overrides), so perhaps the config structure could be redesigned as follows:

	Describes a single config parameter.
		Type could be string, integer, enum, array of [int, string]
	service - a service name or "GENERAL"

To be clear about what the service attribute values mean:

service name - this property value is a service-based property that can be applied to all instances of the service across the cluster or to a specific set of hosts in a node group.  The scope is determined by where the user selected the property value (the node group interface or the cluster interface) and is specified in the user_input "scope" attribute (see below).
"GENERAL" - this property value is not specific to a Hadoop service.  It can be specified at the cluster or node group level as well.

In the UI, the interfaces can provide for setting all values.  The UI can categorize the properties based on service etc. to present to the user.  If the user is in a node group configuration panel, the configuration settings will be scoped to the node group.  If they are in a cluster template or the like, the property value should be scoped to the entire cluster.
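
A minimal sketch of what a service-tagged config parameter and the UI categorization might look like follows.  Only the type and service attributes come from the description above; the class name, the name field, the service labels "HDFS" and "MAPREDUCE", and the GENERAL example property are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

GENERAL = "GENERAL"


@dataclass(frozen=True)
class ConfigParameter:
    """Describes a single config parameter (field names are assumptions)."""
    name: str
    config_type: str        # e.g. "string", "integer", "enum"
    service: str = GENERAL  # a service name or "GENERAL"


def group_by_service(parameters):
    """Bucket parameter names by service so a UI could present them per service."""
    grouped = defaultdict(list)
    for parameter in parameters:
        grouped[parameter.service].append(parameter.name)
    return dict(grouped)


parameters = [
    ConfigParameter("dfs.replication", "integer", "HDFS"),
    ConfigParameter("dfs.datanode.handler.count", "integer", "HDFS"),
    ConfigParameter("mapred.reduce.tasks", "integer", "MAPREDUCE"),
    ConfigParameter("example.general.setting", "string"),  # hypothetical
]

for service, names in group_by_service(parameters).items():
    print(service, names)
```

The same grouped view could back both the cluster panel and the node group panel; only the scope recorded with the user's input would differ.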

The user_input object remains unchanged.  User input values assigned to the cluster_configs attribute of the Cluster object are cluster scope properties (GENERAL or service based).  User input values associated with the embedded node groups (the node_configs attribute within a particular node_group in the node_groups list of the cluster object) are scoped to that specific node group (GENERAL or service based).
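
To illustrate how the two scopes could combine, here is a hedged sketch that resolves the effective settings for one node group: cluster scope values are applied first and node group values override them.  The cluster_configs, node_groups and node_configs names come from the description above; the nested {service: {property: value}} layout, the function name and the override order are assumptions.

```python
def effective_config(cluster, node_group_name):
    """Return cluster-scope settings with the node group's overrides applied."""
    # Copy the cluster-scope values so the cluster object is not mutated.
    resolved = {
        service: dict(props)
        for service, props in cluster["cluster_configs"].items()
    }
    for group in cluster["node_groups"]:
        if group["name"] == node_group_name:
            # Node-group values win over cluster-scope values.
            for service, props in group["node_configs"].items():
                resolved.setdefault(service, {}).update(props)
            return resolved
    raise KeyError(f"unknown node group: {node_group_name}")


cluster = {
    "cluster_configs": {"HDFS": {"dfs.replication": "3"}},
    "node_groups": [
        {"name": "master", "node_configs": {}},
        {"name": "workers",
         "node_configs": {"HDFS": {"dfs.replication": "2"}}},
    ],
}

print(effective_config(cluster, "master"))   # {'HDFS': {'dfs.replication': '3'}}
print(effective_config(cluster, "workers"))  # {'HDFS': {'dfs.replication': '2'}}
```

With this shape, a cluster-wide HDFS change is a single edit to cluster_configs, and node groups only carry the values that genuinely differ.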

Again, we feel that this aligns the interface much more closely with the way users interact with Hadoop.  The attempt to align configuration with specific service components is somewhat contrived and introduces an impedance mismatch that users will probably reject.
