savanna-all team mailing list archive

Some thoughts on configuration


Hi all,

As a quick introduction, I'm one of the engineers working on Cloudera
Manager, and I've been looking at how it would work in conjunction with
Savanna. I've been reading through the docs and the recent conversations on
configuration and scoping, and I'd like to talk a bit about how Cloudera
Manager handles configuration and how this maps to the Savanna API as I
currently understand it.

CM Terminology:

* Cluster: A logical cluster, which contains a set of hosts and the
services deployed on those hosts
* Service Type: A type of service (duh): "HDFS", "MAPREDUCE", etc
* Service Instance: A concrete instance of a service, running on a cluster:
"My first HDFS", etc
* Role Type: A particular type of role within a service: "NAMENODE",
"DATANODE", etc
* Role Instance: A concrete instance of a role type, assigned to a specific
host/node: "NAMENODE-1 on host1.domain.com", etc. Only one instance of a
given role type can be assigned to a single host
* Process: The actual running process associated with a role instance. So
while a process only exists while it's running, the role instance always
exists
* Role Group: A set of role instances, within a single service, of a single
role type, that share common configurations.
* Host: A host - not very profound.
* Host Template: A set of role groups. When a template is applied to a
host, for each role group, a role instance is created and assigned to that
host
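
To make the relationships above concrete, here's a minimal sketch of the data model in Python. The class and field names are my own illustration, not CM's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class RoleGroup:
    """A set of role instances of one role type, within one service,
    that share common configurations."""
    role_type: str                              # e.g. "NAMENODE"
    configs: dict = field(default_factory=dict)

@dataclass
class RoleInstance:
    """A role type assigned to a specific host.
    Only one instance of a given role type may exist per host."""
    role_type: str
    host: str                                   # e.g. "host1.domain.com"
    group: RoleGroup
    overrides: dict = field(default_factory=dict)  # instance-level values

@dataclass
class ServiceInstance:
    """A concrete service instance ("My first HDFS") on a cluster."""
    service_type: str                           # e.g. "HDFS"
    configs: dict = field(default_factory=dict)
    roles: list = field(default_factory=list)

@dataclass
class Cluster:
    """A logical cluster: a set of hosts and the services deployed on them."""
    hosts: list = field(default_factory=list)
    services: list = field(default_factory=list)
```
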

When it comes to configuration, CM defines configs at the Service Type and
Role Type level. So a given service or role type has a fixed set of
possible configurations associated with it.

For example:

HDFS: Replication Factor (default 3)
Namenode: Listening Port (default 8020)
Datanode: Handler Count (default 3)

and so on.

When it comes time to set a configuration value, that value is associated
with an instance. Service type config values are always associated with a
service instance, but role type config values can be associated with either
a role group or a role instance - with the role instance value overriding
the role group value. In this way it's possible to define values that apply
to a whole group, but also specialize certain instances where necessary.
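
The override order can be sketched as a small resolution function (a hypothetical illustration of the precedence, not CM's implementation):

```python
def resolve_role_config(key, group_configs, instance_overrides, defaults):
    """Role instance value wins over role group value,
    which wins over the role type default."""
    if key in instance_overrides:
        return instance_overrides[key]
    if key in group_configs:
        return group_configs[key]
    return defaults[key]

defaults = {"handler_count": 3}        # role type default
group = {"handler_count": 10}          # applies to every datanode in the group
special = {"handler_count": 32}        # one specialized instance

print(resolve_role_config("handler_count", group, {}, defaults))       # 10
print(resolve_role_config("handler_count", group, special, defaults))  # 32
```
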

At the time a process is started, CM generates the process's relevant
config files on the fly, based on internal logic that maps configs to
actual entries in config files (and/or, where appropriate, environment
variables or command line arguments). In most cases these mappings are
1:1, but sometimes the handling is more complicated. For example, when
generating fs.default.name, we combine our knowledge of the hostname of
the Namenode with the user-specified listening port config. As these config
files are generated per-process, they will look different for different
role types - so a datanode's hdfs-site.xml looks different from a
namenode's hdfs-site.xml - and only contains the config entries that are
relevant to it. Configuration files are regenerated every time a role
instance is (re)started to ensure consistency.
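
The fs.default.name case can be sketched like this (the function and default names are mine, for illustration only):

```python
DEFAULTS = {"namenode_port": 8020}  # the Namenode listening port default

def fs_default_name(namenode_host, configs):
    """Derive fs.default.name by combining the known Namenode hostname
    with the user-specified (or default) listening port config."""
    port = configs.get("namenode_port", DEFAULTS["namenode_port"])
    return f"hdfs://{namenode_host}:{port}"

print(fs_default_name("host1.domain.com", {}))                       # default port
print(fs_default_name("host1.domain.com", {"namenode_port": 9000}))  # overridden
```
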

Some configuration is indirect - coming from dependency services, rather
than from the service itself. This is modelled through the use of
dependency configurations. So a mapreduce service instance has a config
that indicates which hdfs service instance it depends on, and in this way
it is able to discover the fs.default.name and other relevant configuration.
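
As a rough sketch of that lookup (hypothetical structure, not CM's data model): the mapreduce service holds only a pointer to its hdfs dependency, and derived values are read through it.

```python
# Each service exposes values it derives; dependents name the service
# they depend on rather than duplicating those values.
services = {
    "hdfs1": {"derived": {"fs.default.name": "hdfs://host1.domain.com:8020"}},
    "mr1":   {"configs": {"hdfs_service": "hdfs1"}},  # dependency config
}

def discover(service_name, key):
    """Resolve an indirect config by following the dependency config."""
    dep = services[service_name]["configs"]["hdfs_service"]
    return services[dep]["derived"][key]

print(discover("mr1", "fs.default.name"))  # hdfs://host1.domain.com:8020
```
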

Finally, services can have a Gateway role type, which indicates a host that
does not run any processes for a service, but which can act as a client for
the service. When a host is assigned a gateway role instance, CM will
ensure that the system-wide config directories in /etc are correctly
populated to connect to the service. (Remember that process config files
are private and per-process; they have no effect on the system-wide
configuration that client applications see.)

Hosts also have a set of configurations associated with them. Values can be
defined at the 'all hosts' level or the individual host level.

Now, with all that said, we can consider how these concepts map to the
configuration model described in the Provisioning Plugin API.

The config object:

Unsurprisingly, most of the fields here are directly mappable, with the
difficult ones being the applicable_target and the scope.

Currently defined applicable targets are 'general' and 'service:instance'.
Currently defined scopes are 'node' and 'cluster'.

Let's now consider how these combinations map to the CM concepts and then
identify which CM concepts cannot be expressed.

1) applicable_target=general, scope=cluster

This maps to an 'all hosts' configuration

2) applicable_target=general, scope=node

This maps to a 'single host' configuration

3) applicable_target=service:instance, scope=cluster

This maps to a service type configuration

4) applicable_target=service:instance, scope=node

This doesn't exactly map to anything, unfortunately. Service type
configurations cannot be specialized to individual nodes, and the configs
that apply to an individual node are scoped at the role type level.

So we are left in a somewhat difficult situation where the majority of our
configurations don't actually map cleanly to anything. Now, we can
obviously do poor man's namespacing and prefix the config names that we
expose through the plugin (so listening port would become
"namenode:listening_port", for example). If we did this, we'd be able to
map (4) to a role instance level config (as there's only one role instance
per type per host, we can work out which instance a namespaced config
applies to).
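
A sketch of that namespacing scheme (the helper is hypothetical, shown only to illustrate the convention):

```python
def split_namespaced(name):
    """Split "namenode:listening_port" into (role_type, config_name).
    Un-prefixed names are treated as service-level configs."""
    if ":" in name:
        role_type, config = name.split(":", 1)
        return role_type.upper(), config
    return None, name

# Since only one role instance of a given type exists per host,
# (role_type, host) uniquely identifies the instance a value applies to.
print(split_namespaced("namenode:listening_port"))  # ('NAMENODE', 'listening_port')
print(split_namespaced("replication_factor"))       # (None, 'replication_factor')
```
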

Then what does it mean for (3) with a role type config? The only thing it
can mean is a config assigned to an implicit role group that covers all the
hosts of the given role type in the cluster.

This should be functional in the short term, but obviously we'd like to
more explicitly support these concepts to avoid relying on more fragile
mechanisms like namespacing.

In an ideal world, we'd like to be able to have an
applicable_target=service:instance:roletype and a scope=node_group, which
would allow us to directly express role instance configs on a node and
configs against role groups:


5) applicable_target=service:instance:roletype, scope=node

This is a role instance config

6) applicable_target=service:instance:roletype, scope=node_group

This is a role group config, roughly speaking. As CM role groups need not
be aligned across services, it implies a stricter model than CM allows, but
I think it's workable. Exposing role groups as a full capability would
probably be challenging, and I think anyone wanting to use this would want
to use the convert() api and provide a CM deployment descriptor.

7) applicable_target=service:instance:roletype, scope=cluster

This would not be supported, as it doesn't map to any remaining concept.
Also, (4) would not be used for anything either.

Does this seem like a reasonable thing to do - perhaps not for phase 2
given the current timing, but beyond that?
