
savanna-all team mailing list archive

Re: Hadoop Provider Integration



Hey Erik, team,

Thank you for the deeper dive into the plugin mechanism architecture. That is
really a step forward.

Let us discuss the general approach first. When we started the architecture
design, we also first considered an IoC concept similar to what you
suggest. After some thought, we concluded that it is better to split one
big “create cluster” call into a number of smaller ones, retaining more
control on the core side. The benefits of this approach (and the
disadvantages of the opposite one) are:

* With a single “create cluster” call, each plugin is free to behave
differently, which introduces a lot of undefined behavior. This is
especially true in error cases, because each plugin could handle errors
differently. Handling such cases might also require deep knowledge of
OpenStack from the plugin author. Our goal is to simplify plugin creation
by handling all the OpenStack-related logic inside the Savanna code, not in
the plugin code.

With separate calls for each step:

* The plugin is split into several consecutive parts/methods. Transitions
between these methods can be persisted, which would increase the
reliability of the workflow.

In the longer term, this might also allow a plugin to run in a distributed
environment.

* Plugin methods that are separate and defined by the API allow code reuse.
They also serve as documentation of a plugin’s responsibilities.

* It’ll allow timeout handling for each step on the core side.
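
To make the split concrete, here is a minimal Python sketch of what we have
in mind. It is only an illustration: the method names (validate,
configure_cluster, start_cluster), the run_steps driver, and the dict used
as a stand-in for a persistent store are all hypothetical and not taken
from the blueprint.

```python
import abc
import concurrent.futures


class ProvisioningPlugin(abc.ABC):
    """Hypothetical plugin contract: one small method per provisioning step."""

    @abc.abstractmethod
    def validate(self, cluster): ...

    @abc.abstractmethod
    def configure_cluster(self, cluster): ...

    @abc.abstractmethod
    def start_cluster(self, cluster): ...


# Order in which the core invokes the plugin (assumed for illustration).
STEPS = ("validate", "configure_cluster", "start_cluster")


class StepFailed(Exception):
    """Raised by the core when a plugin step errors out or times out."""


def run_steps(plugin, cluster, state, timeout_s=60.0):
    """Core-side driver: run each step, record progress, enforce a timeout.

    `state` is a plain dict standing in for a persistent store, so a
    restarted workflow can skip steps that already completed.
    """
    completed = state.setdefault("completed", [])
    for name in STEPS:
        if name in completed:
            continue  # resume after the last persisted transition
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(getattr(plugin, name), cluster)
        try:
            future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            # The worker thread is not killed; the core just stops waiting.
            raise StepFailed(f"{name} timed out after {timeout_s}s")
        except Exception as exc:
            # Uniform error handling lives in the core, not in each plugin.
            raise StepFailed(f"{name} failed: {exc}") from exc
        finally:
            pool.shutdown(wait=False)
        completed.append(name)  # persist the transition
    return completed
```

In this sketch the plugin never sees timeouts or progress tracking. Both
stay on the core side, so every plugin fails in the same, defined way.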

Methods for interacting directly with a VM/Server instance, like install(),
open_file(), and interactive execute(), don’t seem relevant to the Plugin
API. It’s better to keep such methods out of the API to keep it as simple
as possible. They can be just a set of helper methods under utils.
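
As a rough sketch of that idea, the helpers could be plain functions that a
plugin calls when it needs them, rather than methods on the Plugin API. The
function names, the Transport callable, and the shell commands below are
assumptions made up for illustration; none of them come from the Savanna
code or the proposal.

```python
import shlex
from typing import Callable

# Hypothetical transport: runs `command` on `host` and returns its output.
# In practice this might wrap SSH; here it is just any callable.
Transport = Callable[[str, str], str]


def execute(transport: Transport, host: str, command: str) -> str:
    """Run a single command on a host through the given transport."""
    return transport(host, command)


def install(transport: Transport, host: str, package: str) -> str:
    """Install a package; the package manager command is an assumption."""
    return execute(transport, host, f"yum install -y {shlex.quote(package)}")


def read_file(transport: Transport, host: str, path: str) -> str:
    """Return the contents of a remote file."""
    return execute(transport, host, f"cat {shlex.quote(path)}")
```

A plugin would import these from utils inside its own step methods, so the
Plugin API itself does not grow.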

We suggest moving provider-specific details to separate blueprints.
Examples are “HDP specific details” in create cluster flow and in add hosts


Dmitry, on behalf of Mirantis team


2013/4/30 Erik Bergenholtz <ebergenholtz@xxxxxxxxxxxxxxx>

> Dmitry - uploaded a new doc that looks a bit better:
> https://wiki.openstack.org/w/images/5/5e/Savanna_Deployment_Engine_Architecture.pdf
> Erik
> On Apr 30, 2013, at 4:20 PM, Dmitry Mescheryakov <
> dmescheryakov@xxxxxxxxxxxx> wrote:
> Hey Erik,
> Some tables in "Savanna Deployment Engine Architecture" doc are flattened
> out, see attached screenshot for example. Could you reassemble the PDF with
> correct tables' sizes?
> Thanks,
> Dmitry
> 2013/4/30 Erik Bergenholtz <ebergenholtz@xxxxxxxxxxxxxxx>
>> Team - yet another update. See
>> https://wiki.openstack.org/w/images/9/97/Savanna_hadoop_host_group_mapping.pdf for
>> a document illustrating how hadoop nodes get mapped to provisioned VMs.
>> Erik
>> On Apr 30, 2013, at 11:07 AM, Erik Bergenholtz <
>> ebergenholtz@xxxxxxxxxxxxxxx> wrote:
>> Team, John has updated the below referenced documents with better
>> descriptions of the flows (same links apply).
>> Cheers,
>> Erik
>> On Apr 30, 2013, at 6:35 AM, Erik Bergenholtz <
>> ebergenholtz@xxxxxxxxxxxxxxx> wrote:
>> Team,
>> Below are a few documents intended to describe only a slightly modified
>> approach to hadoop provider integration into Savanna (see existing
>> blueprint in draft:
>> https://blueprints.launchpad.net/savanna/+spec/pluggable-cluster-provisioning). These
>> documents should not be considered a blueprint, but a vehicle for
>> continuing discussion on the topic.
>> To summarize there are two changes to note:
>> 1. There is some IoC introduced into the design allowing hadoop plugin
>> providers flexibility to integrate into Savanna while reducing the burden
>> on the Savanna controller itself. This differs from the current approach of
>> the controller invoking APIs on the provider at specific lifecycle points.
>> 2. Attempts have been made to keep normalization of management APIs
>> across providers at a minimum at the controller level. Our view is that
>> existing Hadoop users are already familiar with their Hadoop distribution
>> management API (CDH, Ambari, MapR etc.) and as such would want to leverage
>> existing investments vs. learning a new management API specific to Savanna.
>> This eases adoption and lowers the barrier to entry for Hadoop
>> on OpenStack.
>> Documents to review:
>> Savanna Deployment Engine Architecture<https://wiki.openstack.org/w/images/5/5e/Savanna_Deployment_Engine_Architecture.pdf> -
>> Puts forth the architecture of the deployment engine
>> Savanna_add_hosts_flow<https://wiki.openstack.org/w/images/d/dc/Savanna_add_hosts_flow.pdf> -
>> Describes sequence of steps executed in order to add a host to an existing
>> Hadoop Cluster
>> Savanna_create_cluster_flow<https://wiki.openstack.org/w/images/0/0f/Savanna_create_cluster_flow.pdf> -
>> Describes the sequence of steps executed in order to create a new cluster
>> Savanna_invoke_provider_rest_api_flow<https://wiki.openstack.org/w/images/a/a6/Savanna_invoke_provider_rest_api_flow.pdf> -
>> Describes the sequence of steps executed in a REST request by the provider
>> plugin on the controller.
>> Please review at your earliest convenience and let us know your feedback.
>> Best,
>> Jon Maron, John Speidel and Erik Bergenholtz
> <Screen Shot 2013-04-30 at 1.15.11 PM.png>
