fuel-dev team mailing list archive

Re: Stop deployment concerns

 

David,

   1. Do we have a consensus here? Can you drive it with the team to
   completion?
   2. On a separate note, I think we should schedule a call to go through
   all the features and discuss requirements, to ensure that you and the dev
   team are on the same page.

Thanks,
Roman

On Friday, November 22, 2013, Bogdan Dobrelya wrote:

>  On 11/22/2013 11:16 AM, Mike Scherbakov wrote:
>
>  + fuel-dev
>
>  We had a meeting on the topic yesterday. Research shows the following.
>
>  It would be great to be able to stop deployment at any moment, and then
> continue by redeploying only the failed nodes. However:
>
>    - If the network configuration is changed, the environment will not be
>    operational after deployment
>       - the user may change network CIDRs, and without additional
>       functionality in Fuel it is currently not possible to reconfigure
>       OpenStack (replace the network information in the OpenStack database)
>    - If some settings are changed, the same applies
>       - for example passwords: controllers are already deployed, and
>       computes would get the new information
>
> So, we have come to the decision that resetting the whole environment
> is essential at the moment. We expect the following workflow:
>
>    1. If it becomes obvious that the deployment will not finish
>    successfully, the user goes to the Actions tab and clicks the "Reset
>    Environment" button.
>    2. The environment changes its status to "Resetting".
>    3. All settings on the environment become unlocked, and the user is
>    allowed to change anything. Settings stay the same as when the user
>    clicked "Deploy".
>    4. Resetting the environment implies rebooting all the nodes to the
>    bootstrap state. When that is done, the status of the environment changes
>    to "New", and the "Deploy" button becomes active.
>    5. When the user is done with re-configuration, they click "Deploy". Fuel
>    should use the same IP addresses / hostnames as at the time of the
>    initial deployment, if no changes are made to networking.
>
> Thanks,
>
>
> On Wed, Nov 20, 2013 at 7:14 PM, Mike Scherbakov <mscherbakov@xxxxxxxxxxxx> wrote:
>
> + Evgeniy, Nick
>
>
> On Wed, Nov 20, 2013 at 7:01 PM, David Easter <deaster@xxxxxxxxxxxx> wrote:
>
>  I thought about this some more last night and what about this for a
> resolution?
>
>
>    1. When stop deployment is done, any successfully deployed nodes are
>    flagged as successful and would not be reinstalled when Deploy Changes is
>    pressed again.
>    2. If a customer wants to reset the environment and start over, they
>    can use the "Reset environment" option to wipe the partially installed
>    environment and start over.
>    3. Otherwise, when Deploy Changes is clicked again, Fuel will try to
>    deploy only the unfinished or error-state nodes again… just as it does
>    today.
>
> That way, the customer has the option of starting over or just continuing
> from where they left off.  If the controller or network installation failed,
> Fuel would consider that an unrecoverable error condition and just reinstall
> those nodes.
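
The "continue from where they left off" option amounts to filtering the node list by status before the next deploy. A minimal sketch, assuming each node is a dict with a "status" field and that "ready" marks a successfully deployed node (both are illustrative assumptions, not the actual Nailgun data model):

```python
def nodes_to_deploy(nodes):
    """Return the nodes that still need deployment after a stop.

    Nodes already flagged as successful ("ready", an assumed status name)
    are skipped; unfinished and error-state nodes are returned.
    """
    return [node for node in nodes if node["status"] != "ready"]


# Hypothetical cluster state after a stopped deployment.
cluster = [
    {"name": "controller-1", "status": "ready"},
    {"name": "compute-1", "status": "error"},
    {"name": "compute-2", "status": "provisioning"},
]
pending = [node["name"] for node in nodes_to_deploy(cluster)]
```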
>
> 1) I believe we should reflect the related Environment Operations changes in
> the Nailgun API as well:
> https://docs.google.com/a/mirantis.com/document/d/1KQPEG62wBF-U-s8mUzAcP3_rLKOBgyEyUY9e9yKE49U/edit#heading=h.qcspsp3wasyy
> 2) The ability to reset a given node, as well as the whole deployment, is
> vital for cluster self-healing. For example, if we have STONITH'ed a failed
> controller node and just want to redeploy it from scratch, we might use
> the Nailgun API to reset the node to ensure it is re-provisioned and
> re-deployed at the next boot...
>
>
> 1. This feature is required only by developers (or maybe services), because
> in this case the user will not be able to reconfigure the cluster via the
> REST API (i.e. UI, CLI) after deployment was stopped. If we allow
> configuration, then deployment is likely to fail in 90% of cases.
> 2. We cannot interrupt network configuration while it is in progress; to
> resolve this issue we need some kind of recovery mechanism for networks.
> 3. Also, we cannot interrupt apt-get (and maybe yum), because it creates a
> lock file and puppet will fail when we try to run it for a
>
>
>
> --
> Best regards,
> Bogdan Dobrelya,
> Researcher TechLead, Mirantis, Inc.
> +38 (066) 051 07 53
> Skype bogdando_at_yahoo.com
> 38, Lenina ave.
> Kharkov, Ukraine
> www.mirantis.com
> www.mirantis.ru
> bdobrelia@xxxxxxxxxxxx
>
>
