fuel-dev team mailing list archive

Re: Stop openstack patching feature

To: Evgeniy L <eli@xxxxxxxxxxxx>, Bogdan Dobrelya <bdobrelia@xxxxxxxxxxxx>
From: David Easter <deaster@xxxxxxxxxxxx>
Date: Tue, 09 Sep 2014 10:07:54 -0700
Cc: Igor Kalnitsky <ikalnitsky@xxxxxxxxxxxx>, fuel-dev <fuel-dev@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CABfuu9qW1Tkb0EKp5oBWFtXKoz+q7V0b3+CJ_TFd=JQhY=RwwQ@mail.gmail.com>
Thread-topic: [Fuel-dev] Stop openstack patching feature
User-agent: Microsoft-MacOutlook/14.4.4.140807

Deleting a node and forcing it to be re-deployed from scratch during an
update would not be a positive user experience.  I’d rather explain to a
customer that while new deployments can be stopped, updating can’t (but it
can be rolled back).  This would be preferable to explaining that stopping
the update would result in having to redeploy the entire cloud again.

+1 for downgrading https://bugs.launchpad.net/fuel/+bug/1364907 from
critical status.  However, I think it is reasonable to defer it rather than
close it as "won't fix", to give us a chance to consider whether we can solve
it in a future release.

Thanks,

- David J. Easter
  Director of Product Management,   Mirantis, Inc.

From:  Evgeniy L <eli@xxxxxxxxxxxx>
Date:  Tuesday, September 9, 2014 at 9:14 AM
To:  Bogdan Dobrelya <bdobrelia@xxxxxxxxxxxx>
Cc:  Igor Kalnitsky <ikalnitsky@xxxxxxxxxxxx>, fuel-dev
<fuel-dev@xxxxxxxxxxxxxxxxxxx>
Subject:  Re: [Fuel-dev] Stop openstack patching feature

I don't think that we should implement this feature even
in the API, because in that case the user will be able to interrupt
patching via the CLI. I think it's really risky to provide such a feature,
especially since we know that the user can lose his production
nodes.

My suggestion is to remove the ticket [1] from 5.1 or set
it as won't fix.

[1] https://bugs.launchpad.net/fuel/+bug/1364907

On Tue, Sep 9, 2014 at 1:44 PM,  <bdobrelia@xxxxxxxxxxxx> wrote:
> Perhaps some ideas could be taken from [0] ([1]).
> Note that the linked full spec doc [1] is more of a brainstorming
> discussion than a spec ready for implementation.
> I strongly believe we should follow the suggested concepts (finite-state-machine
> states in the Nailgun DB, running in HA mode, of course) in order to track
> offline / interrupted statuses for nodes (including the master node) as well.
> 
> [0] https://blueprints.launchpad.net/fuel/+spec/nailgun-unified-object-model
> [1] https://etherpad.openstack.org/p/nailgun-unified-object-model
> 
> Regards,
> Bogdan Dobrelya.
> 
> Sent from Windows Mail
> 
> From: Mike Scherbakov <mailto:mscherbakov@xxxxxxxxxxxx>
> Sent: Tuesday, September 9, 2014 10:15 AM
> To: Vladimir Kuklin <mailto:vkuklin@xxxxxxxxxxxx>
> Cc: Igor Kalnitsky <mailto:ikalnitsky@xxxxxxxxxxxx> , fuel-dev
> <mailto:fuel-dev@xxxxxxxxxxxxxxxxxxx>
> 
> Folks,
> I was the one who initially requested this. I thought it was going to be pretty
> similar to Stop Deployment. It has become obvious that it is not.
> 
> I'm fine if we have it in the API. Though I think what is much more important
> here is the ability for the user to choose a few hosts for patching first, in
> order to check how patching would work on a very small part of the cluster.
> Ideally we would even move workloads to other nodes before patching. We should
> certainly disable scheduling of workloads on these experimental hosts.
> Then the user can run patching against these nodes and see how it goes. If all
> goes fine, patching can be applied to the rest of the environment. I do not
> think, though, that we should do all nodes, let's say 100, at once. This sounds
> dangerous to me. I think we would need to come up with some less dangerous
> scenario.
> 
> Also, let's think about and work on possible failures. What if the Fuel Master
> node goes offline during patching? What is going to be affected? How can we
> complete patching when the Fuel Master comes back online?
> 
> Or if a compute node under patching breaks for some reason (e.g. disk or
> memory issues), how would it affect the patching process? How can we safely
> continue patching the other nodes?
> 
> Thanks,
> 
> On Tue, Sep 9, 2014 at 12:08 PM, Vladimir Kuklin <vkuklin@xxxxxxxxxxxx> wrote:
>> 
>> Sorry again. Look 2 messages below, please.
>> On Sep 9, 2014 at 12:06, "Vladimir Kuklin" <vkuklin@xxxxxxxxxxxx> wrote:
>>> 
>>> Sorry, hit reply instead of reply all.
>>> 
>>> On Sep 9, 2014 at 12:05, "Vladimir Kuklin" <vkuklin@xxxxxxxxxxxx> wrote:
>>>> 
>>>> +1
>>>> 
>>>> Also, I think we should add stop patching at least to the API in order to
>>>> allow advanced users and the service team to do what they want.
>>>> 
>>>> On Sep 9, 2014 at 12:02, "Igor Kalnitsky" <ikalnitsky@xxxxxxxxxxxx> wrote:
>>>> 
>>>>> What should we do with nodes if patching is interrupted? I think
>>>>> we need to mark them for re-deployment, since the nodes' state may be
>>>>> broken.
>>>>> 
>>>>> Any opinion?
>>>>> 
>>>>> - Igor
>>>>> 
>>>>> On Mon, Sep 8, 2014 at 3:28 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>>>>>> > Hi,
>>>>>> >
>>>>>> > We were working on the implementation of an experimental feature
>>>>>> > where the user could interrupt the OpenStack patching procedure [1].
>>>>>> >
>>>>>> > It's not as easy to implement as we thought it would be.
>>>>>> > The current stop deployment mechanism [2] stops puppet, erases
>>>>>> > nodes and reboots them into bootstrap. It's ok for stop
>>>>>> > deployment, but it's not ok for patching, because the user
>>>>>> > can lose his data. We can rewrite some logic in nailgun
>>>>>> > and in the orchestrator to stop puppet and not erase nodes.
>>>>>> > But I'm not sure it will work correctly, because this use
>>>>>> > case wasn't tested. And I can foresee problems such as
>>>>>> > cleaning up yum/apt-get locks after puppet is interrupted.
>>>>>> >
>>>>>> > As a result I have several questions:
>>>>>> > 1. Should we try to make it work for the current release?
>>>>>> > 2. If we shouldn't, will we need this feature for future
>>>>>> >     releases? Additional design and research is definitely
>>>>>> >     required.
>>>>>> >
>>>>>> > [1] https://bugs.launchpad.net/fuel/+bug/1364907
>>>>>> > [2] https://github.com/stackforge/fuel-astute/blob/b622d9b36dbdd1e03b282b9ee5b7435ba649e711/lib/astute/server/dispatcher.rb#L163-L164
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Mailing list: https://launchpad.net/~fuel-dev
>>>>>> > Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
>>>>>> > Unsubscribe : https://launchpad.net/~fuel-dev
>>>>>> > More help   : https://help.launchpad.net/ListHelp
>>>>>> >
>>>>> 
>> 
>> 
> 
> 
> 
> -- 
> Mike Scherbakov
> #mihgen
> 
> 
> 
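
A minimal sketch of the staged patching idea from Mike Scherbakov's quoted
message above: patch a few hosts first, check the result, then continue with
the rest in small batches rather than all nodes at once. It also records a
per-node status, loosely in the spirit of the state tracking suggested by
Bogdan Dobrelya. Everything here is hypothetical and for illustration only:
NodeState, batches(), staged_patch(), the patch_node callback and the
canary_size / batch_size / max_failures parameters are assumptions, not
Nailgun or Astute APIs, and the statuses live in memory, not in the Nailgun DB.

```python
from enum import Enum
from typing import Callable, Dict, Iterator, List


class NodeState(Enum):
    """Hypothetical per-node patching states (not taken from Nailgun)."""
    PENDING = "pending"
    PATCHING = "patching"
    PATCHED = "patched"
    FAILED = "failed"


def batches(nodes: List[str], canary_size: int, batch_size: int) -> Iterator[List[str]]:
    """Yield a small canary batch first, then the rest in fixed-size batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    canary = nodes[:canary_size]
    if canary:
        yield canary
    rest = nodes[canary_size:]
    for start in range(0, len(rest), batch_size):
        yield rest[start:start + batch_size]


def staged_patch(
    nodes: List[str],
    patch_node: Callable[[str], bool],
    canary_size: int = 2,
    batch_size: int = 10,
    max_failures: int = 0,
) -> Dict[str, NodeState]:
    """Patch nodes batch by batch and stop once too many nodes have failed.

    patch_node is an assumed callback that returns True on success.
    Nodes in batches that were never started stay PENDING and untouched.
    """
    states = {node: NodeState.PENDING for node in nodes}
    failures = 0
    for batch in batches(nodes, canary_size, batch_size):
        for node in batch:
            states[node] = NodeState.PATCHING
            try:
                ok = patch_node(node)
            except Exception:
                # Treat any error (e.g. a broken disk) as a failed node.
                ok = False
            states[node] = NodeState.PATCHED if ok else NodeState.FAILED
            if not ok:
                failures += 1
        if failures > max_failures:
            break  # do not touch the remaining batches
    return states
```

With max_failures=0 a single failure in the canary batch leaves every other
node untouched; a higher value lets patching continue on the other nodes
after an isolated node failure.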


Follow ups

Re: Stop openstack patching feature
From: Mike Scherbakov, 2014-09-10

References

Stop openstack patching feature
From: Evgeniy L, 2014-09-08
Re: Stop openstack patching feature
From: Igor Kalnitsky, 2014-09-09
Re: Stop openstack patching feature
From: Vladimir Kuklin, 2014-09-09
Re: Stop openstack patching feature
From: Mike Scherbakov, 2014-09-09
Re: Stop openstack patching feature
From: bdobrelia, 2014-09-09
Re: Stop openstack patching feature
From: Evgeniy L, 2014-09-09