fuel-dev team mailing list archive
Message #00710
Re: Issues with RabbitMQ update
hi, comments inline
On Thu, Mar 27, 2014 at 10:50 AM, Vladimir Sharshov
<vsharshov@xxxxxxxxxxxx> wrote:
>
> I tried to reproduce the problem with the new RabbitMQ from this email, and everything works
> fine without any delay with 7 astute workers.
>
> I think we should build an ISO with the new RabbitMQ and the ttl fixes (today I spent
> time reinvestigating the already solved ttl problem).
>
Do you know an easy way to do it?
>
> Several questions for you:
>
> - does this problem repeat only in HA-case
> https://bugs.launchpad.net/fuel/+bug/1278336 ?
>
No, I'm testing it by adding two nodes to a cluster and pressing "Verify
Networks" on the Network tab. I wait 10 minutes to see the results in the GUI.
>
> - does the problem occur every time or only occasionally?
>
Almost every time. It worked once, with 2 astute workers, but after a
restart it stopped working.
>
> - does this problem affect 'generate diagnostic snapshot' (does it take
> much longer)?
>
I don't know. I tested it on the 4.1 branch. I will repeat the tests on
trunk.
> I use a simple CentOS cluster (1 controller + 1 compute) and 'generate
> diagnostic snapshot', because a few months ago the UI had significant
> delays in these cases. As a result, everything works without delay. Now I'll try with
> HA.
>
>
> On Tue, Mar 25, 2014 at 3:41 PM, Mike Scherbakov <mscherbakov@xxxxxxxxxxxx>
> wrote:
>
>> Great findings, Lukasz!
>>
>> Adding larger audience of fuel-dev..
>>
>>
>> On Tue, Mar 25, 2014 at 3:37 PM, Lukasz Oles <loles@xxxxxxxxxxxx> wrote:
>>
>>> Vladimir,
>>>
>>> there is no ISO, just install the newest rabbitmq. I attached an rpm package
>>> for CentOS.
>>>
>>> I have done some more investigation, and the number of workers actually
>>> doesn't matter. It just gives random results, but I think I found a solution.
>>>
>>> Naily uses an asynchronous library to communicate with RabbitMQ: the
>>> amqp library, which is built on EventMachine. In Naily the event loop runs
>>> in the main thread, but the consumer runs in another thread and the publisher
>>> in yet another.
>>>
>>> To solve the hanging problem I moved the publisher and
>>> consumer code into an EM::next_tick block. After this everything works
>>> again.
>>> EM::next_tick does two things. First, it schedules code to run in the next
>>> event loop iteration. Second, it runs this code in the event loop thread. I'm
>>> not sure which of these things helps.
>>> Debugging async code in threads is really hard. Why is Naily using an async
>>> library in the first place?
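
A minimal sketch of the two effects of EM::next_tick described above, written with only the Ruby standard library so that it runs without the eventmachine gem. `TinyLoop` is a hypothetical stand-in, not EventMachine's or Naily's code; it only shows worker threads handing blocks to a single loop thread instead of doing the work themselves.

```ruby
require "thread"

# Hypothetical stand-in for an event loop; NOT EventMachine itself.
class TinyLoop
  def initialize
    @ticks = Queue.new          # thread-safe queue of pending blocks
  end

  # Similar in spirit to EM.next_tick: may be called from any thread;
  # the block is queued and later executed by the loop thread.
  def next_tick(&block)
    @ticks << block
  end

  # Ask the loop to exit once the blocks queued so far have run.
  def stop
    @ticks << :stop
  end

  # Run queued blocks one by one on the calling thread.
  def run
    loop do
      job = @ticks.pop          # waits until something is queued
      break if job == :stop
      job.call
    end
  end
end

reactor = TinyLoop.new
ran_on = []                     # only ever touched by the loop thread

workers = 3.times.map do |i|
  Thread.new do
    # Workers never "publish" directly; they hand the work to the loop.
    reactor.next_tick { ran_on << [i, Thread.current] }
  end
end
workers.each(&:join)

reactor.stop
reactor.run                     # the main thread acts as the loop thread

# Every block ran on the loop (main) thread, not on its worker thread.
puts ran_on.all? { |_, thread| thread == Thread.main }
```

In this sketch the shared state is only ever modified from one thread, which is the property the second effect of next_tick provides; whether that or the deferral to the next iteration is what fixes the hang in Naily is left open here, as in the message above.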
>>>
>>> What do you think about it? Maybe it would be better to just move to a
>>> synchronous library like bunny?
>>>
>>> Please remember I'm not a Ruby programmer, so I may be missing something
>>> here.
>>>
>>> Regards
>>>
>>>
>>> On Tue, Mar 25, 2014 at 10:42 AM, Vladimir Sharshov
>>> <vsharshov@xxxxxxxxxxxx> wrote:
>>>
>>>> Guys, please share a link to the ISO. Without it I cannot say anything
>>>> useful about a potential problem with naily. Thanks!
>>>>
>>>>
>>>> On Mon, Mar 24, 2014 at 4:34 PM, Andrey Danin <adanin@xxxxxxxxxxxx> wrote:
>>>>
>>>>> A huge ttl value was set so that nodes with unsynchronized
>>>>> clocks can still use mcollective. If the master node's local time is more
>>>>> than 6000 seconds in the past compared with the target nodes, these target
>>>>> nodes will not be able to answer via mcollective.
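
To illustrate the clock-skew effect described above, here is a rough sketch. It assumes the receiving node rejects a request when its own clock minus the sender's timestamp exceeds ttl. The `expired?` helper and all numbers are made up for illustration; mcollective's actual check is not reproduced here.

```ruby
# Hypothetical helper: true when a receiver would treat the message as expired,
# assuming the rule "age = receiver clock - sender timestamp; reject if age > ttl".
def expired?(sent_at, received_at, ttl)
  (received_at - sent_at) > ttl
end

master_clock = 1_395_650_000          # arbitrary example epoch on the master
skew         = 7_200                  # assumed: node clock is 2 hours ahead
node_clock   = master_clock + skew    # node's view of "now" at delivery

small_ttl_result = expired?(master_clock, node_clock, 6_000)
huge_ttl_result  = expired?(master_clock, node_clock, 2_000_000_000)

puts small_ttl_result   # message looks 7200 s old, older than ttl 6000
puts huge_ttl_result    # same skew is well within the huge ttl
```

Under this assumption a freshly sent message already looks two hours old to the node, so it is dropped with ttl = 6000 but accepted with the huge value.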
>>>>>
>>>>>
>>>>> On Mon, Mar 24, 2014 at 2:07 PM, Vladimir Sharshov
>>>>> <vsharshov@xxxxxxxxxxxx> wrote:
>>>>>
>>>>>> Hi all!
>>>>>>
>>>>>> > When I changed number of workers in naily from 3 to 2 everything
>>>>>> > started to work
>>>>>> At the moment we have increased this value to 7, as I remember. As for the
>>>>>> shared connection, at the moment it works without any issues. Because of this, a
>>>>>> limit of only 2 clients in the new version looks very strange.
>>>>>>
>>>>>> Please share a link to the ISO; I'll try to reproduce and investigate this
>>>>>> problem. Thanks!
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 19, 2014 at 1:36 PM, Mike Scherbakov
>>>>>> <mscherbakov@xxxxxxxxxxxx> wrote:
>>>>>>
>>>>>>> Vladimir - I think you've been working with Naily workers, any
>>>>>>> thoughts on the issue?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 19, 2014 at 1:04 PM, Andrey Korolyov
>>>>>>> <akorolev@xxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> On 03/19/2014 12:52 PM, Dmitry Pyzhov wrote:
>>>>>>>> > + more guys.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Wed, Mar 19, 2014 at 12:10 PM, Lukasz Oles <loles@xxxxxxxxxxxx
>>>>>>>> > <mailto:loles@xxxxxxxxxxxx>> wrote:
>>>>>>>> >
>>>>>>>> > Hello guys,
>>>>>>>> >
>>>>>>>> > After a lot of testing and debugging I finally have something to share.
>>>>>>>> >
>>>>>>>> > The first change is in the mcollective settings. In
>>>>>>>> > /etc/mcollective/server.cfg the value ttl = 2000000000 is too big;
>>>>>>>> > RabbitMQ returns an error. I changed it to 6000; without this
>>>>>>>> > mcollective will not work.
>>>>>>>>
>>>>>>>> The problem is not in the value itself but in the way the rmq driver
>>>>>>>> pushes it. Somehow it ends up as a *concatenation* of the default value (around 10k)
>>>>>>>> and this one, and the concatenated result is definitely too large.
>>>>>>>> Just remove this value from the config entirely, as I did before;
>>>>>>>> three hours is acceptable enough.
>>>>>>>> >
>>>>>>>> > Another problem is with the task status update. In my tests it hangs for
>>>>>>>> > about 10 minutes. After that the task is updated. Unfortunately it's not
>>>>>>>> > a problem with python but with naily. When I changed the number of
>>>>>>>> > workers in naily from 3 to 2, everything started to work. I think
>>>>>>>> > it's because all threads are using the same connection and channel to
>>>>>>>> > publish results, but I'm still investigating it.
>>>>>>>> >
>>>>>>>> > Regards
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Thu, Mar 13, 2014 at 12:38 PM, Dmitry Pyzhov
>>>>>>>> > <dpyzhov@xxxxxxxxxxxx <mailto:dpyzhov@xxxxxxxxxxxx>> wrote:
>>>>>>>> >
>>>>>>>> > Lukasz,
>>>>>>>> >
>>>>>>>> > Feel free to contact us if you need anything else.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Thu, Mar 13, 2014 at 3:22 PM, Lukasz Oles
>>>>>>>> > <loles@xxxxxxxxxxxx <mailto:loles@xxxxxxxxxxxx>> wrote:
>>>>>>>> >
>>>>>>>> > ok, thx for rpm
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Thu, Mar 13, 2014 at 11:40 AM, Dmitry Burmistrov
>>>>>>>> > <dburmistrov@xxxxxxxxxxxx <mailto:dburmistrov@xxxxxxxxxxxx>> wrote:
>>>>>>>> >
>>>>>>>> > Package rabbitmq-server has been built from changeset:
>>>>>>>> > http://gerrit.mirantis.com/13455
>>>>>>>> > RPM Repository URL:
>>>>>>>> > http://osci-obs.vm.mirantis.net:82/centos-fuel-5.0-stable-13455/centos
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Package rabbitmq-server has been built from changeset:
>>>>>>>> > http://gerrit.mirantis.com/13457
>>>>>>>> > DEB Repository URL:
>>>>>>>> > http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.0-stable-13457/ubuntu
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Thu, Mar 13, 2014 at 2:11 PM, Dmitry Pyzhov
>>>>>>>> > <dpyzhov@xxxxxxxxxxxx <mailto:dpyzhov@xxxxxxxxxxxx>> wrote:
>>>>>>>> > > Lukasz,
>>>>>>>> > >
>>>>>>>> > > Sorry for the late response. Our OSCI team will build a package. Dmitry B, could
>>>>>>>> > > you reply with the download link when it is available? Ticket OSCI-1016.
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > On Wed, Mar 12, 2014 at 2:18 PM, Lukasz Oles
>>>>>>>> > > <loles@xxxxxxxxxxxx <mailto:loles@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>
>>>>>>>> > >> Sure, I will look into it. Can you give me a link to rabbitmq rpm which
>>>>>>>> > >> you used?
>>>>>>>> > >>
>>>>>>>> > >> Regards,
>>>>>>>> > >>
>>>>>>>> > >>
>>>>>>>> > >> On Wed, Mar 12, 2014 at 11:10 AM, Dmitry Pyzhov
>>>>>>>> > >> <dpyzhov@xxxxxxxxxxxx <mailto:dpyzhov@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>>
>>>>>>>> > >>> Lukasz,
>>>>>>>> > >>>
>>>>>>>> > >>> actually we have no idea what is wrong with fresh rabbitmq. For some
>>>>>>>> > >>> reason refresh of task status takes too much time. Dmitry tried to find the
>>>>>>>> > >>> root cause, but did not succeed. Could you investigate the issue?
>>>>>>>> > >>>
>>>>>>>> > >>>
>>>>>>>> > >>> On Tue, Mar 11, 2014 at 9:49 PM, Lukasz Oles
>>>>>>>> > >>> <loles@xxxxxxxxxxxx <mailto:loles@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>>>
>>>>>>>> > >>>> Dmitry,
>>>>>>>> > >>>>
>>>>>>>> > >>>> rabbitmq update looks interesting, I can look into it. Do I need any
>>>>>>>> > >>>> additional information?
>>>>>>>> > >>>>
>>>>>>>> > >>>> regards,
>>>>>>>> > >>>>
>>>>>>>> > >>>>
>>>>>>>> > >>>> On Tue, Mar 11, 2014 at 2:05 PM, Dmitry Pyzhov
>>>>>>>> > >>>> <dpyzhov@xxxxxxxxxxxx <mailto:dpyzhov@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>>>>
>>>>>>>> > >>>>> Great!
>>>>>>>> > >>>>>
>>>>>>>> > >>>>> Lukasz, could you help us with the rabbitmq update? We faced an issue with
>>>>>>>> > >>>>> it: https://bugs.launchpad.net/fuel/+bug/1278336
>>>>>>>> > >>>>>
>>>>>>>> > >>>>> Also, could you participate in the design review:
>>>>>>>> > >>>>> https://docs.google.com/document/d/1zqV58LZBLQ-0gllb_i3MyIKIMj-Qx8ELJohjcWs459s/edit?usp=sharing
>>>>>>>> > >>>>>
>>>>>>>> > >>>>>
>>>>>>>> > >>>>> On Mon, Mar 10, 2014 at 9:01 PM, Mike Scherbakov
>>>>>>>> > >>>>> <mscherbakov@xxxxxxxxxxxx <mailto:mscherbakov@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Lukasz,
>>>>>>>> > >>>>>> please take any bugs from https://launchpad.net/fuel/+milestone/5.0
>>>>>>>> > >>>>>> which are not assigned to a particular person. You are likely to be interested
>>>>>>>> > >>>>>> in those which are assigned to "fuel-python". Of course, it's preferred to
>>>>>>>> > >>>>>> work on Critical and High priority bugs first.
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> We are in the design phase for 5.0. Please take a look at
>>>>>>>> > >>>>>> https://mirantis.jira.com/wiki/display/PRD/5.0+-+Mirantis+OpenStack+release+home+page .
>>>>>>>> > >>>>>> I'm still discussing this with management, and we will likely have only part
>>>>>>>> > >>>>>> of what is on the page. Your comments and input into the design docs (which you
>>>>>>>> > >>>>>> can find by following the blueprint link, then "Read the full spec") are very
>>>>>>>> > >>>>>> welcome.
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Start looking around and try to identify the area that interests you
>>>>>>>> > >>>>>> most. Dmitry/Evgeny will help to identify areas where help is most
>>>>>>>> > >>>>>> needed. Sorry for not responding to you in time. I'll get my team to fix
>>>>>>>> > >>>>>> this.
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> FYI: Today is holiday in Russia & Ukraine
>>>>>>>> > >>>>>> Thanks,
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> On Thu, Mar 6, 2014 at 12:37 AM, Mike Scherbakov
>>>>>>>> > >>>>>> <mscherbakov@xxxxxxxxxxxx <mailto:mscherbakov@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>> It's great. I would be happy to see Lukasz working on Fuel.
>>>>>>>> > >>>>>>> Actually, Lukasz is already doing a great job helping us resolve Nailgun
>>>>>>>> > >>>>>>> scalability issues.
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>> Dmitry, please arrange a meeting between our Python engineers and
>>>>>>>> > >>>>>>> Lukasz, and identify areas where Lukasz's contribution will be the most
>>>>>>>> > >>>>>>> effective. It should be aligned with the development of our engineers too.
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>> Thanks,
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>> On Tue, Mar 4, 2014 at 3:27 PM, Piotr Siwczak
>>>>>>>> > >>>>>>> <psiwczak@xxxxxxxxxxxx <mailto:psiwczak@xxxxxxxxxxxx>> wrote:
>>>>>>>> > >>>>>>>>
>>>>>>>> > >>>>>>>> Mike,
>>>>>>>> > >>>>>>>>
>>>>>>>> > >>>>>>>> For now I see Lukasz has finished his work for Softlayer/Express (at
>>>>>>>> > >>>>>>>> least for the next few weeks) and can use his time to engage in Fuel
>>>>>>>> > >>>>>>>> development. Please feel free to assign him to Fuel tasks.
>>>>>>>> > >>>>>>>>
>>>>>>>> > >>>>>>>> -Piotr
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>>
>>>>>>>> > >>>>>>> --
>>>>>>>> > >>>>>>> Mike Scherbakov
>>>>>>>> > >>>>>>> #mihgen
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> --
>>>>>>>> > >>>>>> Mike Scherbakov
>>>>>>>> > >>>>>> #mihgen
>>>>>>>> > >>>>>
>>>>>>>> > >>>>>
>>>>>>>> > >>>>
>>>>>>>> > >>>>
>>>>>>>> > >>>>
>>>>>>>> > >>>> --
>>>>>>>> > >>>> Łukasz Oleś
>>>>>>>> > >>>
>>>>>>>> > >>>
>>>>>>>> > >>
>>>>>>>> > >>
>>>>>>>> > >>
>>>>>>>> > >> --
>>>>>>>> > >> Łukasz Oleś
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Łukasz Oleś
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Łukasz Oleś
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > You received this message because you are subscribed to the Google
>>>>>>>> > Groups "fuel-core-team" group.
>>>>>>>> > To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send
>>>>>>>> > an email to fuel-core-team+unsubscribe@xxxxxxxxxxxx
>>>>>>>> > <mailto:fuel-core-team+unsubscribe@xxxxxxxxxxxx>.
>>>>>>>> > For more options, visit
>>>>>>>> https://groups.google.com/a/mirantis.com/d/optout.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Mike Scherbakov
>>>>>>> #mihgen
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Andrey Danin
>>>>> adanin@xxxxxxxxxxxx
>>>>> skype: gcon.monolake
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Łukasz Oleś
>>>
>>
>>
>>
>> --
>> Mike Scherbakov
>> #mihgen
>>
>
>
--
Łukasz Oleś