fuel-dev team mailing list archive
Message #00534
Re: Bonding problems
After 14 hours of flood ping, the hardware lab lost a few packets and the
virtual env lost hundreds of packets. Mode: balance-slb.
I'm going to test LACP behaviour today.
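
For the LACP test, the conclusion quoted below (balance-tcp needs successful LACP negotiation) can be captured in a small helper. This is only a sketch: it builds an ovs-vsctl argument list and does not run it, and the bridge, bond and interface names are hypothetical, not taken from either lab.

```python
# Sketch only: builds an ovs-vsctl command line, it does not execute anything.
# "br-bond", "bond0", "eth1" and "eth2" are hypothetical names.

# Per the quoted thread: balance-tcp needs successful LACP negotiation.
LACP_REQUIRED_MODES = {"balance-tcp"}


def add_bond_args(bridge, bond, ifaces, mode, lacp="off"):
    """Return an argv list that creates a bond and sets bond_mode and lacp."""
    if len(ifaces) < 2:
        raise ValueError("a bond needs at least two interfaces")
    if lacp not in ("active", "passive", "off"):
        raise ValueError("lacp must be active, passive or off")
    if mode in LACP_REQUIRED_MODES and lacp == "off":
        raise ValueError(mode + " requires LACP to be enabled")
    return (["ovs-vsctl", "add-bond", bridge, bond] + list(ifaces)
            + ["--", "set", "port", bond,
               "bond_mode=" + mode, "lacp=" + lacp])


if __name__ == "__main__":
    args = add_bond_args("br-bond", "bond0", ["eth1", "eth2"],
                         "balance-tcp", lacp="active")
    print(" ".join(args))
```

Rejecting balance-tcp with lacp=off up front is a design choice of this sketch, so that a bond cannot be created in the state that stopped all traffic in the labs.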
On Tue, Feb 25, 2014 at 3:50 AM, Andrey Danin <adanin@xxxxxxxxxxxx> wrote:
> Fine. They wrote about that in the documentation too:
> http://openvswitch.org/ovs-vswitchd.conf.db.5.pdf (page 14). It was
> introduced two years ago, in version 1.5.0. One problem less!
>
>
> On Tue, Feb 25, 2014 at 3:37 AM, Ryan Moe <rmoe@xxxxxxxxxxxx> wrote:
>
>> Andrey is correct. It appears that balance-tcp requires successful LACP
>> negotiation. See here:
>> https://github.com/osrg/openvswitch/blob/master/lib/bond.c#L610 and
>> here: https://github.com/osrg/openvswitch/blob/master/lib/bond.c#L1438.
>> This also means that when we create bonds with balance-tcp we need to
>> configure lacp as well.
>>
>>
>> On Mon, Feb 24, 2014 at 3:14 PM, Andrey Danin <adanin@xxxxxxxxxxxx> wrote:
>>
>>> And yes, the bug https://bugs.launchpad.net/fuel/+bug/1272842 and the
>>> current problem may be unrelated, but they produce similar error messages in
>>> the OVS logs.
>>>
>>>
>>> On Tue, Feb 25, 2014 at 2:55 AM, Andrey Danin <adanin@xxxxxxxxxxxx> wrote:
>>>
>>>> Guys, I set up hardware (2 nodes) and software (3 nodes) labs today
>>>> with ISO #181 to test bonding. Unfortunately, balance-tcp mode is totally
>>>> broken. When I use it during deployment or switch to it in a working
>>>> cluster, all traffic stops. Playing with the rebalance interval doesn't help.
>>>> In contrast, balance-slb works fine. Both Ubuntu (hardware nodes)
>>>> and CentOS (virtual env) work without any traffic loss. I'm running a
>>>> flood ping between virtual instances inside the clouds overnight and
>>>> will check the number of lost packets. I also want to play with iperf.
>>>>
>>>> Next things we can do:
>>>> * Build an ISO with the stable (1.9.3) or newest (2.0.x) version of OVS and
>>>> play with them. Yesterday we decided to build Ubuntu 12.04 with the Debian Sid
>>>> 1.9.3 version of OVS. There is a ticket about that:
>>>> https://mirantis.jira.com/browse/OSCI-1089 Igor also built his own
>>>> version of an ISO with the Sid package.
>>>> * Dump the OpenFlow rules in balance-tcp mode and try to fix them. It's
>>>> hard to do that because aliens developed their syntax.
>>>> * Run Igor's tests again and again until balance-slb starts blocking
>>>> traffic. Then dig into the OpenFlow rules.
>>>> * Play with LACP on real hardware. Maybe balance-tcp can be used only
>>>> with lacp=active.
>>>> * Ask the openvswitch community about our problems.
>>>>
>>>> Andrew, yes, the PXE network is still nailed to an interface. I hope we
>>>> will fix it in 5.0.
>>>>
>>>>
>>>> On Tue, Feb 25, 2014 at 12:20 AM, Igor Shishkin <ishishkin@xxxxxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Hello, Dmitry.
>>>>>
>>>>> It’s 100% reproducible in a virtual environment when we try to
>>>>> deploy bonding in balance-tcp or balance-slb mode.
>>>>> The tests are meant as a way to reproduce the problem and as a warning
>>>>> about why they will fail once they are merged.
>>>>>
>>>>> As far as we can see, the problem is in the rebalance procedure that
>>>>> openvswitch runs once it has started the bonded interface. During that
>>>>> time the bonded interface stops accepting ARPs.
>>>>>
>>>>> I just built openvswitch=1.9.3, which is LTS, and want to try it in the
>>>>> same case, and also try decreasing bond-rebalance-interval to 0 (as Andrey K.
>>>>> suggested). If either of these helps, it could be the solution (but I'm
>>>>> really not sure bond-rebalance-interval=0 is a good idea).
>>>>> —
>>>>> Igor Shishkin
>>>>> QA Engineer
>>>>>
>>>>>
>>>>>
>>>>> On 24 Feb 2014, at 23:59, Dmitry Borodaenko <dborodaenko@xxxxxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>> > Mike, Igor,
>>>>> >
>>>>> > Can you provide more details on how the integration test in review
>>>>> > #75161 helps to reproduce bug #1272842?
>>>>> >
>>>>> > As far as I understand, the bug is a highly intermittent problem with
>>>>> > ARP that was only showing up after an environment with LACP bonding
>>>>> > was operational for at least a few hours.
>>>>> >
>>>>> > On the other hand, the problem Igor is reporting based on the
>>>>> > integration test sounds like something 100% reproducible that doesn't
>>>>> > require real hardware or LACP and is not necessarily related to ARP.
>>>>> >
>>>>> > Are you sure you're not confusing two unrelated problems?
>>>>> >
>>>>> > Thanks,
>>>>> > -DmitryB
>>>>> >
>>>>> >
>>>>> > On Mon, Feb 24, 2014 at 9:18 AM, Mike Scherbakov
>>>>> > <mscherbakov@xxxxxxxxxxxx> wrote:
>>>>> >> The issue is here: https://bugs.launchpad.net/fuel/+bug/1272842.
>>>>> >> Those who know what can be wrong with our openvswitch/kernel, please
>>>>> >> provide your input.
>>>>> >>
>>>>> >>
>>>>> >> On Mon, Feb 24, 2014 at 9:04 PM, Igor Shishkin <ishishkin@xxxxxxxxxxxx>
>>>>> >> wrote:
>>>>> >>>
>>>>> >>> Hello,
>>>>> >>>
>>>>> >>> Currently we have this review, https://review.openstack.org/#/c/75161,
>>>>> >>> with test cases for our brand new shiny bonding feature, but the
>>>>> >>> balance-tcp/balance-slb modes are not working for now.
>>>>> >>>
>>>>> >>> Steps to reproduce are very simple:
>>>>> >>> Create a cluster with simple or HA configuration, select the balance-tcp
>>>>> >>> or balance-slb bonding mode and start deployment.
>>>>> >>>
>>>>> >>> Deployment will not finish successfully because of problems with the
>>>>> >>> rebalance procedure.
>>>>> >>> --
>>>>> >>> Igor Shishkin
>>>>> >>> QA Engineer
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >> Mike Scherbakov
>>>>> >> #mihgen
>>>>> >>
>>>>> >> --
>>>>> >> Mailing list: https://launchpad.net/~fuel-dev
>>>>> >> Post to : fuel-dev@xxxxxxxxxxxxxxxxxxx
>>>>> >> Unsubscribe : https://launchpad.net/~fuel-dev
>>>>> >> More help : https://help.launchpad.net/ListHelp
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Dmitry Borodaenko
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Andrey Danin
>>>> adanin@xxxxxxxxxxxx
>>>> skype: gcon.monolake
>>>>
>>>
>>>
>>>
>>> --
>>> Andrey Danin
>>> adanin@xxxxxxxxxxxx
>>> skype: gcon.monolake
>>>
>>>
>>>
>>
>
>
> --
> Andrey Danin
> adanin@xxxxxxxxxxxx
> skype: gcon.monolake
>
--
Andrey Danin
adanin@xxxxxxxxxxxx
skype: gcon.monolake