fuel-dev team mailing list archive

Re: network verification (use case)

 

Team,

Can you please check/comment on the new L2/L3 network checker blueprint
filed by Alex Shaposhnikov?
https://blueprints.launchpad.net/fuel/+spec/l23-net-checker

Alex has been a part of multiple OpenStack deployments and brings some real
experience from the field, so we should take his input seriously.

Thanks,
Roman


On Mon, Jan 27, 2014 at 9:56 AM, Dmitry Borodaenko <dborodaenko@xxxxxxxxxxxx> wrote:

> On Mon, Jan 27, 2014 at 12:51 AM, Andrew Woodward <xarses@xxxxxxxxx> wrote:
>
>> Wow, it's interesting to know that L3 isn't tested in the verify
>> networks task. That is probably part of the problem I'm seeing and IMHO is
>> more important to verify than the VLAN range. We don't currently track
>> gateways, so at the moment we don't know who we can ping during the
>> pre-deployment checks.
>>
>> However, my idea is to also add a clause in Puppet to force it to fail if
>> it can ping the primary controller. This will cause nodes after the first
>> controller to fail more rapidly (right after or during execution of the
>> l23networks code; there are already ping hooks there too). This will help
>> spot nodes with errors for which proceeding further is useless anyway.
>>
>
> Typo: I'd assume you meant "fail if it cannot ping the primary controller"
> here.
>
>> Now, with the research we've been doing for multiple L3/L2 domains, we will
>> have to know gateways for each network, so we will have those available to
>> ping in cases where multi-L3 is used.
>>
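To make the gateway idea concrete, here is a rough Python sketch of a per-network gateway check. It is only an illustration under assumptions: the `gateways` mapping, the network names and both function names are hypothetical and not taken from the Fuel code, and `ping_once` assumes the Linux iputils `ping` command (`-c` for count, `-W` for timeout in seconds).

```python
import subprocess
from typing import Callable, Dict, List


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request with the Linux iputils `ping` command."""
    cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def unreachable_gateways(
    gateways: Dict[str, str],
    ping: Callable[[str], bool] = ping_once,
) -> List[str]:
    """Return the names of networks whose gateway did not answer.

    `gateways` maps a (hypothetical) network name to its gateway address.
    """
    return [name for name, gateway in gateways.items() if not ping(gateway)]
```

The `ping` argument is injectable so the logic can be exercised without real network access, for example with a stub that answers only for known addresses.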
>> Andrew
>>
>>
>> On Fri, Jan 24, 2014 at 5:30 AM, Miroslav Anashkin <
>> manashkin@xxxxxxxxxxxx> wrote:
>>
>>> My 5c.
>>>
>>> 1. Full network verification may take hours if there are hundreds of
>>> VLANs configured. So, to avoid timeouts, divide the verification process
>>> into smaller batches. Make the timeout configurable: as I remember from
>>> the Cyan case, their timeouts were 1+ minute to start up and calibrate
>>> the connection for each optical NIC.
>>>
>>> 2. Network verification is a fully independent feature and is not even
>>> mandatory, so make it a background process. Let it show the current
>>> verification progress and report errors as soon as it finds any.
>>>
>>> 3. Add a feature to allow network verification against a selected node or
>>> interface, with priority over the common background verification. Make the
>>> green/gray/red network port icon a button.
>>>
>>> 4. 95% of network misconfiguration errors can be found by verifying a
>>> single node. Let us check network settings against a single node first.
>>>
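Points 1 and 2 above could look roughly like the Python sketch below. It is a minimal illustration, not the existing verifier: `check_batch` is a hypothetical callable that probes one batch of VLAN IDs within `timeout_s` seconds and returns the IDs that failed, and the default batch size and timeout are arbitrary.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def batched(vlan_ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Split the VLAN IDs into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(vlan_ids), batch_size):
        yield list(vlan_ids[start:start + batch_size])


def verify_in_batches(
    vlan_ids: Sequence[int],
    check_batch: Callable[[List[int], float], List[int]],
    batch_size: int = 50,
    timeout_s: float = 60.0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (batch, failed_ids) after each batch finishes.

    Because this is a generator, the caller can show progress and report
    errors as soon as a batch completes instead of waiting for the full run.
    """
    for batch in batched(vlan_ids, batch_size):
        yield batch, check_batch(batch, timeout_s)


if __name__ == "__main__":
    # Stand-in for a real probe: pretend VLAN 104 is misconfigured.
    def fake_check(batch: List[int], timeout_s: float) -> List[int]:
        return [vlan for vlan in batch if vlan == 104]

    for batch, failed in verify_in_batches(range(100, 110), fake_check, batch_size=4):
        print(f"checked {batch}, failed {failed}")
```

Keeping the timeout a per-batch parameter means slow hardware, such as the optical NICs mentioned in point 1, only needs a larger `timeout_s` rather than a code change.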
>>> Kind regards,
>>> Miroslav
>>>
>>>
>>> On Fri, Jan 24, 2014 at 2:56 PM, Mike Scherbakov <
>>> mscherbakov@xxxxxxxxxxxx> wrote:
>>>
>>>> Network verification will fail if any required interfaces are down.
>>>>
>>>> Let's first discuss here any improvements we could make to this feature,
>>>> before creating the blueprint. I'm all in for finding out what we can do
>>>> better here, as network issues look to be the most frequent thing that
>>>> happens in real deployments.
>>>>
>>>>
>>>> On Fri, Jan 24, 2014 at 1:06 PM, Andrey Danin <adanin@xxxxxxxxxxxx> wrote:
>>>>
>>>>> Gleb, to get interfaces' states from DB you can do this:
>>>>>> http://paste.openstack.org/show/61805/
>>>>>>
>>>>>>
>>>>>> On Fri, Jan 24, 2014 at 4:30 AM, Roman Alekseenkov <
>>>>>> ralekseenkov@xxxxxxxxxxxx> wrote:
>>>>>>
>>>>>>> Gleb - thanks for bringing this up. I like the proposal, actually.
>>>>>>> Whatever makes the life of deployment engineers easier...
>>>>>>>
>>>>>>> Evgeny - the thing you mentioned cannot be a full solution for two
>>>>>>> reasons. The first is scale (nobody will click on each node to check the
>>>>>>> status of its NICs), the second is mass configuration (people are likely to
>>>>>>> configure NICs for multiple nodes at once and again won't go into
>>>>>>> individual nodes). You can imagine how bad it's going to be with 100+
>>>>>>> nodes...
>>>>>>>
>>>>>>> Andrew Woodward and Alex Shaposhnikov also have some specific
>>>>>>> suggestions on how to improve network verification and make it more
>>>>>>> meaningful. Guys - please speak up.
>>>>>>>
>>>>>>> I'd like to see a consolidated blueprint on launchpad from you guys
>>>>>>> (Gleb, Andrew, and Alex), which David and I can take and prioritize.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jan 23, 2014 at 7:50 AM, Mike Scherbakov <
>>>>>>> mscherbakov@xxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> Our verify network feature (activated by the corresponding button on
>>>>>>>> the networks tab) verifies L2 connectivity of OpenStack networks. It
>>>>>>>> configures the desired networking on bootstrap nodes, and sends UDP
>>>>>>>> packets over the required interfaces.
>>>>>>>> We also check for unwanted DHCP traffic. There are no checks at L3
>>>>>>>> at the moment.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> On Jan 23, 2014 7:10 PM, "Evgeniy L" <eli@xxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> Hi Gleb,
>>>>>>>>>
>>>>>>>>> Regarding the state of interfaces, we have such a feature right now.
>>>>>>>>> It was merged and should be available in the 4.0 release:
>>>>>>>>>
>>>>>>>>> https://github.com/stackforge/fuel-web/commit/0e60ff862d75d8d2ef37d2b0f8d834260f8349b6
>>>>>>>>>
>>>>>>>>> But as far as I know, it doesn't work correctly in VirtualBox.
>>>>>>>>>
>>>>>>>>> [image: Inline image 2]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jan 23, 2014 at 12:05 PM, Gleb Galkin <
>>>>>>>>> ggalkin@xxxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hello, all
>>>>>>>>>>
>>>>>>>>>> Right now I have a bunch of nodes and each has 4 network
>>>>>>>>>> interfaces. How can I check that every interface on every node is UP,
>>>>>>>>>> that all switch ports are configured properly and there are no
>>>>>>>>>> connectivity problems?
>>>>>>>>>>
>>>>>>>>>> We have network verification in the GUI, but it can't provide me with
>>>>>>>>>> detailed information about all network issues, can it?
>>>>>>>>>> I'd like to get detailed information like
>>>>>>>>>>
>>>>>>>>>> on the node number X interface eth2 (xx:xx:xx:xx:xx:xx) has link
>>>>>>>>>> status 'down'.
>>>>>>>>>> or
>>>>>>>>>> on the node number X interface eth3 (xx:xx:xx:xx:xx:xx) is up but
>>>>>>>>>> it can't ping fuel node
>>>>>>>>>>
>>>>>>>>>> and so on
>>>>>>>>>>
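A per-interface report of this kind could be produced along the lines of the Python sketch below. It is only an illustration: the `InterfaceCheck` record and its fields are hypothetical, and how the link state and reachability are actually collected from the nodes is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class InterfaceCheck:
    """Hypothetical result of probing one interface on one node."""
    node: int
    name: str
    mac: str
    link_up: bool
    # None means reachability was not tested (for example, the link is down).
    master_reachable: Optional[bool] = None


def describe(check: InterfaceCheck) -> Optional[str]:
    """Return a human-readable problem line, or None if nothing is wrong."""
    prefix = f"on node {check.node} interface {check.name} ({check.mac})"
    if not check.link_up:
        return f"{prefix} has link status 'down'"
    if check.master_reachable is False:
        return f"{prefix} is up but it can't ping fuel node"
    return None


def report(checks: Iterable[InterfaceCheck]) -> List[str]:
    """Collect the problem lines for all interfaces that have an issue."""
    return [line for line in map(describe, checks) if line is not None]
```

Healthy interfaces produce no line, so on a large cluster the report lists only what needs attention.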
>>>>>>>>>> It's good to have this information BEFORE you press the Deploy button.
>>>>>>>>>>
>>>>>>>>>> Maybe we already have something like this? Maybe our network
>>>>>>>>>> verification writes some report about the network issues?
>>>>>>>>>>
>>>>>>>>>> If we don't have this feature, we should consider it. It'll save a
>>>>>>>>>> lot of time for deployers.
>>>>>>>>>> We can use mcollective to bring up all network interfaces on all nodes
>>>>>>>>>> and arping the Fuel node or something.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best Regards,
>>>>>>>>>> Gleb Galkin
>>>>>>>>>> OpenStack Deployment Engineer
>>>>>>>>>>
>>>>>>>>>> Mirantis Inc.
>>>>>>>>>> www.mirantis.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Mailing list: https://launchpad.net/~fuel-dev
>>>>>>>>>> Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>> Unsubscribe : https://launchpad.net/~fuel-dev
>>>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Andrey Danin
>>>>>> adanin@xxxxxxxxxxxx
>>>>>> skype: gcon.monolake
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Mike Scherbakov
>>>> #mihgen
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Kind Regards
>>>
>>> Miroslav Anashkin, L2 support engineer,
>>> Mirantis Inc.
>>> +7(495)640-4944 (office receptionist)
>>> +1(650)587-5200 (office receptionist, call from US)
>>> 35b, Bld. 3, Vorontsovskaya St.
>>> Moscow, Russia, 109147.
>>>
>>> www.mirantis.com
>>>
>>> manashkin@xxxxxxxxxxxx
>>>
>>>
>>>
>>>
>>
>>
>> --
>> If google has done it, Google did it right!
>>
>>
>>
>
>
> --
> Dmitry Borodaenko
>
>
>
