Message #85971
[Bug 1926838] [NEW] [OVN] infinite loop in ovsdb_monitor
Public bug reported:
I am running the OVN sandbox, a second chassis, and Neutron. I
synchronize the Neutron database with the sandbox databases, run
neutron-server, and possibly run a few ovs-vsctl commands on the chassis
to set up OVS ports.
I notice that some commands on the chassis can trigger what looks like
an infinite loop in Neutron. For example, running
ovs-vsctl set open . external-ids:ovn-cms-options=enable-chassis-as-gw
ovs-vsctl set open . external-ids:ovn-cms-options=xx
ovs-vsctl set open . external-ids:ovn-cms-options=enable-chassis-as-gw
on the second chassis triggers transactions "in a loop" on the
neutron-server:
...
Successfully bumped revision number for resource f32ac6cc (type: ports) to 571
Router 079cde19-0b92-48f8-bef2-5e35b939a7a1 is bound to host sandbox
Running txn n=1 command(idx=0): CheckRevisionNumberCommand
Running txn n=1 command(idx=1): UpdateLRouterPortCommand
Running txn n=1 command(idx=2): SetLRouterPortInLSwitchPortCommand
Successfully bumped revision number for resource f32ac6cc (type: router_ports) to 572
Running txn n=1 command(idx=0): CheckRevisionNumberCommand
Running txn n=1 command(idx=1): SetLSwitchPortCommand
Running txn n=1 command(idx=2): PgDelPortCommand
Successfully bumped revision number for resource f32ac6cc (type: ports) to 572
Router 079cde19-0b92-48f8-bef2-5e35b939a7a1 is bound to host sandbox
Running txn n=1 command(idx=0): CheckRevisionNumberCommand
Running txn n=1 command(idx=1): UpdateLRouterPortCommand
Running txn n=1 command(idx=2): SetLRouterPortInLSwitchPortCommand
Successfully bumped revision number for resource f32ac6cc (type: router_ports) to 573
Running txn n=1 command(idx=0): CheckRevisionNumberCommand
Running txn n=1 command(idx=1): SetLSwitchPortCommand
Running txn n=1 command(idx=2): PgDelPortCommand
...
This is not limited to changing external-ids:ovn-cms-options; other
ovs-vsctl commands can trigger the same issue.
neutron-server CPU consumption jumps to 100% and the revision_number of
the ports keeps increasing. Restarting neutron-server fixes the issue
temporarily.
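As a rough way to confirm the loop from the neutron-server log, the following minimal sketch counts the "bumped revision number" lines per resource. It is not part of Neutron: the function names and the `min_bumps` threshold are made up for illustration, and the regular expression assumes the log lines look exactly like the excerpt quoted in this report.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Successfully bumped revision number for resource f32ac6cc (type: ports) to 571
# The pattern is an assumption based on the excerpt in this report.
BUMP_RE = re.compile(
    r"Successfully bumped revision number for resource (?P<res>\S+) "
    r"\(type: (?P<type>\w+)\) to (?P<rev>\d+)"
)


def revision_bumps(lines):
    """Return {(resource, type): [revision numbers in log order]}."""
    bumps = defaultdict(list)
    for line in lines:
        match = BUMP_RE.search(line)
        if match:
            key = (match["res"], match["type"])
            bumps[key].append(int(match["rev"]))
    return dict(bumps)


def looping_resources(lines, min_bumps=2):
    """Hypothetical heuristic: flag resources whose revision was bumped at
    least `min_bumps` times with strictly increasing values."""
    flagged = {}
    for key, revs in revision_bumps(lines).items():
        increasing = all(a < b for a, b in zip(revs, revs[1:]))
        if len(revs) >= min_bumps and increasing:
            flagged[key] = (revs[0], revs[-1])
    return flagged


# Bump lines copied from the log excerpt in this report.
SAMPLE = [
    "Successfully bumped revision number for resource f32ac6cc (type: ports) to 571",
    "Running txn n=1 command(idx=0): CheckRevisionNumberCommand",
    "Successfully bumped revision number for resource f32ac6cc (type: router_ports) to 572",
    "Successfully bumped revision number for resource f32ac6cc (type: ports) to 572",
    "Successfully bumped revision number for resource f32ac6cc (type: router_ports) to 573",
]

if __name__ == "__main__":
    for (res, typ), (first, last) in looping_resources(SAMPLE).items():
        print(f"{res} ({typ}): revision {first} -> {last}")
```

Feeding it a longer slice of the real log (for example the attached "logs of one loop") should show the same resource being bumped repeatedly while no API request is being made; a larger `min_bumps` reduces false positives from ordinary updates.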
I am not sure how to provide a simple reproducer because I did not find
any instructions for running Neutron standalone with two OVN chassis. I
will investigate what is happening locally.
Version: main branch of OVN (d41a337fe3b608a8f90de8722d148344011f0bd8)
and of Neutron (94d36862c207b1e4d984d28874ca2f3bd09c855f)
It's not a blocker as long as it happens only on my laptop.
** Affects: neutron
Importance: Undecided
Status: New
** Tags: ovn
** Attachment added: "logs of one loop"
https://bugs.launchpad.net/bugs/1926838/+attachment/5494052/+files/logs1
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1926838