kernel-packages team mailing list archive
Message #125286
[Bug 1466135] Re: nf_conntrack releases a conntrack with non-zero refcnt
Apologies for the delay on this; I've been travelling. Thanks Chris &
others for following up. We've locally cherry-picked this patch and
confirmed it fixes our issue. Happy to test again with an official deb
if someone can point me at one.
Local reproduction instructions:
- Install Ubuntu 14.04.[01] (kernel 3.13.0-40-generic)
- Get a docker image that includes the OVS dependencies
- Build openvswitch from https://github.com/justinpettit/ovs/tree/conntrack
- Instructions to build here: https://github.com/justinpettit/ovs/blob/conntrack/INSTALL.Debian.md
- Install and load openvswitch module on host. (dpkg -i *.deb, modprobe openvswitch)
In one shell:
# ip addr add dev docker0 192.168.0.2/24; ping 192.168.0.1
(leave running)
In another shell (this assumes $PWD contains the openvswitch debs and the repro script from below):
$ docker run -i -t --entrypoint=bash --privileged=true -v $PWD:/host <docker image with OVS deps>
$ cd /host; ./repro.sh 192.168.0.1
(wait until first shell shows that pings are flowing)
$ ovs-ofctl dump-flows br0
(should show two flows, each of which is receiving traffic. One has actions=ct(commit,recirc))
$ conntrack -L
(Optional; can see the ICMP connection listed)
Now:
- Press Ctrl+D to exit the container. It is a little slow to exit.
- Subsequent container starts or "ip netns add foo" will hang.
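To check for the hang without tying up a shell, something like the following could be used. This is only a sketch: the helper name check_hang and the 10-second limit are illustrative and not part of the original report. It polls rather than trying to kill the command, on the assumption that a process blocked inside the kernel may not respond to signals.

```shell
#!/bin/bash
# check_hang: hypothetical helper, not from the original report.
# Starts a command in the background and polls once per second.
# Prints HUNG if it is still running after <limit> seconds,
# RETURNED otherwise. The command's stdout is sent to stderr so that
# the helper's own stdout carries only the verdict.
check_hang() {
    local limit=$1; shift
    "$@" >&2 &
    local pid=$!
    local waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "HUNG"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "RETURNED"
}
```

On an affected host, "check_hang 10 ip netns add foo" (as root) would be expected to print HUNG after exiting the container; on a fixed kernel it should print RETURNED. Note that RETURNED only means the command finished, not that it succeeded.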
$ cat repro.sh
#!/bin/bash
IP=$1
cd /host
dpkg -i openvswitch-common*deb openvswitch-switch*deb
service openvswitch-switch restart
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ip link set dev br0 up
ip addr add dev br0 $IP/24
ip addr
ovs-ofctl add-flow br0 "conn_state=-trk,ip actions=ct(commit,recirc)"
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1466135
Title:
nf_conntrack releases a conntrack with non-zero refcnt
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Trusty:
Fix Committed
Bug description:
[Impact]
Occasionally, starting new containers or creating new net namespaces may cause a soft lockup because of improper refcounting of conntrack entries.
In the issue that I face, I can find a kworker thread using up an
entire core, and when I cat /proc/$pid/stack I see this:
[<ffffffffbe01e9b6>] ___preempt_schedule+0x56/0xb0
[<ffffffffc02223e4>] nf_ct_iterate_cleanup+0x134/0x160 [nf_conntrack]
[<ffffffffc0223dae>] nf_conntrack_cleanup_net_list+0x4e/0x170 [nf_conntrack]
[<ffffffffc022436d>] nf_conntrack_pernet_exit+0x4d/0x60 [nf_conntrack]
[<ffffffffbe6040d3>] ops_exit_list.isra.1+0x53/0x60
[<ffffffffbe6048d0>] cleanup_net+0x100/0x1d0
[<ffffffffbe084991>] process_one_work+0x171/0x470
[<ffffffffbe08563b>] worker_thread+0x11b/0x3a0
[<ffffffffbe08bb82>] kthread+0xd2/0xf0
[<ffffffffbe71757c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff
The kworker is looping forever and failing to clean up conntrack state,
all the while holding the global netns lock. Given that I've bisected
to commit e53376bef2cd97d3e3f61fdc677fb8da7d03d0da, which deals with
refcounting, I suspect that broken refcounting on conntrack entries makes
them impossible to free or destroy properly. That prevents this worker
from cleaning up the namespace, which in turn blocks anything else from
interacting with namespaces (add, delete, etc.).
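The manual stack check above could be automated roughly as follows. This is a sketch only. It assumes root (needed to read /proc/<pid>/stack) and that the stuck thread shows nf_ct_iterate_cleanup in its stack, as in the trace above. The function name find_stuck_cleanup and its optional proc-root argument are illustrative.

```shell
#!/bin/bash
# find_stuck_cleanup: hypothetical helper, not from the original report.
# Prints the PID of every task whose kernel stack mentions
# nf_ct_iterate_cleanup. The optional argument overrides the /proc
# location, which is mainly useful for trying it on saved copies.
find_stuck_cleanup() {
    local proc_root=${1:-/proc}
    local stack pid
    for stack in "$proc_root"/[0-9]*/stack; do
        # Skip unreadable entries (not root, or the task already exited).
        [ -r "$stack" ] || continue
        if grep -q 'nf_ct_iterate_cleanup' "$stack" 2>/dev/null; then
            pid=${stack%/stack}   # strip the trailing /stack
            pid=${pid##*/}        # keep only the PID directory name
            echo "$pid"
        fi
    done
}
```

A match only says that the task was inside that function when sampled. Running the check a few times and seeing the same PID each time is a stronger hint that the worker is looping.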
[Test Case]
bug 1403152 has a testcase which can occasionally hit this issue
[Fix]
$ git describe --contains e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
v3.14-rc3~36^2~28^2~12
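For readers unfamiliar with the "git describe --contains" format: the part before the first ~ or ^ is the tag the commit is described relative to, so the output above says the commit is reachable from v3.14-rc3. A minimal sketch of pulling that tag out with shell parameter expansion (the variable names are illustrative):

```shell
#!/bin/bash
# Output of "git describe --contains" as quoted above.
desc='v3.14-rc3~36^2~28^2~12'
# Drop everything from the first ~ or ^ onwards, leaving the tag name.
tag=${desc%%[~^]*}
echo "$tag"
```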
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1466135/+subscriptions