touch-packages team mailing list archive
Message #129901
[Bug 1536021] Re: [xenial/armhf] lxc-stop --kill hangs forever, container pid 1 in 'D' state
As another data point, I don't see this on the s390x boxes which are
also running xenial userspace with the same LXC setup, but a 4.3 kernel.
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to lxc in Ubuntu.
https://bugs.launchpad.net/bugs/1536021
Title:
[xenial/armhf] lxc-stop --kill hangs forever, container pid 1 in 'D' state
Status in lxc package in Ubuntu:
New
Bug description:
Since I upgraded our armhf autopkgtest boxes from wily to xenial, I
very often get eternal hangs on lxc-stop:
adt-virt-lxc-egctlo RUNNING 10.0.3.154 - - NO
root 15766 0.0 0.0 5044 1488 ? S Jan19 0:00 lxc-stop --kill --name adt-virt-lxc-egctlo
I can still attach to the container, and it seems pid1 is in some
"uninterruptible deep kernel sleep":
$ sudo lxc-attach -n adt-virt-lxc-egctlo ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 5060 2344 ? Ds Jan19 0:00 /sbin/init
root 230 0.0 0.1 12112 2224 ? Ss Jan19 0:00 /lib/systemd/systemd-journald
root 263 0.0 0.0 3372 1060 ? Ss Jan19 0:00 /sbin/dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp
root 321 0.0 0.0 4896 912 ? Ss Jan19 0:00 /usr/sbin/cron -f
syslog 329 0.0 0.0 31148 1424 ? Ssl Jan19 0:00 /usr/sbin/rsyslogd -n
message+ 349 0.0 0.0 4860 1540 ? Ss Jan19 0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopid
root 358 0.0 0.0 0 0 ? Zs Jan19 0:00 [systemd-logind] <defunct>
root 384 0.0 0.0 3848 692 pts/3 Ss+ Jan19 0:00 /sbin/agetty --noclear --keep-baud pts/3 115200 38400 9600 vt220
root 386 0.0 0.0 3848 692 pts/0 Ss+ Jan19 0:00 /sbin/agetty --noclear --keep-baud pts/0 115200 38400 9600 vt220
root 389 0.0 0.0 3848 692 pts/1 Ss+ Jan19 0:00 /sbin/agetty --noclear --keep-baud pts/1 115200 38400 9600 vt220
root 391 0.0 0.0 0 0 ? Zs Jan19 0:00 [agetty] <defunct>
root 393 0.0 0.0 5064 1028 ? Ss Jan19 0:00 (agetty)
root 4907 0.0 0.0 5652 1176 ? S Jan19 0:00 reboot
root 4917 0.0 0.0 0 0 ? Zs Jan19 0:00 [ondemand] <defunct>
root 5747 0.0 0.0 0 0 ? Z Jan19 0:00 [bash] <defunct>
root 5748 0.0 0.0 0 0 ? Z Jan19 0:00 [bash] <defunct>
root 7168 0.0 0.0 0 0 ? Z Jan19 0:00 [dkms] <defunct>
root 8516 0.0 0.0 0 0 ? Z Jan19 0:00 [dkms] <defunct>
root 21174 0.0 0.0 6788 1304 pts/3 R+ 07:20 0:00 ps aux
The journal in the container still works, but does not show anything
interesting. systemctl hangs because pid 1 is stuck in this 'D' state,
and for the same reason stracing pid 1 is useless.
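For anyone trying to spot the same symptom on other boxes, here is a minimal sketch that picks the 'D' state processes out of "ps aux"-style output like the listing above. The function name filter_d_state is made up for illustration and is not part of lxc or procps; it only assumes the usual "ps aux" column order, with STAT as the eighth field.

```shell
#!/bin/sh
# Hypothetical helper (not part of lxc or procps): reads "ps aux"-style
# output on stdin and prints the PID and command name of every process
# whose STAT field starts with "D" (uninterruptible sleep).
# Assumed column order:
#   USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
filter_d_state() {
    # NR > 1 skips the header line; $8 is STAT, $2 is PID, $11 is the
    # first word of COMMAND.
    awk 'NR > 1 && $8 ~ /^D/ { print $2, $11 }'
}

# Possible use with the container from this report:
#   sudo lxc-attach -n adt-virt-lxc-egctlo ps aux | filter_d_state
#
# Since strace does not help on a task in 'D' state, the kernel side can
# be inspected from the host instead, using the host-side PID of the
# container's init (lxc-info reports it):
#   cat /proc/<pid>/wchan        # kernel function the task sleeps in
#   sudo cat /proc/<pid>/stack   # kernel stack, if the kernel provides it
```

Fed the listing above, this would print only the line for pid 1 (/sbin/init), since the zombies carry 'Z' and the rest are in 'S' or 'R' state.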
These boxes are still running the trusty kernel 3.13, as newer kernels
don't boot on those boxes (the block devices are missing, probably a
missing block driver?), so this regression is not due to a kernel
change.
So the cause is somewhere among lxc, lxcfs, systemd, and cgmanager. I'll
bisect these packages over the next few days to find out, as so far I
don't have a way to reproduce this reliably.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1536021/+subscriptions