kernel-packages team mailing list archive
Message #154782
[Bug 1505564] Re: Soft lockup with "block nbdX: Attempted send on closed socket" spam
Ok, nm about the sosreport - I got the info from some older emails from
axino: nova is using qemu-nbd to locally mount images and access the
partitions inside them. I was able to trivially reproduce this by
creating an image, attaching it to /dev/nbd0 with qemu-nbd,
partitioning it, running mkfs on its p1 and mounting that, and then,
while copying a file to it, running qemu-nbd -d to detach it from
/dev/nbd0. That causes the spam of "Attempted..." error messages.
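For anyone wanting to repeat the reproduction, here is a rough sketch of the steps as a Python helper that only prints the shell commands and runs nothing. The image name, size, mount point, payload file, the use of parted and the choice of ext4 are all assumptions made for illustration; the steps above only say "partitioning it and mkfs". The printed commands need root and should only be tried on a scratch machine.

```python
def repro_commands(image="test.img", dev="/dev/nbd0",
                   mnt="/mnt/nbdtest", payload="/tmp/bigfile"):
    """Return the reproduction steps as shell lines, in order.

    Nothing is executed. All file names, the 1G size, parted and ext4
    are hypothetical choices, not taken from the original report.
    """
    part = dev + "p1"
    return [
        f"qemu-img create -f raw {image} 1G",
        "modprobe nbd max_part=8",
        f"qemu-nbd --format=raw -c {dev} {image}",
        f"parted -s {dev} mklabel msdos mkpart primary ext4 1MiB 100%",
        f"mkfs.ext4 {part}",
        f"mkdir -p {mnt}",
        f"mount {part} {mnt}",
        # The copy is backgrounded so the detach happens while I/O is in flight.
        f"cp {payload} {mnt}/ &",
        f"qemu-nbd -d {dev}",
    ]


if __name__ == "__main__":
    for line in repro_commands():
        print(line)
```

The payload should be large enough that the copy is still running when the final qemu-nbd -d is issued.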
So this appears to be a simple case of nova calling qemu-nbd -d while
there is still I/O to the image. The right thing to do is simply
ratelimit the error messages (and they really should be anyway, as
they're printing directly inside a loop). The messages themselves do
not indicate any kernel error, simply that the nbd device was removed
while being written to.
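To illustrate why ratelimiting turns the flood into a few lines, here is a small userspace model of interval/burst ratelimiting. It is only a sketch of the idea and not the actual kernel change: the class and function names are made up, and the 5 second interval and burst of 10 are assumed to mirror the kernel's default ratelimit settings.

```python
class RateLimit:
    """Allow at most `burst` messages per `interval_ms` window (illustrative model)."""

    def __init__(self, interval_ms=5000, burst=10):
        self.interval_ms = interval_ms
        self.burst = burst
        self.begin = None   # start time of the current window, in ms
        self.printed = 0    # messages let through in the current window
        self.missed = 0     # messages suppressed so far (cumulative here)

    def allow(self, now_ms):
        # Open a new window on first use or once the interval has elapsed.
        if self.begin is None or now_ms - self.begin >= self.interval_ms:
            self.begin = now_ms
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False


def simulate(n_errors, step_ms=1, **kwargs):
    """Feed n_errors messages, one every step_ms, through a RateLimit."""
    rl = RateLimit(**kwargs)
    shown = sum(1 for i in range(n_errors) if rl.allow(i * step_ms))
    return shown, rl.missed


if __name__ == "__main__":
    shown, missed = simulate(10000)
    print(f"shown={shown} suppressed={missed}")
```

With one error per millisecond for ten seconds, this model lets 20 messages through and suppresses the other 9980, which is the kind of reduction to expect from a loop that previously printed on every iteration.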
Can you try this kernel PPA to see if it fixes the problem? You will
still see the error messages, but only a few lines since they'll be
ratelimited.
Of course there is still the (probably more serious) problem of the
serial port driver hanging a cpu and eating up memory; that probably
deserves its own bug, since it is triggered by this but is a separate
issue.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1505564
Title:
Soft lockup with "block nbdX: Attempted send on closed socket" spam
Status in linux package in Ubuntu:
In Progress
Bug description:
Some of our nova compute hosts regularly freeze, sometimes for a few
hours, with kern.log getting spammed with:
block nbdX: Attempted send on closed socket
and a few "CPU soft lockup" messages (see attached log). This clears
up when the queue gets cleared, e.g.:
block nbdX: queue cleared
trusty hosts with kernel version 3.19.0-30-generic.
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Nov 24 12:23 seq
crw-rw---- 1 root audio 116, 33 Nov 24 12:23 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL385 G7
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=screen-256color
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-36-generic root=UUID=13289ac9-8dc9-4feb-b6bd-ca7db66b21d6 ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M nox2apic intremap=off
ProcVersionSignature: Ubuntu 3.19.0-36.41~14.04.1hf00090138v20151122b1-generic 3.19.8-ckt9
RelatedPackageVersions:
linux-restricted-modules-3.19.0-36-generic N/A
linux-backports-modules-3.19.0-36-generic N/A
linux-firmware 1.127.18
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.19.0-36-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 02/02/2014
dmi.bios.vendor: HP
dmi.bios.version: A18
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrA18:bd02/02/2014:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL385 G7
dmi.sys.vendor: HP
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1505564/+subscriptions