kernel-packages team mailing list archive

[Bug 1505564] Re: Soft lockup with "block nbdX: Attempted send on closed socket" spam

 

OK, never mind about the sosreport.  I got the info from some older
emails from axino: nova is using qemu-nbd to locally mount images and
access the partitions inside them.  I was able to trivially reproduce
this by creating an image, attaching it to /dev/nbd0 with qemu-nbd,
partitioning it, running mkfs on the first partition (p1), and mounting
it; then, while copying a file to it, running qemu-nbd -d to detach the
image from /dev/nbd0.  That causes the spam of "Attempted..." error
messages.

So this appears to be a simple case of nova calling qemu-nbd -d while
there is still I/O to the image.  The right thing to do is simply to
ratelimit the error messages (they should be ratelimited anyway, since
they are printed directly inside a loop).  The messages themselves do
not indicate any kernel error, only that the nbd device was removed
while being written to.
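
To illustrate what ratelimiting a message printed inside a loop means,
here is a minimal user-space sketch of an interval/burst limiter.  It is
not the kernel patch from the PPA; the struct, the function names, and
the interval and burst values are all hypothetical and chosen only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model of interval/burst rate limiting. */
struct ratelimit {
    long interval;  /* window length, in caller-defined ticks */
    int burst;      /* messages allowed per window */
    long begin;     /* tick at which the current window started */
    int printed;    /* messages allowed so far in this window */
    int missed;     /* messages suppressed so far in this window */
    bool started;   /* false until the first call opens a window */
};

/* Returns true if the caller may print a message at tick `now`. */
static bool ratelimit_allow(struct ratelimit *rs, long now)
{
    assert(rs != NULL);

    /* Open a new window on first use or once the interval has elapsed. */
    if (!rs->started || now - rs->begin >= rs->interval) {
        if (rs->missed > 0)
            fprintf(stderr, "%d messages suppressed\n", rs->missed);
        rs->begin = now;
        rs->printed = 0;
        rs->missed = 0;
        rs->started = true;
    }

    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }
    rs->missed++;
    return false;
}

/*
 * Simulates a loop that hits the same error `attempts` times, one per
 * tick starting at `start`.  Returns how many lines were actually printed.
 */
static int simulate_send_errors(struct ratelimit *rs, long start, int attempts)
{
    int emitted = 0;

    for (int i = 0; i < attempts; i++) {
        if (ratelimit_allow(rs, start + i)) {
            printf("block nbd0: Attempted send on closed socket\n");
            emitted++;
        }
    }
    return emitted;
}
```

With a limiter initialised as { .interval = 5000, .burst = 10 }, a loop
that fails 1000 times within one window prints only the first 10 lines
and counts the rest as suppressed, which is the "only a few lines"
behaviour described for the test kernel.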

Can you try this kernel PPA to see if it fixes the problem?  You will
still see the error messages, but only a few lines since they'll be
ratelimited.


Of course there is still the (probably more serious) problem of the
serial port driver hanging a CPU and eating up memory.  That probably
deserves its own bug: it is triggered by this one, but it is a separate
issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1505564

Title:
  Soft lockup with "block nbdX: Attempted send on closed socket" spam

Status in linux package in Ubuntu:
  In Progress

Bug description:
  Some of our nova compute hosts regularly freeze, sometimes for a few
  hours, with kern.log getting spammed with:

  block nbdX: Attempted send on closed socket

  and a few "CPU soft lockup" messages (see attached log). This clears
  up when the queue gets cleared, eg :

  block nbdX: queue cleared

  trusty hosts with kernel version 3.19.0-30-generic.
  --- 
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Nov 24 12:23 seq
   crw-rw---- 1 root audio 116, 33 Nov 24 12:23 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.19
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  DistroRelease: Ubuntu 14.04
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: HP ProLiant DL385 G7
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-36-generic root=UUID=13289ac9-8dc9-4feb-b6bd-ca7db66b21d6 ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M nox2apic intremap=off
  ProcVersionSignature: Ubuntu 3.19.0-36.41~14.04.1hf00090138v20151122b1-generic 3.19.8-ckt9
  RelatedPackageVersions:
   linux-restricted-modules-3.19.0-36-generic N/A
   linux-backports-modules-3.19.0-36-generic  N/A
   linux-firmware                             1.127.18
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty uec-images
  Uname: Linux 3.19.0-36-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  dmi.bios.date: 02/02/2014
  dmi.bios.vendor: HP
  dmi.bios.version: A18
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrA18:bd02/02/2014:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL385 G7
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1505564/+subscriptions

