kernel-packages team mailing list archive

[Bug 1505564] Re: Soft lockup with "block nbdX: Attempted send on closed socket" spam

 

> Overall, as far as the nova logs show, there are 0 writes on the nbd device and very few reads (probably just the MBR?).
> Could that still cause inflight I/O when qemu-nbd -d is run?

"very few" > 0
:-)

The I/O could also be coming from elsewhere, but we don't need to
account for where it comes from; the simple fact that it's there is
enough.  Also, it's not just data I/O: it's any "request", including
metadata/control requests.  Network-backed devices can disappear at any
time, and the driver must be able to handle that.  Spamming endless
messages to the log isn't a good idea in that case.

To clarify the exact code in this situation:

while ((req = blk_fetch_request(q)) != NULL) {
...
        if (unlikely(!nbd->sock)) {
                dev_err(disk_to_dev(nbd->disk), "Attempted send on closed socket\n");
...
                continue;
        }

So, as soon as the connection (socket) is gone, an "Attempted..."
message is printed for every request still in the queue, while the
queue is being cleared.
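
To make the one-message-per-request behaviour concrete, here is a
small userspace sketch of the same loop shape.  It is not the nbd
driver: struct fake_request, struct fake_queue, struct fake_nbd,
fake_fetch_request() and drain_queue() are all hypothetical stand-ins
invented for illustration, and fprintf() stands in for dev_err().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures; not the real nbd types. */
struct fake_request {
        int id;
};

struct fake_queue {
        struct fake_request *reqs;  /* pending requests */
        size_t count;               /* number of entries in reqs */
        size_t next;                /* index of the next request to fetch */
};

struct fake_nbd {
        bool sock_open;             /* false once the connection is gone */
};

/* Returns the next pending request, or NULL when the queue is empty. */
static struct fake_request *fake_fetch_request(struct fake_queue *q)
{
        if (q->next >= q->count)
                return NULL;
        return &q->reqs[q->next++];
}

/*
 * Drains the queue the same way the quoted loop does and returns how
 * many error messages were emitted along the way.
 */
size_t drain_queue(struct fake_nbd *nbd, struct fake_queue *q)
{
        struct fake_request *req;
        size_t messages = 0;

        assert(nbd != NULL && q != NULL);

        while ((req = fake_fetch_request(q)) != NULL) {
                if (!nbd->sock_open) {
                        /* One message per queued request, as in the driver loop. */
                        fprintf(stderr, "Attempted send on closed socket\n");
                        messages++;
                        continue;
                }
                /* A real driver would send the request over the socket here. */
        }
        return messages;
}
```

Calling drain_queue() on a queue of five pending requests with
sock_open set to false returns 5, so the amount of log output scales
directly with the number of requests that were still queued when the
socket went away.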

> I'll happily test your kernel PPA, but as far as I can see, you don't
> mention where it actually is :)

Ha, forgot to paste it in, sorry :-)

https://launchpad.net/~ddstreet/+archive/ubuntu/lp1505564

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1505564

Title:
  Soft lockup with "block nbdX: Attempted send on closed socket" spam

Status in linux package in Ubuntu:
  In Progress

Bug description:
  Some of our nova compute hosts regularly freeze, sometimes for a few
  hours, with kern.log getting spammed with:

  block nbdX: Attempted send on closed socket

  and a few "CPU soft lockup" messages (see attached log). This clears
  up when the queue gets cleared, e.g.:

  block nbdX: queue cleared

  The hosts run trusty with kernel version 3.19.0-30-generic.
  --- 
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Nov 24 12:23 seq
   crw-rw---- 1 root audio 116, 33 Nov 24 12:23 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.19
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  DistroRelease: Ubuntu 14.04
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: HP ProLiant DL385 G7
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-36-generic root=UUID=13289ac9-8dc9-4feb-b6bd-ca7db66b21d6 ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M nox2apic intremap=off
  ProcVersionSignature: Ubuntu 3.19.0-36.41~14.04.1hf00090138v20151122b1-generic 3.19.8-ckt9
  RelatedPackageVersions:
   linux-restricted-modules-3.19.0-36-generic N/A
   linux-backports-modules-3.19.0-36-generic  N/A
   linux-firmware                             1.127.18
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty uec-images
  Uname: Linux 3.19.0-36-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  dmi.bios.date: 02/02/2014
  dmi.bios.vendor: HP
  dmi.bios.version: A18
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrA18:bd02/02/2014:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL385 G7
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1505564/+subscriptions

