
kernel-packages team mailing list archive

[Bug 1046285] Re: NFS client hang with lots of simultaneous operations

 

Mark Thompson, this bug report is being closed due to your last comment
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1046285/comments/21
regarding this being fixed with an update. For future reference you can
manage the status of your own bugs by clicking on the current status in
the yellow line and then choosing a new status in the revealed drop-down
box. You can learn more about bug statuses at
https://wiki.ubuntu.com/Bugs/Status. Thank you again for taking the time
to report this bug and helping to make Ubuntu better. Please submit any
future bugs you may find.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1046285

Title:
  NFS client hang with lots of simultaneous operations

Status in “linux” package in Ubuntu:
  Invalid

Bug description:
  When lots of simultaneous NFS operations from different processes are
  happening, sometimes all of the processes get stuck in kernel space
  (uninterruptible sleep) and make no forward progress.  The network
  connection is not the problem (the NFS server is still reachable by
  other means - ping, ssh).  This has happened to me five times in the
  last few weeks (four times randomly and once when trying to reproduce
  it), since upgrading to 12.04 (it never happened here on 10.04).

  The processes which are stuck can be killed with SIGKILL, but are
  totally unresponsive to anything else.  Attempting to access the NFS
  mount from a process that is not stuck will immediately get that one
  stuck as well.  The problem can be "fixed" by sending SIGKILL to all stuck
  processes (being careful not to create any more - if one of the stuck
  processes was running from a binary on the NFS mount then ps can hang
  too as it tries to stat it) and unmounting the filesystem.  After
  remounting, everything works as expected again.  With an NFS home
  directory (my random failure case), this basically means that that one
  user is totally stuck (any access to their home directory hangs the
  process which does it) and has to have all their processes killed by
  root to bring the machine back to a working state.
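
  The recovery step described above could be scripted roughly as
  follows.  This is only a sketch, not something from the original
  report: it assumes that stuck processes show state "D" in
  /proc/<pid>/stat, and the helper names (parse_stat_line,
  stuck_processes, kill_all) are hypothetical.  It reads only the stat
  file rather than calling ps; whether that avoids the ps hang mentioned
  above is untested.  State "D" also covers ordinary disk waits, so the
  list should be reviewed before anything is killed.

```python
import os
import signal


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so locate the last ')' instead of splitting on spaces.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def stuck_processes(proc_root="/proc"):
    """Return (pid, comm) for every process in uninterruptible sleep."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


def kill_all(pids, kill=os.kill):
    """Send SIGKILL to each pid; `kill` is injectable for dry runs."""
    for pid in pids:
        kill(pid, signal.SIGKILL)
```

  Run as root, something like kill_all(pid for pid, _ in
  stuck_processes()) followed by unmounting and remounting the
  filesystem would correspond to the manual procedure.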

  The kernel log doesn't mention anything at all, but magic sysrq 'w'
  was able to extract stack traces of all the blocked processes (see
  attached) - they are all stuck in NFS-related RPC calls.
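
  For machines without a convenient keyboard, the same dump can usually
  be requested by writing the command character to /proc/sysrq-trigger
  as root.  The sketch below assumes that interface is enabled on the
  running kernel; the function names are hypothetical and the report
  does not say which method was actually used.  The traces end up in
  the kernel log, not in the return value.

```python
def trigger_sysrq(key, path="/proc/sysrq-trigger"):
    """Write one SysRq command character to the trigger file (needs root)."""
    if len(key) != 1:
        raise ValueError("SysRq command must be a single character")
    with open(path, "w") as f:
        f.write(key)


def dump_blocked_tasks(path="/proc/sysrq-trigger"):
    """Ask the kernel to log stack traces of blocked (state D) tasks."""
    trigger_sysrq("w", path)
```

  After calling dump_blocked_tasks(), the traces can be read back with
  dmesg.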

  In general it has happened while building a large source tree on an
  NFS mount, with many forked processes all competing to talk to the
  filesystem at the same time.  There are no special mount options -
  it's just a vanilla v3 NFS mount with 'rw' set.  I was able to
  reproduce it once by this method - it happened after leaving six
  eight-way-forked builds (repeatedly cleaning and building their tree)
  going for several hours (none of the random failures involved a load
  anywhere near this heavy at the time, though, as far as I can tell).
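
  The reproduction load described above might be approximated with a
  small harness like the one below.  It is a sketch under assumptions:
  the report does not name the build tool, so "make clean" followed by
  "make -j8" is a guess, and the tree paths in the usage note are
  placeholders.  The runner argument exists so the loop can be
  exercised without running a real build.

```python
import subprocess
import threading


def cycle_commands(jobs=8):
    """Commands for one clean-and-build cycle (assumes a make-based tree)."""
    return [["make", "clean"], ["make", "-j{}".format(jobs)]]


def run_cycles(tree, cycles, jobs=8, runner=subprocess.call):
    """Repeat the clean-and-build cycle inside `tree`."""
    for _ in range(cycles):
        for cmd in cycle_commands(jobs):
            runner(cmd, cwd=tree)


def stress(trees, cycles, jobs=8, runner=subprocess.call):
    """Run one build loop per tree concurrently, one thread per tree."""
    threads = [
        threading.Thread(target=run_cycles, args=(tree, cycles, jobs, runner))
        for tree in trees
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

  Six copies of a source tree on the NFS mount, for example
  /mnt/nfs/tree1 to /mnt/nfs/tree6 (hypothetical paths), passed to
  stress() with a large cycle count would match the "six eight-way-forked
  builds" load.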

  This problem looks similar: http://www.spinics.net/lists/linux-nfs/msg32318.html .  However, I don't know enough about the NFS internals to say that it is the same.  The script suggested there to reproduce that problem does not fail for me in an hour of running.
  --- 
  AcpiTables: Error: command ['sudo', 'LC_MESSAGES=C', 'LANGUAGE=', '/usr/share/apport/dump_acpi_tables.py'] failed with exit code 1: mrt is not in the sudoers file.  This incident will be reported.
  AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.0.1-0ubuntu12
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/controlC2', '/dev/snd/pcmC2D0c', '/dev/snd/by-id', '/dev/snd/controlC0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/by-path', '/dev/snd/controlC1', '/dev/snd/hwC1D0', '/dev/snd/hwC1D3', '/dev/snd/pcmC1D0c', '/dev/snd/pcmC1D0p', '/dev/snd/pcmC1D1p', '/dev/snd/pcmC1D2c', '/dev/snd/pcmC1D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  Card0.Amixer.info: Error: [Errno 2] No such file or directory
  Card0.Amixer.values: Error: [Errno 2] No such file or directory
  Card1.Amixer.info: Error: [Errno 2] No such file or directory
  Card1.Amixer.values: Error: [Errno 2] No such file or directory
  Card2.Amixer.info: Error: [Errno 2] No such file or directory
  Card2.Amixer.values: Error: [Errno 2] No such file or directory
  CurrentDmesg:
   Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
   dmesg: write failed: Broken pipe
  DistroRelease: Ubuntu 12.04
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: Dell Inc. Studio XPS 8100
  Package: linux (not installed)
  ProcEnviron:
   SHELL=/bin/bash
   TERM=xterm
   PATH=(custom, no user)
   LANG=en_GB.UTF-8
  ProcFB: 0 nouveaufb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-29-generic root=UUID=6dbbcb79-2948-4430-8d25-5479f8831106 ro
  ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  precise
  Uname: Linux 3.2.0-29-generic x86_64
  UpgradeStatus: Upgraded to precise on 2012-06-15 (82 days ago)
  UserGroups: dialout video
  WifiSyslog:
   
  dmi.bios.date: 12/09/2009
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: A03
  dmi.board.name: 0T568R
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A00
  dmi.chassis.type: 3
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvrA03:bd12/09/2009:svnDellInc.:pnStudioXPS8100:pvr:rvnDellInc.:rn0T568R:rvrA00:cvnDellInc.:ct3:cvr:
  dmi.product.name: Studio XPS 8100
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1046285/+subscriptions