kernel-packages team mailing list archive

[Bug 585657] Re: Transfering large files to nfs mount causes system freeze

 

I see the same problem on Ubuntu Trusty ("Ubuntu 14.04.1 LTS") with kernel 3.13.0-32-generic.
Both the NFS client and the NFS server run this Ubuntu release and kernel version. The hang occurs while transferring 500 GB files from an mdadm RAID 5 array to an NFS mount using lbzip2:

[48693.533918] INFO: task lbzip2:14344 blocked for more than 120 seconds.
[48693.536784]       Not tainted 3.13.0-32-generic #57-Ubuntu
[48693.539750] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[48693.542764] lbzip2          D ffff88011fc94440     0 14344  14341 0x00000000
[48693.542773]  ffff8800d235bb38 0000000000000002 ffff8801194497f0 ffff8800d235bfd8
[48693.542782]  0000000000014440 0000000000014440 ffff8801194497f0 ffff88011fc94cd8
[48693.542789]  ffff88011ffd3f28 0000000000000002 ffffffffa0219fe0 ffff8800d235bbb0
[48693.542795] Call Trace:
[48693.542838]  [<ffffffffa0219fe0>] ? nfs_free_request+0xb0/0xb0 [nfs]
[48693.542851]  [<ffffffff817203fd>] io_schedule+0x9d/0x140
[48693.542877]  [<ffffffffa0219fee>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[48693.542884]  [<ffffffff81720882>] __wait_on_bit+0x62/0x90
[48693.542908]  [<ffffffffa0219fe0>] ? nfs_free_request+0xb0/0xb0 [nfs]
[48693.542917]  [<ffffffff81720927>] out_of_line_wait_on_bit+0x77/0x90
[48693.542926]  [<ffffffff810aaf40>] ? autoremove_wake_function+0x40/0x40
[48693.542948]  [<ffffffffa021a383>] nfs_wait_on_request+0x33/0x40 [nfs]
[48693.542971]  [<ffffffffa021f2d0>] nfs_updatepage+0x150/0x650 [nfs]
[48693.542991]  [<ffffffffa021096b>] nfs_write_end+0x5b/0x340 [nfs]
[48693.543000]  [<ffffffff8114e616>] generic_file_buffered_write+0x156/0x250
[48693.543009]  [<ffffffff8114fc81>] __generic_file_aio_write+0x1c1/0x3d0
[48693.543016]  [<ffffffff8114fee8>] generic_file_aio_write+0x58/0xa0
[48693.543036]  [<ffffffffa020fbdb>] nfs_file_write+0xbb/0x1d0 [nfs]
[48693.543043]  [<ffffffff811bc3da>] do_sync_write+0x5a/0x90
[48693.543050]  [<ffffffff811bcb64>] vfs_write+0xb4/0x1f0
[48693.543056]  [<ffffffff811bd599>] SyS_write+0x49/0xa0
[48693.543063]  [<ffffffff8172c87f>] tracesys+0xe1/0xe6
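
For triage, the hung-task reports can be pulled out of a saved dmesg or syslog capture mechanically. The following is a minimal sketch, not part of the original report; the function name `find_hung_tasks` is hypothetical, and the pattern assumes the message wording shown in the log above.

```python
import re

# Matches the kernel's hung-task report line, for example:
# "[48693.533918] INFO: task lbzip2:14344 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (command, pid, seconds) for each hung-task report."""
    results = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            results.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return results


if __name__ == "__main__":
    sample = (
        "[48693.533918] INFO: task lbzip2:14344 blocked for more than 120 seconds.\n"
        "[48693.536784]       Not tainted 3.13.0-32-generic #57-Ubuntu\n"
    )
    for comm, pid, secs in find_hung_tasks(sample):
        print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

The same pattern also matches syslog-prefixed lines, because it searches anywhere in the line rather than anchoring at the start.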

The NFS client is a Dell T605 server; the NFS server is an HP ProLiant ML150 G2.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/585657

Title:
  Transfering large files to nfs mount causes system freeze

Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Lucid:
  Fix Released
Status in “linux” source package in Maverick:
  Fix Released
Status in “linux” source package in Natty:
  Fix Released
Status in “linux” source package in Hardy:
  Fix Released

Bug description:
  Binary package hint: nfs-kernel-server

  I have verified this bug on both Karmic and Lucid, on both the server
  and the client:

  -------------------------------------------------------------------------------

  Description:	Ubuntu 9.10
  Release:	9.10

  nfs-common:
    Installed: 1:1.2.0-2ubuntu8

  nfs-kernel-server:
    Installed: 1:1.2.0-2ubuntu8

  portmap:
    Installed: 6.0-10ubuntu2

  -------------------------------------------------------------------------------

  Description:	Ubuntu 10.04 LTS
  Release:	10.04

  nfs-common:
    Installed: 1:1.2.0-4ubuntu4

  nfs-kernel-server:
    Installed: 1:1.2.0-4ubuntu4

  portmap:
    Installed: 6.0.0-1ubuntu2

  -------------------------------------------------------------------------------

  Expected behavior:

  Copying large files from local directories to an nfs mounted directory
  should complete without error.

  -------------------------------------------------------------------------------

  Actual behavior:

  The system freezes while copying large files from a local
  directory (e.g. /tmp) to an nfs mounted directory. Various
  things then stop responding, and the machine ultimately needs a hard
  reboot, with potential loss of data. When this occurs I am still able to
  log into the box via ssh, but even with sudo I cannot kill -9 the
  wayward file copy or reboot the machine gracefully.

  -------------------------------------------------------------------------------

  Details:

  The server exports several directories, for example:

  /home/shared
  /home/user1/Documents
  /home/user1/Development

  The client mounts these as follows:

  server1:/home/shared    /home/shared    nfs rw,soft,intr 0 0
  server1:/home/user1/Development /home/server1/user1/Development nfs rw,soft,intr 0 0
  server1:/home/user1/Documents   /home/server1/user1/Documents   nfs rw,soft,intr 0 0
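
The client entries above follow the six-field fstab layout (device, mount point, type, options, dump, pass). As a minimal sketch for anyone comparing mount options across affected machines, the lines can be split into those fields as below. The names `FstabEntry` and `parse_fstab_line` are hypothetical, and the sketch only reads the options; it makes no claim about which option, if any, is involved in the freeze.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str        # e.g. "server1:/home/shared"
    mount_point: str   # local directory the export is mounted on
    fs_type: str       # "nfs" in the entries above
    options: List[str] # comma-separated options, split into a list
    dump: int
    fsck_pass: int


def parse_fstab_line(line):
    """Parse one whitespace-separated, six-field fstab line."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    device, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(
        device, mount_point, fs_type, options.split(","), int(dump), int(fsck_pass)
    )


if __name__ == "__main__":
    entry = parse_fstab_line(
        "server1:/home/shared    /home/shared    nfs rw,soft,intr 0 0"
    )
    print(entry.device, entry.mount_point, entry.options)
```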

  I see lots of messages like this in /var/log/syslog:

  May 22 10:44:31 client1 kernel: [ 1680.390484] INFO: task cp:2791 blocked for more than 120 seconds.
  May 22 10:44:31 client1 kernel: [ 1680.390488] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  May 22 10:44:31 client1 kernel: [ 1680.390492] cp D 00000000ffffffff 0 2791 2503 0x00000000
  May 22 10:44:31 client1 kernel: [ 1680.390501] ffff88012a457c48 0000000000000082 0000000000015bc0 0000000000015bc0
  May 22 10:44:31 client1 kernel: [ 1680.390508] ffff8801291331a0 ffff88012a457fd8 0000000000015bc0 ffff880129132de0
  May 22 10:44:31 client1 kernel: [ 1680.390516] 0000000000015bc0 ffff88012a457fd8 0000000000015bc0 ffff8801291331a0
  May 22 10:44:31 client1 kernel: [ 1680.390523] Call Trace:
  May 22 10:44:31 client1 kernel: [ 1680.390545] [<ffffffffa0cff2b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390552] [<ffffffff8153eb87>] io_schedule+0x47/0x70
  May 22 10:44:31 client1 kernel: [ 1680.390573] [<ffffffffa0cff2be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390579] [<ffffffff8153f3df>] __wait_on_bit+0x5f/0x90
  May 22 10:44:31 client1 kernel: [ 1680.390587] [<ffffffff812b6234>] ? __lookup_tag+0x64/0x120
  May 22 10:44:31 client1 kernel: [ 1680.390608] [<ffffffffa0cff2b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390615] [<ffffffff8153f488>] out_of_line_wait_on_bit+0x78/0x90
  May 22 10:44:31 client1 kernel: [ 1680.390622] [<ffffffff81085360>] ? wake_bit_function+0x0/0x40
  May 22 10:44:31 client1 kernel: [ 1680.390643] [<ffffffffa0cff29f>] nfs_wait_on_request+0x2f/0x40 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390665] [<ffffffffa0d036af>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390688] [<ffffffffa0d04aee>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390711] [<ffffffffa0d04ed9>] nfs_write_mapping+0x79/0xb0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390733] [<ffffffffa0d04f47>] nfs_wb_all+0x17/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390751] [<ffffffffa0cf3eba>] nfs_do_fsync+0x2a/0x60 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390770] [<ffffffffa0cf4105>] nfs_file_flush+0x75/0xa0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390777] [<ffffffff8114051c>] filp_close+0x3c/0x90
  May 22 10:44:31 client1 kernel: [ 1680.390783] [<ffffffff81140627>] sys_close+0xb7/0x120
  May 22 10:44:31 client1 kernel: [ 1680.390790] [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
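
To compare this trace with the one from the 3.13 kernel quoted at the top of this message, the function names can be extracted from each call trace and intersected. This is a minimal sketch with hypothetical helper names (`trace_functions`, `common_frames`); it assumes the `[<address>] function+0xOFF/0xSIZE` frame format seen in both logs and skips frames marked with `?`, which the kernel prints for entries it could not verify as part of the call chain.

```python
import re

# One stack frame: "[<ffffffff817203fd>] io_schedule+0x9d/0x140"
# An optional "? " before the name marks an unverified frame.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_functions(log_text, include_unreliable=False):
    """Return function names from call-trace lines, in log order."""
    funcs = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue
        if match.group("unreliable") and not include_unreliable:
            continue
        funcs.append(match.group("func"))
    return funcs


def common_frames(trace_a, trace_b):
    """Return the functions of trace_a that also appear in trace_b."""
    seen = set(trace_b)
    return [func for func in trace_a if func in seen]


if __name__ == "__main__":
    trusty = """\
[48693.542838]  [<ffffffffa0219fe0>] ? nfs_free_request+0xb0/0xb0 [nfs]
[48693.542851]  [<ffffffff817203fd>] io_schedule+0x9d/0x140
[48693.542877]  [<ffffffffa0219fee>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[48693.542948]  [<ffffffffa021a383>] nfs_wait_on_request+0x33/0x40 [nfs]
"""
    lucid = """\
kernel: [ 1680.390552] [<ffffffff8153eb87>] io_schedule+0x47/0x70
kernel: [ 1680.390573] [<ffffffffa0cff2be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
kernel: [ 1680.390643] [<ffffffffa0cff29f>] nfs_wait_on_request+0x2f/0x40 [nfs]
kernel: [ 1680.390777] [<ffffffff8114051c>] filp_close+0x3c/0x90
"""
    print(common_frames(trace_functions(trusty), trace_functions(lucid)))
```

Run against the full logs, this gives a quick way to see which frames the two reports share, such as `nfs_wait_on_request` and `io_schedule`.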
