kernel-packages team mailing list archive

[Bug 585657] Re: Transfering large files to nfs mount causes system freeze

 

I see the same lockups when copying or rsyncing 200 MB files TO a
FreeNAS NFS server; reading from it causes no problems.

Gigabyte Brix using an ASIX ax88179_178a Gigabit USB3 NIC. I tried a
second NIC with the same result.

Kernel: 3.17.4-031704-generic (mainline)

Xubuntu Trusty 14.04.1

ii  nfs-kernel-server                           1:1.2.8-6ubuntu1.1         
ii  nfs-common                                  1:1.2.8-6ubuntu1.1                

Jan 14 14:08:44 brix kernel: [  360.548908] INFO: task cp:5165 blocked for more than 120 seconds.
Jan 14 14:08:44 brix kernel: [  360.548912]       Tainted: G           OE  3.17.4-031704-generic #201411211317
Jan 14 14:08:44 brix kernel: [  360.548913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 14 14:08:44 brix kernel: [  360.548914] cp              D 0000000000000007     0  5165      1 0x00000004
Jan 14 14:08:44 brix kernel: [  360.548917]  ffff8801eb9b7c88 0000000000000086 ffff8801eb9b7c28 ffffffff8101e5c9
Jan 14 14:08:44 brix kernel: [  360.548919]  ffff8801eb9b7fd8 00000000000145c0 ffff8800bbeee200 00000000000145c0
Jan 14 14:08:44 brix kernel: [  360.548920]  ffff880213235000 ffff88003643e400 ffff8801eb9b7c88 ffff88021ebd4ec0
Jan 14 14:08:44 brix kernel: [  360.548922] Call Trace:
Jan 14 14:08:44 brix kernel: [  360.548928]  [<ffffffff8101e5c9>] ? read_tsc+0x9/0x10
Jan 14 14:08:44 brix kernel: [  360.548932]  [<ffffffff817a2970>] ? bit_wait+0x50/0x50
Jan 14 14:08:44 brix kernel: [  360.548933]  [<ffffffff817a20c9>] schedule+0x29/0x70
Jan 14 14:08:44 brix kernel: [  360.548935]  [<ffffffff817a219f>] io_schedule+0x8f/0xd0
Jan 14 14:08:44 brix kernel: [  360.548937]  [<ffffffff817a299b>] bit_wait_io+0x2b/0x50
Jan 14 14:08:44 brix kernel: [  360.548939]  [<ffffffff817a2865>] __wait_on_bit+0x65/0x90
Jan 14 14:08:44 brix kernel: [  360.548942]  [<ffffffff811731eb>] ? find_get_pages_tag+0xcb/0x170
Jan 14 14:08:44 brix kernel: [  360.548944]  [<ffffffff81172637>] wait_on_page_bit+0xc7/0xd0
Jan 14 14:08:44 brix kernel: [  360.548947]  [<ffffffff810b3fd0>] ? wake_atomic_t_function+0x40/0x40
Jan 14 14:08:44 brix kernel: [  360.548949]  [<ffffffff81172804>] filemap_fdatawait_range+0xf4/0x180
Jan 14 14:08:44 brix kernel: [  360.548951]  [<ffffffff811747fd>] filemap_write_and_wait_range+0x4d/0x80
Jan 14 14:08:44 brix kernel: [  360.548969]  [<ffffffffc01f9223>] nfs_file_fsync+0x53/0x150 [nfs]
Jan 14 14:08:44 brix kernel: [  360.548974]  [<ffffffff81219899>] vfs_fsync+0x29/0x40
Jan 14 14:08:44 brix kernel: [  360.548980]  [<ffffffffc01f9cfa>] nfs_file_flush+0x8a/0xd0 [nfs]
Jan 14 14:08:44 brix kernel: [  360.548982]  [<ffffffff811e743a>] filp_close+0x3a/0x90
Jan 14 14:08:44 brix kernel: [  360.548984]  [<ffffffff8120709f>] __close_fd+0x8f/0xc0
Jan 14 14:08:44 brix kernel: [  360.548986]  [<ffffffff811e8cd3>] SyS_close+0x23/0x50
Jan 14 14:08:44 brix kernel: [  360.548988]  [<ffffffff817a656d>] system_call_fastpath+0x1a/0x1f
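
For anyone comparing traces like the one above across kernels, here is a
minimal Python sketch that pulls the function names out of a hung-task
call trace. It skips frames marked with "?", which the kernel prints when
it is not sure the address is really part of the call chain. The names
FRAME_RE, reliable_frames and SAMPLE are made up for this illustration,
and the regular expression is only an assumption based on the frame
format seen in this report.

```python
import re

# One stack frame looks like:
#   [<ffffffff817a20c9>] schedule+0x29/0x70
#   [<ffffffff8101e5c9>] ? read_tsc+0x9/0x10
#   [<ffffffffc01f9223>] nfs_file_fsync+0x53/0x150 [nfs]
# (format assumed from the traces in this report)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(log_text):
    """Return (function, module) pairs for frames not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group("unsure"):
            frames.append((match.group("func"), match.group("module")))
    return frames


# A few lines copied from the trace above.
SAMPLE = """\
[  360.548928]  [<ffffffff8101e5c9>] ? read_tsc+0x9/0x10
[  360.548933]  [<ffffffff817a20c9>] schedule+0x29/0x70
[  360.548969]  [<ffffffffc01f9223>] nfs_file_fsync+0x53/0x150 [nfs]
[  360.548986]  [<ffffffff811e8cd3>] SyS_close+0x23/0x50
"""

if __name__ == "__main__":
    # Frames are listed innermost first, in the same order as the log.
    for func, module in reliable_frames(SAMPLE):
        print(func, "[%s]" % module if module else "")
```

Run against the full trace, this reduces it to the path from SyS_close
through nfs_file_flush and nfs_file_fsync down to wait_on_page_bit, which
makes it easier to see that cp is stuck waiting for writeback at close().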

After reading: http://art.ubuntuforums.org/showthread.php?t=1478413 this
is REALLY embarrassing.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/585657

Title:
  Transfering large files to nfs mount causes system freeze

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Lucid:
  Fix Released
Status in linux source package in Maverick:
  Fix Released
Status in linux source package in Natty:
  Fix Released
Status in linux source package in Hardy:
  Fix Released

Bug description:
  Binary package hint: nfs-kernel-server

  I have verified this bug on both Karmic and Lucid, on both the server
  and the client:

  -------------------------------------------------------------------------------

  Description:	Ubuntu 9.10
  Release:	9.10

  nfs-common:
    Installed: 1:1.2.0-2ubuntu8

  nfs-kernel-server:
    Installed: 1:1.2.0-2ubuntu8

  portmap:
    Installed: 6.0-10ubuntu2

  -------------------------------------------------------------------------------

  Description:	Ubuntu 10.04 LTS
  Release:	10.04

  nfs-common:
    Installed: 1:1.2.0-4ubuntu4

  nfs-kernel-server:
    Installed: 1:1.2.0-4ubuntu4

  portmap:
    Installed: 6.0.0-1ubuntu2

  -------------------------------------------------------------------------------

  Expected behavior:

  Copying large files from local directories to an nfs mounted directory
  should complete without error.

  -------------------------------------------------------------------------------

  Actual behavior:

  The system freezes while trying to copy large files from a local
  directory (e.g. /tmp) to an nfs mounted directory. This causes various
  things to fail to respond, ultimately resulting in a hard reboot and
  potential loss of data. When this occurs I am able to log into the box
  via ssh, but even sudo is unable to kill -9 the wayward file copy or
  reboot the machine gracefully.

  -------------------------------------------------------------------------------

  Details:

  The server exports several directories, for example:

  /home/shared
  /home/user1/Documents
  /home/user1/Development

  The client mounts these as follows:

  server1:/home/shared    /home/shared    nfs rw,soft,intr 0 0
  server1:/home/user1/Development /home/server1/user1/Development nfs rw,soft,intr 0 0
  server1:/home/user1/Documents   /home/server1/user1/Documents   nfs rw,soft,intr 0 0

  I see lots of messages like this in /var/log/syslog:

  May 22 10:44:31 client1 kernel: [ 1680.390484] INFO: task cp:2791 blocked for more than 120 seconds.
  May 22 10:44:31 client1 kernel: [ 1680.390488] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  May 22 10:44:31 client1 kernel: [ 1680.390492] cp D 00000000ffffffff 0 2791 2503 0x00000000
  May 22 10:44:31 client1 kernel: [ 1680.390501] ffff88012a457c48 0000000000000082 0000000000015bc0 0000000000015bc0
  May 22 10:44:31 client1 kernel: [ 1680.390508] ffff8801291331a0 ffff88012a457fd8 0000000000015bc0 ffff880129132de0
  May 22 10:44:31 client1 kernel: [ 1680.390516] 0000000000015bc0 ffff88012a457fd8 0000000000015bc0 ffff8801291331a0
  May 22 10:44:31 client1 kernel: [ 1680.390523] Call Trace:
  May 22 10:44:31 client1 kernel: [ 1680.390545] [<ffffffffa0cff2b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390552] [<ffffffff8153eb87>] io_schedule+0x47/0x70
  May 22 10:44:31 client1 kernel: [ 1680.390573] [<ffffffffa0cff2be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390579] [<ffffffff8153f3df>] __wait_on_bit+0x5f/0x90
  May 22 10:44:31 client1 kernel: [ 1680.390587] [<ffffffff812b6234>] ? __lookup_tag+0x64/0x120
  May 22 10:44:31 client1 kernel: [ 1680.390608] [<ffffffffa0cff2b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390615] [<ffffffff8153f488>] out_of_line_wait_on_bit+0x78/0x90
  May 22 10:44:31 client1 kernel: [ 1680.390622] [<ffffffff81085360>] ? wake_bit_function+0x0/0x40
  May 22 10:44:31 client1 kernel: [ 1680.390643] [<ffffffffa0cff29f>] nfs_wait_on_request+0x2f/0x40 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390665] [<ffffffffa0d036af>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390688] [<ffffffffa0d04aee>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390711] [<ffffffffa0d04ed9>] nfs_write_mapping+0x79/0xb0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390733] [<ffffffffa0d04f47>] nfs_wb_all+0x17/0x20 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390751] [<ffffffffa0cf3eba>] nfs_do_fsync+0x2a/0x60 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390770] [<ffffffffa0cf4105>] nfs_file_flush+0x75/0xa0 [nfs]
  May 22 10:44:31 client1 kernel: [ 1680.390777] [<ffffffff8114051c>] filp_close+0x3c/0x90
  May 22 10:44:31 client1 kernel: [ 1680.390783] [<ffffffff81140627>] sys_close+0xb7/0x120
  May 22 10:44:31 client1 kernel: [ 1680.390790] [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
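
The call trace above ends in sys_close -> filp_close -> nfs_file_flush ->
nfs_do_fsync, which suggests cp is blocked while close() flushes dirty
pages to the server. A rough Python sketch for checking where the time
goes is shown here. It is not part of the original report: the function
name timed_copy_phases, the 200 MB default and the 1 MB chunk size are
arbitrary choices for illustration. Be aware that on an affected NFS
mount the process may hang in uninterruptible sleep exactly as described
in this bug, so only run it where a forced reboot is acceptable.

```python
import os
import time


def timed_copy_phases(dest_path, size_mb=200, chunk_mb=1):
    """Write size_mb of zeros to dest_path.

    Returns (write_seconds, flush_seconds): time spent in write() calls,
    and time spent in fsync() plus close().
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)

    start = time.monotonic()
    f = open(dest_path, "wb")
    for _ in range(size_mb // chunk_mb):
        f.write(chunk)
    f.flush()  # push Python's own buffer to the kernel
    written = time.monotonic()

    # On NFS, this is where dirty pages have to reach the server.
    os.fsync(f.fileno())
    f.close()
    done = time.monotonic()

    return written - start, done - written


if __name__ == "__main__":
    import sys

    # Usage: python timed_copy.py /path/on/nfs/mount/testfile
    write_s, flush_s = timed_copy_phases(sys.argv[1])
    print("write: %.1fs  fsync+close: %.1fs" % (write_s, flush_s))
```

If the write phase returns quickly and the fsync+close phase stalls, that
would be consistent with the trace; comparing the same run against a
local directory such as /tmp gives a baseline.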
