kernel-packages team mailing list archive

[Bug 561210] Re: Writing big files to NFS target causes system lock up

 

I am seeing the same problem on:
Client: Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-69-generic-pae i686)
Server: Ubuntu 12.04.5 LTS (GNU/Linux 3.2.0-77-generic x86_64)

This is frighteningly bad.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/561210

Title:
  Writing big files to NFS target causes system lock up

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Lucid:
  Fix Released
Status in linux source package in Maverick:
  Fix Released
Status in linux source package in Natty:
  Fix Released

Bug description:
  I'm experiencing complete system lock ups occasionally when writing
  big files to an NFS target.

  I don't recall having such issues with Jaunty, but at least
  Karmic and Lucid are affected in exactly the same way.

  In most cases just the "mv" or "cp" process hangs for a while, but
  sometimes the whole system freezes, including X.

  Although some things keep running (e.g. the clock on my G15 Keyboard
  LCD display doesn't freeze, so the g15daemon keeps running as usual),
  I'm unable to get the system back to a normal state. Even waiting a
  whole night for the NFS task to finish doesn't help. In such a case
  the NFS server side is completely idle, not receiving any data from
  the client. I have to hard reset the client machine to get a working
  system again.

  The server is a Busybox NAS system with the following export options:
  rw,no_wdelay,no_root_squash,insecure_locks,no_subtree_check

  The clients are mounting with these options:
  rw,rsize=32768,wsize=32768,hard,intr,noatime
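
To confirm which options a client actually ended up with (the kernel may negotiate values different from the ones requested), the mount table can be inspected. The following is only a rough sketch, not part of the original report: it assumes the standard /proc/mounts layout (device, mount point, filesystem type, options), and the function name parse_nfs_mounts is made up for illustration.

```python
from typing import Dict, List, Tuple


def parse_nfs_mounts(mounts_text: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return (mount_point, options) for each NFS entry in /proc/mounts-style text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device mount_point fstype options dump pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            options[key] = value  # flag options such as "hard" map to ""
        result.append((fields[1], options))
    return result


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            text = f.read()
    except OSError:
        text = ""  # not a Linux system, or /proc is unavailable
    for mount_point, opts in parse_nfs_mounts(text):
        print(mount_point,
              "rsize=" + opts.get("rsize", "?"),
              "wsize=" + opts.get("wsize", "?"))
```

Run on a client, this prints one line per NFS mount with the rsize and wsize values currently in effect.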

  All machines are connected through a LevelOne GSW-0803T 8-port GBit
  switch with Cat6e cables. There is no wireless involved and there are
  no other networking issues. The client machines are both AMD64 AMD
  Quad-Cores with GigaByte mainboards, using the internal GBit Ethernet
  connector for networking.

  Some time ago I also used a Dell Inspiron notebook (32-bit Pentium-M
  system) with a 100MBit connection on this network without such
  issues.

  The following gets logged during one of those lock-ups:

  INFO: task mv:26028 blocked for more than 120 seconds.
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  mv            D 00000000ffffffff     0 26028   2040 0x00000000
   ffff880100198f08 0000000000000086 0000000000015b80 0000000000015b80
   ffff880001dc83c0 ffff880100199fd8 0000000000015b80 ffff880001dc8000
   0000000000015b80 ffff880100199fd8 0000000000015b80 ffff880001dc83c0
  Call Trace:
   [<ffffffffa0418280>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
   [<ffffffff8153e697>] io_schedule+0x47/0x70
   [<ffffffffa041828e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
   [<ffffffff8153eeef>] __wait_on_bit+0x5f/0x90
   [<ffffffff81013cae>] ? apic_timer_interrupt+0xe/0x20
   [<ffffffffa0418280>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
   [<ffffffff8153ef98>] out_of_line_wait_on_bit+0x78/0x90
   [<ffffffff81085340>] ? wake_bit_function+0x0/0x40
   [<ffffffffa041826f>] nfs_wait_on_request+0x2f/0x40 [nfs]
   [<ffffffffa041c66f>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
   [<ffffffffa041daae>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
   [<ffffffffa041dc31>] nfs_wb_page+0x81/0xe0 [nfs]
   [<ffffffffa040cb17>] nfs_release_page+0x57/0x70 [nfs]
   [<ffffffff810f2a52>] try_to_release_page+0x32/0x50
   [<ffffffff811016a3>] shrink_page_list+0x453/0x5f0
   [<ffffffff81101b4d>] shrink_inactive_list+0x30d/0x7e0
   [<ffffffff810fbcda>] ? determine_dirtyable_memory+0x1a/0x30
   [<ffffffff810fbd87>] ? get_dirty_limits+0x27/0x2f0
   [<ffffffff811020b1>] shrink_list+0x91/0xf0
   [<ffffffff811022a7>] shrink_zone+0x197/0x240
   [<ffffffff811023c2>] shrink_zones+0x72/0x100
   [<ffffffff811024ce>] do_try_to_free_pages+0x7e/0x330
   [<ffffffff8110287f>] try_to_free_pages+0x6f/0x80
   [<ffffffff811003c0>] ? isolate_pages_global+0x0/0x50
   [<ffffffff810f992a>] __alloc_pages_slowpath+0x27a/0x580
   [<ffffffff810f9d8e>] __alloc_pages_nodemask+0x15e/0x1a0
   [<ffffffff8112cc07>] alloc_pages_current+0x87/0xd0
   [<ffffffff81132728>] new_slab+0x248/0x310
   [<ffffffff81134fb9>] __slab_alloc+0x169/0x2d0
   [<ffffffff810f5ab5>] ? mempool_alloc_slab+0x15/0x20
   [<ffffffff811354e4>] kmem_cache_alloc+0xe4/0x150
   [<ffffffff810f5ab5>] mempool_alloc_slab+0x15/0x20
   [<ffffffff810f5c53>] mempool_alloc+0x63/0x140
   [<ffffffff81085300>] ? autoremove_wake_function+0x0/0x40
   [<ffffffffa041ce70>] nfs_writedata_alloc+0x20/0xc0 [nfs]
   [<ffffffffa041cf32>] nfs_flush_one+0x22/0xf0 [nfs]
   [<ffffffffa0418097>] nfs_pageio_doio+0x37/0x80 [nfs]
   [<ffffffffa0418134>] nfs_pageio_add_request+0x54/0x100 [nfs]
   [<ffffffffa041c47d>] nfs_page_async_flush+0x9d/0xf0 [nfs]
   [<ffffffffa041c557>] nfs_do_writepage+0x87/0x90 [nfs]
   [<ffffffffa041cc9e>] nfs_writepages_callback+0x1e/0x40 [nfs]
   [<ffffffff810fcac7>] write_cache_pages+0x227/0x4d0
   [<ffffffffa041cc80>] ? nfs_writepages_callback+0x0/0x40 [nfs]
   [<ffffffffa041cc09>] nfs_writepages+0xb9/0x130 [nfs]
   [<ffffffffa041cf10>] ? nfs_flush_one+0x0/0xf0 [nfs]
   [<ffffffff810fcdc1>] do_writepages+0x21/0x40
   [<ffffffff810f40ab>] __filemap_fdatawrite_range+0x5b/0x60
   [<ffffffff810f43df>] filemap_fdatawrite+0x1f/0x30
   [<ffffffff810f4425>] filemap_write_and_wait+0x35/0x50
   [<ffffffffa041055b>] nfs_setattr+0x15b/0x180 [nfs]
   [<ffffffff810f4c26>] ? generic_file_aio_read+0xb6/0x1d0
   [<ffffffff81013b0e>] ? common_interrupt+0xe/0x13
   [<ffffffff810f36de>] ? find_get_page+0x1e/0xa0
   [<ffffffff810f50c9>] ? filemap_fault+0xb9/0x460
   [<ffffffff8106c4a7>] ? current_fs_time+0x27/0x30
   [<ffffffff8115b7db>] notify_change+0x16b/0x350
   [<ffffffff8116a16c>] utimes_common+0xdc/0x1b0
   [<ffffffff812b6eba>] ? __up_read+0x9a/0xc0
   [<ffffffff8116a2e1>] do_utimes+0xa1/0xf0
   [<ffffffff81543378>] ? do_page_fault+0x158/0x3b0
   [<ffffffff8116a442>] sys_utimensat+0x32/0x90
   [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
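
A trace of this length is easier to read once the function names are pulled out and the entries marked "?" (which the kernel prints for addresses found on the stack that it could not confirm as part of the call chain) are set aside. The sketch below is an illustration only, not a tool used in the original report; the names Frame and parse_call_trace are hypothetical, and the regular expression assumes the line format shown in the trace above.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    function: str
    module: str      # "" for symbols in the core kernel
    reliable: bool   # False when the line carries the "?" marker


# Matches e.g. "[<ffffffffa041dc31>] nfs_wb_page+0x81/0xe0 [nfs]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_call_trace(text: str) -> List[Frame]:
    """Extract stack frames from kernel call-trace text, in printed order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(Frame(m.group(2), m.group(3) or "", m.group(1) is None))
    return frames


def reliable_chain(text: str) -> List[str]:
    """Function names of the frames without the "?" marker, innermost first."""
    return [f.function for f in parse_call_trace(text) if f.reliable]
```

Applied to the trace above, reliable_chain() makes the nesting easier to follow: the write-back path (nfs_writepages, nfs_flush_one) tries to allocate memory, the allocation falls into page reclaim (try_to_free_pages, shrink_page_list), and reclaim in turn calls nfs_release_page and ends up waiting in nfs_wait_on_request.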

  
  Other tasks raising such messages during the same session:

  [103440.890116] INFO: task kswapd0:52 blocked for more than 120 seconds.
  [103440.890576] INFO: task Xorg:1300 blocked for more than 120 seconds.
  [103440.891199] INFO: task plasma-desktop:1987 blocked for more than 120 seconds.
  [103440.892078] INFO: task mv:26028 blocked for more than 120 seconds.
  [103560.890060] INFO: task kswapd0:52 blocked for more than 120 seconds.
  [103560.890516] INFO: task Xorg:1300 blocked for more than 120 seconds.
  [103560.891139] INFO: task plasma-desktop:1987 blocked for more than 120 seconds.
  [103560.892002] INFO: task mv:26028 blocked for more than 120 seconds.
  [103680.890072] INFO: task kswapd0:52 blocked for more than 120 seconds.
  [103680.890535] INFO: task Xorg:1300 blocked for more than 120 seconds.
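
To see at a glance which tasks are stuck and how often they are reported, the hung-task lines can be grouped per task. This is a minimal sketch under the assumption that the log lines carry a bracketed timestamp as in the excerpt above; summarize_hung_tasks is a hypothetical helper name, not something from the original report.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches e.g. "[103440.890116] INFO: task kswapd0:52 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+INFO: task (.+):(\d+) blocked for more than \d+ seconds"
)


def summarize_hung_tasks(log_text: str) -> Dict[str, List[float]]:
    """Map "name:pid" to the timestamps at which that task was reported blocked."""
    seen = defaultdict(list)
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            seen[f"{m.group(2)}:{m.group(3)}"].append(float(m.group(1)))
    return dict(seen)


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for task, stamps in summarize_hung_tasks(sys.stdin.read()).items():
        print(f"{task}: reported {len(stamps)} time(s), first at {stamps[0]}")
```

Fed the excerpt above, it shows the same four tasks (kswapd0, Xorg, plasma-desktop and mv) being reported again in each round, roughly 120 seconds apart.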
