kernel-packages team mailing list archive
Message #108444
[Bug 561210] Re: Writing big files to NFS target causes system lock up
I'm seeing the same problem on:
Client: Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-69-generic-pae i686)
Server: Ubuntu 12.04.5 LTS (GNU/Linux 3.2.0-77-generic x86_64)
This is frighteningly bad.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/561210
Title:
Writing big files to NFS target causes system lock up
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Lucid:
Fix Released
Status in linux source package in Maverick:
Fix Released
Status in linux source package in Natty:
Fix Released
Bug description:
I occasionally experience complete system lock-ups when writing
big files to an NFS target.
I don't recall having such issues with Jaunty, but at least
Karmic and Lucid are affected in exactly the same way.
In most cases just the "mv" or "cp" process hangs for a while, but
sometimes the whole system freezes, including X.
Although some things keep running (e.g. the clock on my G15 keyboard
LCD doesn't freeze, so the g15daemon keeps running as usual),
I'm unable to get the system back to a normal state. Even waiting a
whole night for the NFS task to finish doesn't help. In such a case
the NFS server side is completely idle, not receiving any data from
the client. I have to hard reset the client machine to get a working
system again.
The server is a Busybox NAS system with the following export options:
rw,no_wdelay,no_root_squash,insecure_locks,no_subtree_check
The clients are mounting with these options:
rw,rsize=32768,wsize=32768,hard,intr,noatime
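For readers less familiar with these options, here is a minimal sketch that splits the client mount option string above into key/value pairs. The helper name parse_mount_options is made up for illustration and is not part of any NFS tooling.

```python
def parse_mount_options(option_string):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "hard") map to True; purely numeric
    values (e.g. "wsize=32768") are converted to int.
    """
    options = {}
    for item in option_string.split(","):
        key, sep, value = item.strip().partition("=")
        if not key:
            continue  # skip empty entries such as a trailing comma
        if not sep:
            options[key] = True
        elif value.isdigit():
            options[key] = int(value)
        else:
            options[key] = value
    return options


# Client options exactly as quoted in the report.
client_opts = parse_mount_options("rw,rsize=32768,wsize=32768,hard,intr,noatime")
print("write size in KiB:", client_opts["wsize"] // 1024)
print("hard mount:", client_opts.get("hard", False))
```

The "hard" flag is relevant to the symptoms described: on a hard mount the client keeps retrying NFS requests indefinitely instead of returning an error to the application, so a stalled write shows up as a blocked process rather than a failed copy.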
All machines are connected through a LevelOne GSW-0803T 8-port GBit
switch with Cat6e cables. There is no wireless involved and there are
no other networking issues. The client machines are both AMD64
quad-core AMD systems with Gigabyte mainboards, using the onboard
GBit Ethernet port for networking.
Some time ago I also used a Dell Inspiron notebook (32-bit Pentium-M
system) with a 100 MBit connection on this network without such
issues.
The following gets logged during one of those lock-ups:
INFO: task mv:26028 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mv D 00000000ffffffff 0 26028 2040 0x00000000
ffff880100198f08 0000000000000086 0000000000015b80 0000000000015b80
ffff880001dc83c0 ffff880100199fd8 0000000000015b80 ffff880001dc8000
0000000000015b80 ffff880100199fd8 0000000000015b80 ffff880001dc83c0
Call Trace:
[<ffffffffa0418280>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[<ffffffff8153e697>] io_schedule+0x47/0x70
[<ffffffffa041828e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[<ffffffff8153eeef>] __wait_on_bit+0x5f/0x90
[<ffffffff81013cae>] ? apic_timer_interrupt+0xe/0x20
[<ffffffffa0418280>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[<ffffffff8153ef98>] out_of_line_wait_on_bit+0x78/0x90
[<ffffffff81085340>] ? wake_bit_function+0x0/0x40
[<ffffffffa041826f>] nfs_wait_on_request+0x2f/0x40 [nfs]
[<ffffffffa041c66f>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
[<ffffffffa041daae>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
[<ffffffffa041dc31>] nfs_wb_page+0x81/0xe0 [nfs]
[<ffffffffa040cb17>] nfs_release_page+0x57/0x70 [nfs]
[<ffffffff810f2a52>] try_to_release_page+0x32/0x50
[<ffffffff811016a3>] shrink_page_list+0x453/0x5f0
[<ffffffff81101b4d>] shrink_inactive_list+0x30d/0x7e0
[<ffffffff810fbcda>] ? determine_dirtyable_memory+0x1a/0x30
[<ffffffff810fbd87>] ? get_dirty_limits+0x27/0x2f0
[<ffffffff811020b1>] shrink_list+0x91/0xf0
[<ffffffff811022a7>] shrink_zone+0x197/0x240
[<ffffffff811023c2>] shrink_zones+0x72/0x100
[<ffffffff811024ce>] do_try_to_free_pages+0x7e/0x330
[<ffffffff8110287f>] try_to_free_pages+0x6f/0x80
[<ffffffff811003c0>] ? isolate_pages_global+0x0/0x50
[<ffffffff810f992a>] __alloc_pages_slowpath+0x27a/0x580
[<ffffffff810f9d8e>] __alloc_pages_nodemask+0x15e/0x1a0
[<ffffffff8112cc07>] alloc_pages_current+0x87/0xd0
[<ffffffff81132728>] new_slab+0x248/0x310
[<ffffffff81134fb9>] __slab_alloc+0x169/0x2d0
[<ffffffff810f5ab5>] ? mempool_alloc_slab+0x15/0x20
[<ffffffff811354e4>] kmem_cache_alloc+0xe4/0x150
[<ffffffff810f5ab5>] mempool_alloc_slab+0x15/0x20
[<ffffffff810f5c53>] mempool_alloc+0x63/0x140
[<ffffffff81085300>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa041ce70>] nfs_writedata_alloc+0x20/0xc0 [nfs]
[<ffffffffa041cf32>] nfs_flush_one+0x22/0xf0 [nfs]
[<ffffffffa0418097>] nfs_pageio_doio+0x37/0x80 [nfs]
[<ffffffffa0418134>] nfs_pageio_add_request+0x54/0x100 [nfs]
[<ffffffffa041c47d>] nfs_page_async_flush+0x9d/0xf0 [nfs]
[<ffffffffa041c557>] nfs_do_writepage+0x87/0x90 [nfs]
[<ffffffffa041cc9e>] nfs_writepages_callback+0x1e/0x40 [nfs]
[<ffffffff810fcac7>] write_cache_pages+0x227/0x4d0
[<ffffffffa041cc80>] ? nfs_writepages_callback+0x0/0x40 [nfs]
[<ffffffffa041cc09>] nfs_writepages+0xb9/0x130 [nfs]
[<ffffffffa041cf10>] ? nfs_flush_one+0x0/0xf0 [nfs]
[<ffffffff810fcdc1>] do_writepages+0x21/0x40
[<ffffffff810f40ab>] __filemap_fdatawrite_range+0x5b/0x60
[<ffffffff810f43df>] filemap_fdatawrite+0x1f/0x30
[<ffffffff810f4425>] filemap_write_and_wait+0x35/0x50
[<ffffffffa041055b>] nfs_setattr+0x15b/0x180 [nfs]
[<ffffffff810f4c26>] ? generic_file_aio_read+0xb6/0x1d0
[<ffffffff81013b0e>] ? common_interrupt+0xe/0x13
[<ffffffff810f36de>] ? find_get_page+0x1e/0xa0
[<ffffffff810f50c9>] ? filemap_fault+0xb9/0x460
[<ffffffff8106c4a7>] ? current_fs_time+0x27/0x30
[<ffffffff8115b7db>] notify_change+0x16b/0x350
[<ffffffff8116a16c>] utimes_common+0xdc/0x1b0
[<ffffffff812b6eba>] ? __up_read+0x9a/0xc0
[<ffffffff8116a2e1>] do_utimes+0xa1/0xf0
[<ffffffff81543378>] ? do_page_fault+0x158/0x3b0
[<ffffffff8116a442>] sys_utimensat+0x32/0x90
[<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
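The trace above is easier to follow with the "?" entries filtered out, since the kernel marks with "?" those addresses it found on the stack but could not confirm as part of the actual call chain. The following is a rough sketch of such a filter. The regular expression and the function name reliable_frames are assumptions made for this example, and the sample contains only a handful of frames copied from the trace.

```python
import re

# One frame looks like: [<addr>] [? ]symbol+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(trace_text):
    """Return (symbol, module) for every frame not marked with '?'.

    Frames without a [module] suffix are labelled "kernel".
    Order is preserved: innermost (most recent) call first.
    """
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group("unreliable"):
            frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


# A few frames copied verbatim from the trace in the report.
sample = """\
[<ffffffffa0418280>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[<ffffffff8153e697>] io_schedule+0x47/0x70
[<ffffffffa041828e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[<ffffffffa040cb17>] nfs_release_page+0x57/0x70 [nfs]
[<ffffffff8110287f>] try_to_free_pages+0x6f/0x80
[<ffffffffa041ce70>] nfs_writedata_alloc+0x20/0xc0 [nfs]
[<ffffffffa041cc09>] nfs_writepages+0xb9/0x130 [nfs]
"""

frames = reliable_frames(sample)
# Print outermost caller first so the sequence reads top to bottom.
for symbol, module in reversed(frames):
    print(f"{module:>6}  {symbol}")
```

Read from the outermost caller inward, the frames show NFS writeback (nfs_writepages) allocating memory (nfs_writedata_alloc), the allocation entering page reclaim (try_to_free_pages), and reclaim calling back into NFS (nfs_release_page), where the task ends up waiting in nfs_wait_bit_uninterruptible.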
Other tasks raising such messages during the same session:
[103440.890116] INFO: task kswapd0:52 blocked for more than 120 seconds.
[103440.890576] INFO: task Xorg:1300 blocked for more than 120 seconds.
[103440.891199] INFO: task plasma-desktop:1987 blocked for more than 120 seconds.
[103440.892078] INFO: task mv:26028 blocked for more than 120 seconds.
[103560.890060] INFO: task kswapd0:52 blocked for more than 120 seconds.
[103560.890516] INFO: task Xorg:1300 blocked for more than 120 seconds.
[103560.891139] INFO: task plasma-desktop:1987 blocked for more than 120 seconds.
[103560.892002] INFO: task mv:26028 blocked for more than 120 seconds.
[103680.890072] INFO: task kswapd0:52 blocked for more than 120 seconds.
[103680.890535] INFO: task Xorg:1300 blocked for more than 120 seconds.
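To see at a glance which tasks are stuck and how often they are reported, the hung-task lines can be grouped by task. This is only a sketch: the pattern is written against the message format quoted above, and summarize_hung_tasks is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [103440.890116] INFO: task kswapd0:52 blocked for more than 120 seconds.
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def summarize_hung_tasks(log_text):
    """Map (task name, pid) to the list of timestamps at which it was reported."""
    seen = defaultdict(list)
    for line in log_text.splitlines():
        match = HUNG_RE.search(line)
        if match:
            key = (match.group("name"), int(match.group("pid")))
            seen[key].append(float(match.group("ts")))
    return dict(seen)


# The log lines quoted in the report.
log = """\
[103440.890116] INFO: task kswapd0:52 blocked for more than 120 seconds.
[103440.890576] INFO: task Xorg:1300 blocked for more than 120 seconds.
[103440.891199] INFO: task plasma-desktop:1987 blocked for more than 120 seconds.
[103440.892078] INFO: task mv:26028 blocked for more than 120 seconds.
[103560.890060] INFO: task kswapd0:52 blocked for more than 120 seconds.
[103560.890516] INFO: task Xorg:1300 blocked for more than 120 seconds.
[103560.891139] INFO: task plasma-desktop:1987 blocked for more than 120 seconds.
[103560.892002] INFO: task mv:26028 blocked for more than 120 seconds.
[103680.890072] INFO: task kswapd0:52 blocked for more than 120 seconds.
[103680.890535] INFO: task Xorg:1300 blocked for more than 120 seconds.
"""

for (name, pid), stamps in summarize_hung_tasks(log).items():
    print(f"{name} (pid {pid}): reported {len(stamps)} times")
```

In this excerpt the same four tasks recur at each report, including kswapd0 and Xorg, which matches the description of the whole desktop freezing rather than just the mv process.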