← Back to team overview

kernel-packages team mailing list archive

[Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount

 

** No longer affects: linux (Ubuntu)

** No longer affects: linux (Ubuntu Trusty)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1509120

Title:
  Process accounting deadlock with idmapd callout when writing to NFSv4
  mount

Status in nfs-utils package in Ubuntu:
  Fix Released
Status in nfs-utils source package in Trusty:
  Fix Committed

Bug description:
  [Impact]

   * Programs accessing nfsv4 mounts will hang on request_key interface
  with nfs4 + sec=sys with old nfsv4 hosts.  Kernel is waiting on
  usermodehelper provided by keyutils.

   * INFO: task ls:2101 blocked for more than 120 seconds.
        Not tainted 3.13.0-66-generic #108-Ubuntu
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  ls D ffff88007fd13180 0 2101 1215 0x00000004
   ffff88007b14d630 0000000000000086 ffff8800374e6000 ffff88007b14dfd8
   0000000000013180 0000000000013180 ffff8800374e6000 ffff88007b14d6b0
   ffff88007ffd1460 0000000000000002 ffffffff812d0ce0 ffff88007b14d6a0
  Call Trace:
   [<ffffffff812d0ce0>] ? umh_keys_init+0x20/0x20
   [<ffffffff81728499>] schedule+0x29/0x70
   [<ffffffff812d0cee>] key_wait_bit+0xe/0x20
   [<ffffffff81728c42>] __wait_on_bit+0x62/0x90
   [<ffffffff812d0ce0>] ? umh_keys_init+0x20/0x20
   [<ffffffff81728ce7>] out_of_line_wait_on_bit+0x77/0x90
   [<ffffffff810ab3d0>] ? autoremove_wake_function+0x40/0x40
   [<ffffffff812d10be>] wait_for_key_construction+0x6e/0x80
   [<ffffffff812d160c>] request_key+0x5c/0xa0
   [<ffffffffa027858f>] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4]
   [<ffffffffa0278f8f>] nfs_map_name_to_uid+0xef/0x150 [nfsv4]
   [<ffffffffa0270117>] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4]
   [<ffffffff8101bc79>] ? sched_clock+0x9/0x10
   [<ffffffffa027080c>] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4]
   [<ffffffffa0270ef0>] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4]
   [<ffffffffa0270f60>] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4]
   [<ffffffffa013e316>] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc]
   [<ffffffffa0270ef0>] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4]
   [<ffffffffa0130f6f>] call_decode+0x1df/0x870 [sunrpc]
   [<ffffffffa0130d90>] ? call_refreshresult+0x170/0x170 [sunrpc]
   [<ffffffffa0130d90>] ? call_refreshresult+0x170/0x170 [sunrpc]
   [<ffffffffa013bd84>] __rpc_execute+0x84/0x400 [sunrpc]
   [<ffffffffa013ccfe>] rpc_execute+0x5e/0xa0 [sunrpc]
   [<ffffffffa01331d0>] rpc_run_task+0x70/0x90 [sunrpc]
   [<ffffffffa0259646>] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
   [<ffffffffa0259f2e>] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4]
   [<ffffffffa02604ea>] nfs4_proc_getattr+0x5a/0xd0 [nfsv4]
   [<ffffffffa01e19df>] __nfs_revalidate_inode+0xbf/0x310 [nfs]
   [<ffffffffa01d9af3>] nfs_opendir+0xe3/0x100 [nfs]
   [<ffffffff811bb883>] do_dentry_open+0x233/0x2e0
   [<ffffffffa01d9a10>] ? nfs_readdir_clear_array+0x70/0x70 [nfs]
   [<ffffffff811bbbb9>] vfs_open+0x49/0x50
   [<ffffffff811ccf64>] do_last+0x564/0x1240
   [<ffffffff811cac06>] ? link_path_walk+0x256/0x880
   [<ffffffff8131615b>] ? apparmor_file_alloc_security+0x5b/0x180
   [<ffffffff812d8786>] ? security_file_alloc+0x16/0x20
   [<ffffffff811cdcfb>] path_openat+0xbb/0x650
   [<ffffffff811cf0fa>] do_filp_open+0x3a/0x90
   [<ffffffff8118199e>] ? do_mmap_pgoff+0x34e/0x3d0
   [<ffffffff811dbf77>] ? __alloc_fd+0xa7/0x130
   [<ffffffff811bd6d9>] do_sys_open+0x129/0x280
   [<ffffffff817305ba>] ? do_page_fault+0x1a/0x70
   [<ffffffff811bd864>] SyS_openat+0x14/0x20
   [<ffffffff81734c5d>] system_call_fastpath+0x1a/0x1f

  [Test Case]

  1) Install an nfs server that does not support sec=sys, such as centos 6 or others that are old.
  2) echo '/export *(rw,sync,no_root_squash,no_subtree_check,fsid=299)' > /etc/exports
  3) sudo exportfs -a && sudo service nfs-kernel-server restart

  Client:
  1) sudo apt-get install nfs-common acct
  2) sudo mkdir /account
  3) sudo mount -t nfs4 -o rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=2049,timeo=600,retrans=2,sec=sys,minorversion=0,local_lock=none  10.245.80.76:/export /account
   4) sudo touch /account/pacct
   5) sudo accton /account/pacct
   6) cd /account/
   7) run ls and other operations. *(dpkg operations seemed to do it for me)
   8) wait
   8) Terminal will become unresponsive within a few minutes, and new logins will not be possible

  [Regression Potential]

   * There are a total of 28 other references to request_key in the kernel. It is possible that previous failures to request_key may now be passing which may result in alternate code paths being taken. Those kernel subsystems that don't already have an explicit obvious dependency on keyutils are.
     * drivers/staging/lustre
     * fs/afs
     * fs/fscache
     * fs/nfs
     * lib/digsig.c
     * net/ceph
     * net/dns_resolver
     * net/rxrpc
     * security/integrity
     * security/keys
   * Minimal this was applied to vivid+wily via bug 1449074.

  [Other Info]

   * Sulution is to add keyutils as Required to nfs-common.

  Original Description
  __________________________________________________________________________

  ---Problem Description---
  System hang when process accounting to a NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52
  UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---

  We have a customer that is experiencing intermittent system hangs on
  their system.  After a bit of debug, it was discovered that the
  trigger was turning on process accounting and writing to a file hosted
  via an NFSv4 mount.  During the testing, several vmcores were
  captured, and the fingerprint indicates a mutex deadlock situation
  with process accounting.   In the most recent vmcore, it appears that
  the scenario is something like the following:

  1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
  2. The resulting NFS write needs idmapd information and calls out to idmapd
  3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.

  PID: 4898   TASK: c000001fd26d7580  CPU: 7   COMMAND: "ls"
   #0 [c000001fd274a950] __switch_to at c000000000015934
   #1 [c000001fd274ab20] __switch_to at c000000000015934
   #2 [c000001fd274ab80] __schedule at c000000000a11de8
   #3 [c000001fd274ada0] schedule_timeout at c000000000a16284
   #4 [c000001fd274ae90] wait_for_common at c000000000a1360c
   #5 [c000001fd274af10] call_usermodehelper_exec at c0000000000ccd38
   #6 [c000001fd274af70] call_sbin_request_key at c000000000429258
   #7 [c000001fd274b100] request_key_and_link at c00000000042983c
   #8 [c000001fd274b200] request_key at c000000000429978
   #9 [c000001fd274b240] nfs_idmap_get_key at d00000002ca8b0bc [nfsv4]
  #10 [c000001fd274b2b0] nfs_map_name_to_uid at d00000002ca8bbd0 [nfsv4]
  #11 [c000001fd274b320] decode_getfattr_attrs at d00000002ca7f59c [nfsv4]
  #12 [c000001fd274b420] decode_getfattr_generic.constprop.96 at d00000002ca7fd78
  [nfsv4]
  #13 [c000001fd274b4d0] nfs4_xdr_dec_getattr at d00000002ca80738 [nfsv4]
  #14 [c000001fd274b530] rpcauth_unwrap_resp at d00000001fe67180 [sunrpc]
  #15 [c000001fd274b600] call_decode at d00000001fe527c8 [sunrpc]
  #16 [c000001fd274b6b0] __rpc_execute at d00000001fe64260 [sunrpc]
  #17 [c000001fd274b790] rpc_run_task at d00000001fe54a78 [sunrpc]
  #18 [c000001fd274b7c0] nfs4_call_sync_sequence at d00000002ca60960 [nfsv4]
  #19 [c000001fd274b860] _nfs4_proc_getattr at d00000002ca6217c [nfsv4]
  #20 [c000001fd274b930] nfs4_proc_getattr at d00000002ca6f494 [nfsv4]
  #21 [c000001fd274b9a0] __nfs_revalidate_inode at d0000000202cf614 [nfs]
  #22 [c000001fd274ba30] nfs_revalidate_file_size at d0000000202c9618 [nfs]
  #23 [c000001fd274ba70] nfs_file_write at d0000000202cabdc [nfs]
  #24 [c000001fd274bb00] new_sync_write at c0000000002b3d9c
  #25 [c000001fd274bbd0] __kernel_write at c0000000002b3fec
  #26 [c000001fd274bc20] do_acct_process at c000000000166b78
  #27 [c000001fd274bcc0] acct_process at c00000000016748c
  #28 [c000001fd274bcf0] do_exit at c0000000000b3660
  #29 [c000001fd274bdc0] do_group_exit at c0000000000b3b14
  #30 [c000001fd274be00] sys_exit_group at c0000000000b3bdc
  #31 [c000001fd274be30] system_call at c000000000009258

  PID: 4900   TASK: c000003c9946c180  CPU: 16  COMMAND: "kworker/u320:2"
   #0 [c000003c994fb790] __switch_to at c000000000015934
   #1 [c000003c994fb960] __switch_to at c000000000015934
   #2 [c000003c994fb9c0] __schedule at c000000000a11de8
   #3 [c000003c994fbbe0] schedule_preempt_disabled at c000000000a12980
   #4 [c000003c994fbc00] __mutex_lock_slowpath at c000000000a14aec
   #5 [c000003c994fbc80] mutex_lock at c000000000a14c4c
   #6 [c000003c994fbcb0] acct_get at c0000000001663ec
   #7 [c000003c994fbcf0] acct_process at c000000000167480
   #8 [c000003c994fbd20] do_exit at c0000000000b3660
   #9 [c000003c994fbdf0] ____call_usermodehelper at c0000000000ccaf4
  #10 [c000003c994fbe30] ret_from_kernel_thread at c00000000000956c

  Historical bug data:

  from customer:

   am uploading a crash dump file from a lock up event that I just had.
  I was reminded on a status update call this morning that I never tried
  running process accounting since opening the ticket and updating the
  kernel. So, I tried that this morning. The first time I turned it on
  with the default output location and didn?t have any problems. Then I
  tried turning it on with the output going to our shared disk space,
  which is where I was originally sending it. I ran a couple commands
  that returned just fine, then when I ran a CUDA test program it didn?t
  return and I verified that I was no longer able to login to the node.
  I let it sit for awhile and eventually the console spit out some hung
  process errors. I waited a little longer, but then went ahead and hit
  the key combo to force a dump. I will do some more testing, but right
  now it looks like the problem is running Linux process account
  (accton) with the output directed at an NFS mount. This is the output
  of /proc/mounts for the mount point that I was pointing the output:

  172.17.0.1:/gpfs/sb /gpfs/sb nfs4
  rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.17.12.1,local_lock=none,addr=172.17.0.1
  0 0

  The file system is GPFS 3.5 on the back end, but the client is
  mounting it via NFS. I will try to reproduce the problem on a disk
  that is local to the NFS server just to take the GPFS part out of the
  equation. Our x86 clients are all doing this to directly mounted GPFS
  volumes so I doubt that is the issue.

  Let me know if there is anything else you?d like to see or have me
  try.

  Mike

  == Comment: #21  - 2015-10-19 14:35:35 ==
  I have repeated the problem and narrowed it down to NFS v4. I used an existing NFS export and mounted it with the default options which used NFS v3 and I was unable to get the problem to happen. I then setup an NFS v4 export, mounted that, turned on process accounting to write to that mount point, did an ls. The ls returned data, but the command prompt never returned. After some time I got the standard hung_task error.

  Mike

  == Comment: #27 - 2015-10-20 15:57:47 ==
  Kevin,
  1. The server is running CentOS 6.x (Most packages are from 6.6).
  The kernel is 2.6.32-358.18.1.el6.x86_64

  2. This is the /etc/exports line for this directory:
  /var/psacct 172.17.0.0/16(rw,no_root_squash,sync,fsid=299)

  3. The base FS is ext4. This is its /proc/mounts entry:
  /dev/md124 /var ext4 rw,relatime,barrier=1,data=ordered 0 0

  4. The clients /etc/fstab looks like this:
  172.17.0.1:/var/psacct/ /mnt    nfs     rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,minorversion=0,local_lock=none      0       0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1509120/+subscriptions