group.of.nepali.translators team mailing list archive

[Bug 1602524] Re: [LTC-Test] - NMI watchdog Bug and call traces when trinity is executed.

 

https://lists.ubuntu.com/archives/kernel-team/2016-July/079167.html

** Also affects: linux (Ubuntu Yakkety)
   Importance: High
     Assignee: Canonical Kernel Team (canonical-kernel-team)
       Status: Triaged

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Yakkety)
       Status: Triaged => Fix Released

** Changed in: linux (Ubuntu Xenial)
       Status: New => In Progress

** Changed in: linux (Ubuntu Xenial)
     Assignee: (unassigned) => Tim Gardner (timg-tpi)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1602524

Title:
  [LTC-Test] - NMI watchdog Bug and call traces when trinity is
  executed.

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  Fix Released

Bug description:
  == Comment: #0 - Santhosh G  ==
  Problem Statement:
  An NMI watchdog bug and call traces occur when trinity is executed.

  Environment:
  P8 PowerVM Lpar

  uname output:
  uname -a
  Linux tuleta4u-lp5 4.4.0-11-generic #26-Ubuntu SMP Sat Mar 5 14:21:51 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

  Steps to reproduce:

  1) Install Ubuntu 16.04 in a PowerVM LPAR.
  2) Download trinity-1.5 and build it with ./configure.sh; make; make install
  3) Execute trinity as
     './trinity --dangerous'

  The test runs for more than one hour and trinity gets killed with call
  traces:

  [19744.229979] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 21s! [trinity-c3:26544]
  [19744.229991] Modules linked in: hidp hid bnep rfcomm l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel af_key mpls_router llc2 nfnetlink dn_rtmsg xfrm_user xfrm_algo can_raw crypto_user can_bcm cmtp kernelcapi scsi_transport_iscsi sctp libcrc32c nfc af_alg caif_socket caif phonet af_rxrpc bluetooth can pppoe pppox irda crc_ccitt atm appletalk ipx p8023 p8022 psnap llc pseries_rng rtc_generic autofs4 ibmvscsi ibmveth
  [19744.230024] CPU: 3 PID: 26544 Comm: trinity-c3 Not tainted 4.4.0-11-generic #26-Ubuntu
  [19744.230026] task: c00000000ae87e60 ti: c00000000ae24000 task.ti: c00000000ae24000
  [19744.230028] NIP: c0000000003fac78 LR: c0000000003fabfc CTR: c00000000039ef10
  [19744.230029] REGS: c00000000ae27980 TRAP: 0901   Not tainted  (4.4.0-11-generic)
  [19744.230030] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004444  XER: 20000000
  [19744.230035] CFAR: c0000000003fae6c SOFTE: 1
                 GPR00: c0000000003fabfc c00000000ae27c00 c0000000015a3b00 c0000000f7f03ba8
                 GPR04: 000000000e02adcb c00000000ae27cb0 0000000000000000 0000000000000000
                 GPR08: 8000000000000000 0000000000000000 c0000000ef886000 c000000000af0870
                 GPR12: 0000000024004444 c00000000e7f1c80
  [19744.230045] NIP [c0000000003fac78] ext4_es_lookup_extent+0xc8/0x2c0
  [19744.230047] LR [c0000000003fabfc] ext4_es_lookup_extent+0x4c/0x2c0
  [19744.230048] Call Trace:
  [19744.230050] [c00000000ae27c00] [c0000000003fabfc] ext4_es_lookup_extent+0x4c/0x2c0 (unreliable)
  [19744.230053] [c00000000ae27c50] [c0000000003a6f18] ext4_map_blocks+0x78/0x610
  [19744.230055] [c00000000ae27d10] [c00000000039f14c] ext4_llseek+0x23c/0x3f0
  [19744.230057] [c00000000ae27de0] [c0000000002e02a8] SyS_lseek+0xe8/0x130
  [19744.230060] [c00000000ae27e30] [c000000000009204] system_call+0x38/0xb4
  [19744.230061] Instruction dump:
  [19744.230062] 2fa90000 409effec e93e0028 3b800000 e9490458 e92a0440 39290001 f92a0440
  [19744.230065] 7c2004ac 7d20d828 3129ffff 7d20d92d <40c2fff4> 60000000 7f83e378 38210050
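
When several such reports have to be compared, the soft lockup header line can be picked out of a saved dmesg log mechanically. The following is a minimal sketch in Python, assuming the log uses the same line format as the trace above; the function name and the dictionary keys are illustrative and not part of any existing tool.

```python
import re

# Matches lines such as:
# [19744.229979] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 21s! [trinity-c3:26544]
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(log_text):
    """Return one dict per soft lockup header found in log_text."""
    events = []
    for match in SOFT_LOCKUP.finditer(log_text):
        events.append(
            {
                "timestamp": float(match.group("ts")),  # seconds since boot
                "cpu": int(match.group("cpu")),
                "stuck_seconds": int(match.group("secs")),
                "comm": match.group("comm"),  # task name
                "pid": int(match.group("pid")),
            }
        )
    return events
```

Feeding it the output of `dmesg` saved to a file gives a list that can be grouped by CPU or by task name.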

  
  == Comment: #8 - Santhosh G  ==

  Tried the scenario as given in https://bugzilla.linux.ibm.com/show_bug.cgi?id=128126#c26
  -----
  # Create a sparse file of about 584GiB (598383 MiB); it consists almost entirely of holes
  $ dd if=/dev/zero of=file-0.bin bs=1M count=1 seek=598382
  # Invoke lseek with the SEEK_DATA option, starting at file offset 0
  $ while [ 1 ]; do xfs_io -f -c "seek -d 0" file-0.bin; done
  ----
  and I was able to reproduce the issue on 16.04.1.
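
For readers without xfs_io, the request made by `seek -d 0` is an lseek call with SEEK_DATA from offset 0 on a file that starts with a large hole. The following is a minimal sketch in Python of the same sequence on a scaled-down file. The helper names and the default hole size are illustrative assumptions, and a file this small is meant to show the call, not to trigger the lockup.

```python
import os
import tempfile

MIB = 1024 * 1024

# Size of the file produced by the dd command in the report:
# 598382 skipped blocks plus one written block, 1 MiB each.
REPORT_FILE_SIZE = (598382 + 1) * MIB


def make_sparse_file(path, hole_mib):
    """Write 1 MiB of zeros after a hole of hole_mib MiB, like dd seek=N."""
    with open(path, "wb") as f:
        f.seek(hole_mib * MIB)
        f.write(b"\0" * MIB)
    return os.path.getsize(path)


def first_data_offset(path):
    """Issue lseek(fd, 0, SEEK_DATA) and return the resulting offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_DATA)
    finally:
        os.close(fd)


def demo(hole_mib=64):
    # hole_mib is scaled down from the 598382 used in the report (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file-0.bin")
        size = make_sparse_file(path, hole_mib)
        offset = first_data_offset(path)
    return size, offset


if __name__ == "__main__":
    size, offset = demo()
    print("file size:", size, "first data offset:", offset)
```

On a filesystem that tracks holes, the returned offset should be at or just before the written block. A filesystem without hole support treats the whole file as data and returns 0. In the report, the kernel has to walk the extent status information for the entire hole to answer this one call, which is where the traces show the CPU spending its time.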

  kernel version:
  4.4.0-28-generic

  dmesg output:

  [ 1197.994822] 	40-...: (5249 ticks this GP) idle=975/140000000000001/0 softirq=7812/7812 fqs=5251 
  [ 1197.995071] 	 (t=5251 jiffies g=29144 c=29143 q=3418)
  [ 1197.995115] Task dump for CPU 40:
  [ 1197.995117] xfs_io          R  running task        0  3601   3489 0x00040004
  [ 1197.995121] Call Trace:
  [ 1197.995126] [c000003c7c8675b0] [c0000000000fbc00] sched_show_task+0xe0/0x180 (unreliable)
  [ 1197.995131] [c000003c7c867620] [c00000000013eb74] rcu_dump_cpu_stacks+0xe4/0x150
  [ 1197.995134] [c000003c7c867670] [c0000000001442a4] rcu_check_callbacks+0x6b4/0x9b0
  [ 1197.995136] [c000003c7c8677a0] [c00000000014c108] update_process_times+0x58/0xa0
  [ 1197.995140] [c000003c7c8677d0] [c000000000163818] tick_sched_handle.isra.6+0x48/0xe0
  [ 1197.995143] [c000003c7c867810] [c000000000163914] tick_sched_timer+0x64/0xd0
  [ 1197.995146] [c000003c7c867850] [c00000000014cbd4] __hrtimer_run_queues+0x124/0x450
  [ 1197.995148] [c000003c7c8678e0] [c00000000014dbfc] hrtimer_interrupt+0xec/0x2c0
  [ 1197.995152] [c000003c7c8679a0] [c00000000001f5bc] __timer_interrupt+0x8c/0x290
  [ 1197.995154] [c000003c7c8679f0] [c00000000001f970] timer_interrupt+0xa0/0xe0
  [ 1197.995157] [c000003c7c867a20] [c000000000002714] decrementer_common+0x114/0x180
  [ 1197.995163] --- interrupt: 901 at ext4_es_find_delayed_extent_range+0x20/0x2b0
                     LR = ext4_llseek+0x268/0x3f0
  [ 1197.995166] [c000003c7c867d10] [c0000000003a170c] ext4_llseek+0x23c/0x3f0 (unreliable)
  [ 1197.995170] [c000003c7c867de0] [c0000000002e1f08] SyS_lseek+0xe8/0x130
  [ 1197.995173] [c000003c7c867e30] [c000000000009204] system_call+0x38/0xb4

  =====

  The call traces do not occur when the test is run on a kernel with the patch applied.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1602524/+subscriptions