group.of.nepali.translators team mailing list archive
Message #06214
[Bug 1602524] Re: [LTC-Test] - NMI watchdog Bug and call traces when trinity is executed.
https://lists.ubuntu.com/archives/kernel-team/2016-July/079167.html
** Also affects: linux (Ubuntu Yakkety)
Importance: High
Assignee: Canonical Kernel Team (canonical-kernel-team)
Status: Triaged
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Yakkety)
Status: Triaged => Fix Released
** Changed in: linux (Ubuntu Xenial)
Status: New => In Progress
** Changed in: linux (Ubuntu Xenial)
Assignee: (unassigned) => Tim Gardner (timg-tpi)
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1602524
Title:
[LTC-Test] - NMI watchdog Bug and call traces when trinity is
executed.
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Xenial:
In Progress
Status in linux source package in Yakkety:
Fix Released
Bug description:
== Comment: #0 - Santhosh G ==
Problem Statement:
An NMI watchdog BUG (soft lockup) and call traces occur when trinity is executed.
Environment:
P8 PowerVM LPAR
uname output:
uname -a
Linux tuleta4u-lp5 4.4.0-11-generic #26-Ubuntu SMP Sat Mar 5 14:21:51 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
Steps to reproduce:
1) Install Ubuntu 16.04 in a PowerVM LPAR.
2) Download trinity-1.5 and build it: ./configure.sh; make; make install
3) Execute trinity as
'./trinity --dangerous'
The test runs for more than one hour and trinity gets killed with call
traces:
[19744.229979] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 21s! [trinity-c3:26544]
[19744.229991] Modules linked in: hidp hid bnep rfcomm l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel af_key mpls_router llc2 nfnetlink dn_rtmsg xfrm_user xfrm_algo can_raw crypto_user can_bcm cmtp kernelcapi scsi_transport_iscsi sctp libcrc32c nfc af_alg caif_socket caif phonet af_rxrpc bluetooth can pppoe pppox irda crc_ccitt atm appletalk ipx p8023 p8022 psnap llc pseries_rng rtc_generic autofs4 ibmvscsi ibmveth
[19744.230024] CPU: 3 PID: 26544 Comm: trinity-c3 Not tainted 4.4.0-11-generic #26-Ubuntu
[19744.230026] task: c00000000ae87e60 ti: c00000000ae24000 task.ti: c00000000ae24000
[19744.230028] NIP: c0000000003fac78 LR: c0000000003fabfc CTR: c00000000039ef10
[19744.230029] REGS: c00000000ae27980 TRAP: 0901 Not tainted (4.4.0-11-generic)
[19744.230030] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004444 XER: 20000000
[19744.230035] CFAR: c0000000003fae6c SOFTE: 1
GPR00: c0000000003fabfc c00000000ae27c00 c0000000015a3b00 c0000000f7f03ba8
GPR04: 000000000e02adcb c00000000ae27cb0 0000000000000000 0000000000000000
GPR08: 8000000000000000 0000000000000000 c0000000ef886000 c000000000af0870
GPR12: 0000000024004444 c00000000e7f1c80
[19744.230045] NIP [c0000000003fac78] ext4_es_lookup_extent+0xc8/0x2c0
[19744.230047] LR [c0000000003fabfc] ext4_es_lookup_extent+0x4c/0x2c0
[19744.230048] Call Trace:
[19744.230050] [c00000000ae27c00] [c0000000003fabfc] ext4_es_lookup_extent+0x4c/0x2c0 (unreliable)
[19744.230053] [c00000000ae27c50] [c0000000003a6f18] ext4_map_blocks+0x78/0x610
[19744.230055] [c00000000ae27d10] [c00000000039f14c] ext4_llseek+0x23c/0x3f0
[19744.230057] [c00000000ae27de0] [c0000000002e02a8] SyS_lseek+0xe8/0x130
[19744.230060] [c00000000ae27e30] [c000000000009204] system_call+0x38/0xb4
[19744.230061] Instruction dump:
[19744.230062] 2fa90000 409effec e93e0028 3b800000 e9490458 e92a0440 39290001 f92a0440
[19744.230065] 7c2004ac 7d20d828 3129ffff 7d20d92d <40c2fff4> 60000000 7f83e378 38210050
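When triaging long dmesg output like the trace above, it can help to pull out just the soft-lockup header fields. The following is a minimal sketch, not part of the original report; the function name parse_soft_lockup is hypothetical, and the pattern assumes the header format shown in the first line of the trace.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): read kernel log text on
# stdin and print one summary line per "soft lockup" header, in the form
#   cpu=<n> secs=<n> task=<comm> pid=<n>
# Assumes the header looks like:
#   ... soft lockup - CPU#3 stuck for 21s! [trinity-c3:26544]
parse_soft_lockup() {
    sed -n 's/.*soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s! \[\(.*\):\([0-9][0-9]*\)].*/cpu=\1 secs=\2 task=\3 pid=\4/p'
}
```

Typical use would be something like `dmesg | parse_soft_lockup`; lines that do not match the assumed header format are silently skipped.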
== Comment: #8 - Santhosh G ==
Tried the scenario as given in https://bugzilla.linux.ibm.com/show_bug.cgi?id=128126#c26
-----
# Create a roughly 584 GiB (about 627 GB) file, mostly filled with holes
$ dd if=/dev/zero of=file-0.bin bs=1M count=1 seek=598382
# Invoke lseek with SEEK_DATA option starting with file offset 0
$ while [ 1 ]; do xfs_io -f -c "seek -d 0" file-0.bin; done
----
and I was able to hit the issue in 16.04.1
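The scenario above can be sketched as a small script. This is an illustration under stated assumptions, not the reporter's exact procedure: the function names (apparent_bytes, make_sparse_file, seek_data_loop) are hypothetical, and the loop is bounded by an iteration count instead of running forever. The dd call seeks past N blocks of 1 MiB and writes one more, so the apparent file size is (N + 1) x 1 MiB; for N = 598382 that is 627450052608 bytes.

```shell
#!/bin/sh
# Hypothetical helpers sketching the sparse-file reproducer.

# Apparent size in bytes of a file made by seeking past $1 blocks of
# 1 MiB and then writing a single 1 MiB block.
apparent_bytes() {
    echo $(( ($1 + 1) * 1024 * 1024 ))
}

# Create file $1 consisting of a hole of $2 MiB followed by 1 MiB of data.
# bs is given in bytes so the call does not depend on dd size suffixes.
make_sparse_file() {
    dd if=/dev/zero of="$1" bs=1048576 count=1 seek="$2" 2>/dev/null
}

# Run the SEEK_DATA lookup from the report $2 times against file $1.
# Requires xfs_io (xfsprogs); returns 1 if it is not installed.
seek_data_loop() {
    command -v xfs_io >/dev/null 2>&1 || return 1
    i=0
    while [ "$i" -lt "$2" ]; do
        xfs_io -f -c "seek -d 0" "$1" >/dev/null
        i=$((i + 1))
    done
}
```

For example, `make_sparse_file file-0.bin 598382` followed by `seek_data_loop file-0.bin 100` approximates the steps above; a small seek value is enough to check the helpers themselves without involving a very large file.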
kernel version:
4.4.0-28-generic
dmesg o/p:
[ 1197.994822] 40-...: (5249 ticks this GP) idle=975/140000000000001/0 softirq=7812/7812 fqs=5251
[ 1197.995071] (t=5251 jiffies g=29144 c=29143 q=3418)
[ 1197.995115] Task dump for CPU 40:
[ 1197.995117] xfs_io R running task 0 3601 3489 0x00040004
[ 1197.995121] Call Trace:
[ 1197.995126] [c000003c7c8675b0] [c0000000000fbc00] sched_show_task+0xe0/0x180 (unreliable)
[ 1197.995131] [c000003c7c867620] [c00000000013eb74] rcu_dump_cpu_stacks+0xe4/0x150
[ 1197.995134] [c000003c7c867670] [c0000000001442a4] rcu_check_callbacks+0x6b4/0x9b0
[ 1197.995136] [c000003c7c8677a0] [c00000000014c108] update_process_times+0x58/0xa0
[ 1197.995140] [c000003c7c8677d0] [c000000000163818] tick_sched_handle.isra.6+0x48/0xe0
[ 1197.995143] [c000003c7c867810] [c000000000163914] tick_sched_timer+0x64/0xd0
[ 1197.995146] [c000003c7c867850] [c00000000014cbd4] __hrtimer_run_queues+0x124/0x450
[ 1197.995148] [c000003c7c8678e0] [c00000000014dbfc] hrtimer_interrupt+0xec/0x2c0
[ 1197.995152] [c000003c7c8679a0] [c00000000001f5bc] __timer_interrupt+0x8c/0x290
[ 1197.995154] [c000003c7c8679f0] [c00000000001f970] timer_interrupt+0xa0/0xe0
[ 1197.995157] [c000003c7c867a20] [c000000000002714] decrementer_common+0x114/0x180
[ 1197.995163] --- interrupt: 901 at ext4_es_find_delayed_extent_range+0x20/0x2b0
LR = ext4_llseek+0x268/0x3f0
[ 1197.995166] [c000003c7c867d10] [c0000000003a170c] ext4_llseek+0x23c/0x3f0 (unreliable)
[ 1197.995170] [c000003c7c867de0] [c0000000002e1f08] SyS_lseek+0xe8/0x130
[ 1197.995173] [c000003c7c867e30] [c000000000009204] system_call+0x38/0xb4
=====
The call traces do not occur when the same test is run on the patched kernel.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1602524/+subscriptions